shibing624 / llama-3-8b-instruct-262k-chinese

Last updated: April 29, 2024
Task: text-generation

Model Details

llama-3-8b-instruct-262k-chinese is a chat model fine-tuned from Llama-3-8B-Instruct-262k with the ORPO method on the Chinese-English preference dataset shibing624/DPO-En-Zh-20k-Preference.

For deployment, training, and related details, see the MedicalGPT GitHub repository: https://github.com/shibing624/MedicalGPT

Related models
Features

Strengths:

  1. Supports an ultra-long context length of 262k tokens, well suited for RAG
  2. Supports both Chinese and English
  3. Supports multi-turn dialogue, with strong coding and reasoning ability and solid English knowledge
  4. GPU memory required for inference (a hedged 4-bit loading sketch follows the table):

Quantization   Peak Usage, Encoding 2048 Tokens   Peak Usage, Generating 8192 Tokens
FP16/BF16      18.66 GB                           24.58 GB
Int4           9.21 GB                            14.62 GB
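
To stay within the Int4 budget above, the model can be loaded in 4-bit via bitsandbytes. A minimal sketch, assuming the bitsandbytes package is installed; these quantization settings are illustrative and not necessarily the exact configuration used to measure the table:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "shibing624/llama-3-8b-instruct-262k-chinese"

# NF4 weights with fp16 compute; assumption: this roughly matches the Int4
# row in the table above, not the exact measurement setup.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)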

Weaknesses:

  1. At only 8B parameters, hallucination is noticeable in knowledge-based QA
  2. Chinese knowledge is limited and prone to hallucination, especially for classical Chinese; this is a common weakness of Llama-family models
How to use
import transformers
import torch

model_id = "shibing624/llama-3-8b-instruct-262k-chinese"

# Build a text-generation pipeline in fp16 on a CUDA GPU (see the memory
# table above for peak usage).
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.float16},
    device="cuda",
)

messages = [
    {"role": "system", "content": ""},
    {"role": "user", "content": "介绍一下机器学习"},  # "Introduce machine learning"
]

# Render the conversation with the Llama-3 chat template, leaving the
# assistant turn open for generation.
prompt = pipeline.tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Llama-3 ends assistant turns with <|eot_id|>, so stop on it as well as
# on the regular EOS token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

outputs = pipeline(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)

# Keep only the newly generated reply by stripping the prompt prefix.
content = outputs[0]["generated_text"][len(prompt):]
print(content)

Result (translated from the model's Chinese output):

Machine learning is an automated data-analysis technique based on computer algorithms, used to learn from data and predict future outcomes. It is a subfield of artificial intelligence (AI) and data mining, and aims to discover patterns, relationships, and regularities in data by training and tuning algorithms.

Machine learning algorithms fall into three categories: supervised, unsupervised, and semi-supervised learning:

1. Supervised learning: the algorithm is given a labeled dataset for training. It learns to map input data to outputs and makes predictions on new data. Common supervised algorithms include logistic regression, decision trees, support vector machines (SVM), random forests, and neural networks.
2. Unsupervised learning: the algorithm has no labeled data. It learns patterns, structures, and relationships in the data and may discover new clusters or features. Common unsupervised algorithms include clustering, principal component analysis (PCA), independent component analysis (ICA), and dimensionality reduction for high-dimensional data.
3. Semi-supervised learning: the algorithm is given a partially labeled dataset. It learns to map input data to outputs and makes predictions on new data. Semi-supervised algorithms combine the strengths of supervised and unsupervised learning; common examples include self-labeling and graph-based semi-supervised learning.

Machine learning is widely applied in natural language processing, computer vision, recommender systems, artificial intelligence, and autonomous driving. Its advantages include:

1. Automation: machine learning algorithms can discover patterns and relationships in data without human intervention.
2. Efficiency: they can process large amounts of data and make predictions without human intervention.
3. Adaptability: they can adjust as the dataset changes and is updated.
4. Accuracy: their predictive accuracy can be improved through training and testing.
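
Since the model supports multi-turn dialogue, a conversation can be continued by appending the assistant reply and the next user turn to messages and regenerating. A minimal sketch continuing the example above (the follow-up question is illustrative):

# Feed the previous reply back in, then ask a follow-up question.
messages.append({"role": "assistant", "content": content})
messages.append({"role": "user", "content": "Which of these algorithms suits small datasets best?"})

prompt = pipeline.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
outputs = pipeline(
    prompt,
    max_new_tokens=512,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][len(prompt):])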
Training details

Train loss and eval loss curves are shown as plots on the original model card.

About Llama-3-8B-Instruct-262k

Gradient incorporates your data to deploy autonomous assistants that power critical operations across your business.

This model extends Llama-3 8B's context length from 8k to over 160k tokens. It was developed by Gradient, with compute sponsored by Crusoe Energy, and demonstrates that SOTA LLMs can learn to operate on long contexts with minimal training (< 200M tokens) by appropriately adjusting RoPE theta.

Approach:

  • meta-llama/Meta-Llama-3-8B-Instruct as the base
  • NTK-aware interpolation [1] to initialize an optimal schedule for RoPE theta, followed by a new data-driven RoPE theta optimization technique (a rough sketch of the initialization follows this list)
  • Progressive training on increasing context lengths, similar to the Large World Model [2] (see the table below)
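
As a rough illustration of the initialization step (a sketch of the standard NTK-aware formula, not Gradient's actual code): RoPE theta is scaled by the context-extension factor raised to d/(d-2). The theta values in the table below were then refined by the data-driven optimization, so they do not follow this formula exactly.

def ntk_scaled_theta(base_theta: float, scale: float, head_dim: int) -> float:
    # NTK-aware interpolation [1]: grow theta so the lowest rotary frequency
    # stretches by `scale` while high-frequency dimensions barely change.
    return base_theta * scale ** (head_dim / (head_dim - 2))

# Llama-3 uses base theta 500,000 and head_dim 128; 8k -> 262k is a 32x extension.
print(ntk_scaled_theta(500_000, 32, 128))  # ~1.7e7, an initial value only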

Infra:

We build on top of the EasyContext Blockwise RingAttention library [3] to train scalably and efficiently on contexts of up to 262,144 tokens on Crusoe Energy's high-performance L40S cluster.

Data:

For training data, we generate long contexts by augmenting SlimPajama.

Progressive Training Details:

Parameter                    65K               262K
Initialize From              LLaMA-3-8B-Inst   65K
Sequence Length              2^16              2^18
RoPE theta                   15.3M             207.1M
Batch Size (Tokens / Step)   2.097M            4.192M
Steps                        30                24
Total Tokens                 63M               101M
Learning Rate                2.00E-05          2.00E-05
# GPUs                       32                32
GPU Type                     NVIDIA L40S       NVIDIA L40S
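
As a quick sanity check on the table, tokens per step times steps reproduces the reported totals:

# 65K stage: 2.097M tokens/step * 30 steps ~= 63M tokens
# 262K stage: 4.192M tokens/step * 24 steps ~= 101M tokens
for stage, tokens_per_step, steps in [("65K", 2.097e6, 30), ("262K", 4.192e6, 24)]:
    print(stage, f"{tokens_per_step * steps / 1e6:.0f}M")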

License: https://choosealicense.com/licenses/llama3

Model URL: https://huggingface.co/shibing624/llama-3-8b-instruct-262k-chinese
