ai-forever / ruGPT-3.5-13B

huggingface.co
Total runs: 3.4K
24-hour runs: 0
7-day runs: 128
30-day runs: 975
Model last updated: December 5, 2023
text-generation

Model Details of ruGPT-3.5-13B

🗿 ruGPT-3.5 13B

A language model for Russian. The model has 13B parameters, as you can guess from its name. This is our biggest model so far, and it was used for training GigaChat (read more about it in the article).

Dataset

The model was pretrained on 300 GB of text from various domains, then additionally trained on 100 GB of code and legal documents.

The training data was deduplicated: each text in the corpus was hashed with a 64-bit hash, and only texts with a unique hash were kept. We also filtered documents by their zlib compression ratio, discarding the most strongly and the most weakly compressing of the deduplicated texts.
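The described pipeline can be sketched in a few lines of standard-library Python. The choice of hash function and the compression-ratio thresholds below are illustrative assumptions, not the values used for ruGPT-3.5:

```python
import hashlib
import zlib

def dedup_and_filter(texts, min_ratio=1.5, max_ratio=8.0):
    """Keep one copy per 64-bit hash, then drop texts whose zlib
    compression ratio (raw bytes / compressed bytes) is extreme.
    Hash choice and ratio thresholds are illustrative assumptions."""
    seen, kept = set(), []
    for text in texts:
        data = text.encode("utf-8")
        # 64-bit hash: first 8 bytes of SHA-256
        digest = hashlib.sha256(data).digest()[:8]
        if digest in seen:  # exact duplicate -> skip
            continue
        seen.add(digest)
        ratio = len(data) / len(zlib.compress(data))
        if min_ratio <= ratio <= max_ratio:  # discard both extremes
            kept.append(text)
    return kept
```

Texts that compress too well tend to be highly repetitive boilerplate, while texts that barely compress are often binary junk or noise; filtering both ends removes low-quality documents cheaply.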

Technical details

The model was trained with the DeepSpeed and Megatron libraries on a dataset of 300B tokens for 3 epochs, which took around 45 days on 512 V100 GPUs. It was then fine-tuned for 1 epoch with a sequence length of 2048 on the additional data (see above), which took around 20 days on 200 A100 GPUs.

After the final training, the model's perplexity on Russian text was around 8.8.
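For reference, perplexity is the exponential of the average per-token negative log-likelihood. A minimal sketch of the computation (the function name is mine):

```python
import math

def perplexity(token_logprobs):
    """Perplexity from the natural-log probabilities that the model
    assigned to each token of a held-out text."""
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)

# A model that always assigns probability 1/2 has perplexity ~2.
print(perplexity([math.log(0.5)] * 4))
```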

Examples of usage

Try different generation strategies to achieve better results.
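The snippets below assume `tokenizer` and `model` are already loaded and on the GPU. A minimal loading sketch, assuming the `transformers` library, PyTorch, and a CUDA GPU with roughly 26 GB of free memory for the fp16 weights:

```python
# Loading sketch for the examples below.
MODEL_ID = "ai-forever/ruGPT-3.5-13B"

def load(device: str = "cuda:0"):
    # Imports are local so the sketch reads standalone even
    # without the libraries installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16
    ).to(device)
    return tokenizer, model
```

Usage: `tokenizer, model = load()`.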

request = "Стих про программиста может быть таким:"

encoded_input = tokenizer(request, return_tensors='pt', \
                          add_special_tokens=False).to('cuda:0')
output = model.generate(
    **encoded_input,
    num_beams=2,
    do_sample=True,
    max_new_tokens=100
)

print(tokenizer.decode(output[0], skip_special_tokens=True))
>>> Стих про программиста может быть таким:

    Программист сидит в кресле,
    Стих сочиняет он про любовь,
    Он пишет, пишет, пишет, пишет...
    И не выходит ни черта!
request = "Нейронная сеть — это"

encoded_input = tokenizer(request, return_tensors='pt', \
                          add_special_tokens=False).to('cuda:0')
output = model.generate(
    **encoded_input,
    num_beams=4,
    do_sample=True,
    max_new_tokens=100
)

print(tokenizer.decode(output[0], skip_special_tokens=True))
>>> Нейронная сеть — это математическая модель, состоящая из большого
    количества нейронов, соединенных между собой электрическими связями.
    Нейронная сеть может быть смоделирована на компьютере, и с ее помощью
    можно решать задачи, которые не поддаются решению с помощью традиционных
    математических методов.
request = "Гагарин полетел в космос в"

encoded_input = tokenizer(request, return_tensors='pt', \
                          add_special_tokens=False).to('cuda:0')
output = model.generate(
    **encoded_input,
    num_beams=2,
    do_sample=True,
    max_new_tokens=100
)

print(tokenizer.decode(output[0], skip_special_tokens=True))
>>> Гагарин полетел в космос в 1961 году. Это было первое в истории
    человечества космическое путешествие. Юрий Гагарин совершил его
    на космическом корабле Восток-1. Корабль был запущен с космодрома
    Байконур.


More Information About the ruGPT-3.5-13B huggingface.co Model

ruGPT-3.5-13B is released under the MIT license:

https://choosealicense.com/licenses/mit

ruGPT-3.5-13B on huggingface.co

ruGPT-3.5-13B is hosted on huggingface.co, where it can be used instantly. huggingface.co offers a free trial of the model as well as paid usage, and the model can be called through an API from Node.js, Python, or plain HTTP.
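A minimal sketch of the HTTP route using only the Python standard library. The endpoint and payload follow the Hugging Face Inference API's standard text-generation schema; availability of this particular model on the free endpoint is an assumption:

```python
import json
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/ai-forever/ruGPT-3.5-13B"

def build_request(prompt: str, hf_token: str) -> urllib.request.Request:
    """Build a POST request for the Hugging Face Inference API."""
    payload = json.dumps(
        {"inputs": prompt, "parameters": {"max_new_tokens": 100}}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {hf_token}",
            "Content-Type": "application/json",
        },
    )

# To actually call the API (needs a valid token and network access):
# with urllib.request.urlopen(build_request("Нейронная сеть — это", "hf_...")) as r:
#     print(json.load(r))
```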

ai-forever ruGPT-3.5-13B online free

huggingface.co is an online trial and API platform that integrates ruGPT-3.5-13B, including API services, and provides a free online trial of the model. You can try ruGPT-3.5-13B for free at the link below.

ai-forever ruGPT-3.5-13B free online URL on huggingface.co:

https://huggingface.co/ai-forever/ruGPT-3.5-13B

ruGPT-3.5-13B install

ruGPT-3.5-13B is an open-source model that any user can download and install for free. huggingface.co also hosts the model, so users can try and debug it directly, and the API can likewise be used at no cost.

ruGPT-3.5-13B install URL on huggingface.co:

https://huggingface.co/ai-forever/ruGPT-3.5-13B

URL of ruGPT-3.5-13B

ruGPT-3.5-13B huggingface.co URL:

https://huggingface.co/ai-forever/ruGPT-3.5-13B

Provider of ruGPT-3.5-13B on huggingface.co

ai-forever
ORGANIZATIONS
