The model was trained by the SberDevices team with a sequence length of 1024, using the transformers library, on 80B tokens for 3 epochs. After that, the model was fine-tuned for 1 epoch with a sequence length of 2048.
Total training time was around 14 days on 128 GPUs for 1024 context and a few days on 16 GPUs for 2048 context.
The final perplexity on the test set is 13.6.
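To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It rests on assumptions the card does not state: that each of the 3 epochs is one full pass over the 80B tokens, that "around 14 days" means exactly 14 days of wall-clock time on all 128 GPUs, and that the reported perplexity is the exponential of the mean per-token cross-entropy measured in nats (the usual definition). The variable names are illustrative only.

```python
import math

# Figures quoted in the model card for the 1024-context stage.
dataset_tokens = 80e9     # "80B tokens"
epochs = 3                # "3 epochs"
days = 14                 # "around 14 days" (taken as exact here)
gpus = 128                # "128 GPUs"
test_perplexity = 13.6    # final test-set perplexity

# Assumption: one epoch = one full pass over the 80B-token dataset.
tokens_seen = dataset_tokens * epochs

# Total compute budget in GPU-days, and the implied average throughput.
gpu_days = days * gpus
tokens_per_gpu_second = tokens_seen / (gpu_days * 24 * 3600)

# Assumption: perplexity = exp(mean per-token cross-entropy in nats),
# so the loss is recovered by taking the logarithm.
loss_nats = math.log(test_perplexity)
loss_bits = math.log2(test_perplexity)

print(f"tokens processed:      {tokens_seen:.3g}")
print(f"GPU-days:              {gpu_days}")
print(f"tokens per GPU-second: {tokens_per_gpu_second:.0f}")
print(f"cross-entropy:         {loss_nats:.3f} nats = {loss_bits:.3f} bits per token")
```

These are order-of-magnitude estimates only: the 2048-context fine-tuning stage is excluded because the card gives its duration only as "a few days".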
@misc{zmitrovich2023family,
title={A Family of Pretrained Transformer Language Models for Russian},
author={Dmitry Zmitrovich and Alexander Abramov and Andrey Kalmykov and Maria Tikhonova and Ekaterina Taktasheva and Danil Astafurov and Mark Baushenko and Artem Snegirev and Tatiana Shavrina and Sergey Markov and Vladislav Mikhailov and Alena Fenogenova},
year={2023},
eprint={2309.10931},
archivePrefix={arXiv},
primaryClass={cs.CL}
}