For the first half of training, the model was trained on a small part of the full dataset (1%, 3 GB) and without prefixes for each task.
For Russian SuperGLUE (RSG), we trained as described in the T5 paper: first we trained a single multitask model on all tasks, then we took the best checkpoint for each task and fine-tuned it further (a minimal sketch of this second stage follows the submission link below).
RSG submission: https://russiansuperglue.com/login/submit_info/1936
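A rough sketch of that second, single-task stage (not from the original card: the checkpoint path, task data format, and hyperparameters below are placeholders):

import torch
from transformers import GPT2Tokenizer, T5ForConditionalGeneration

tokenizer = GPT2Tokenizer.from_pretrained('ai-forever/FRED-T5-1.7B', eos_token='</s>')
# 'multitask_checkpoint' is a hypothetical local path to the best multitask checkpoint
model = T5ForConditionalGeneration.from_pretrained('multitask_checkpoint').to('cuda')
model.train()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Toy stand-in for one RSG task; real data comes from russiansuperglue.com
pairs = [('terra premise: ... hypothesis: ...', 'entailment')]  # hypothetical text-to-text format
for text, target in pairs:
    input_ids = torch.tensor([tokenizer.encode(text)]).to('cuda')
    labels = torch.tensor([tokenizer.encode(target)]).to('cuda')
    loss = model(input_ids=input_ids, labels=labels).loss  # seq2seq cross-entropy
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()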
Total training time was around 45 days on 112 A100 GPUs.
Usage (HuggingFace Models Repository)
import torch
from transformers import GPT2Tokenizer, T5ForConditionalGeneration

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = GPT2Tokenizer.from_pretrained('ai-forever/FRED-T5-1.7B', eos_token='</s>')
model = T5ForConditionalGeneration.from_pretrained('ai-forever/FRED-T5-1.7B')

# Move the model to GPU (assumes CUDA is available)
device = 'cuda'
model.to(device)
# Prefix <LM>: continue the text (language modeling)
lm_text = '<LM>Принялся Кутузов рассказывать свою историю как он сюда попал. Началось'
# ("Kutuzov began to tell his story of how he got here. It began")
input_ids = torch.tensor([tokenizer.encode(lm_text)]).to(device)
outputs = model.generate(input_ids, eos_token_id=tokenizer.eos_token_id, early_stopping=True)
print(tokenizer.decode(outputs[0][1:]))  # outputs[0][1:] drops the leading decoder pad token
# print result: 'с того, что он был в армии, служил в артиллерии</s>'
# ("with him being in the army, serving in the artillery")
# Prefix <SC1>: infill the masked span <extra_id_0>
lm_text = '<SC1>Принялся Кутузов рассказывать свою историю <extra_id_0>. Началось с того, что он был в армии, служил в артиллерии.'
input_ids = torch.tensor([tokenizer.encode(lm_text)]).to(device)
outputs = model.generate(input_ids, eos_token_id=tokenizer.eos_token_id, early_stopping=True)
print(tokenizer.decode(outputs[0][1:]))
# print result: '<extra_id_0>, как он воевал</s>'  ("<extra_id_0>, how he fought")
# Prefix <SC5>: another span-corruption denoiser, same masked input
lm_text = '<SC5>Принялся Кутузов рассказывать свою историю <extra_id_0>. Началось с того, что он был в армии, служил в артиллерии.'
input_ids = torch.tensor([tokenizer.encode(lm_text)]).to(device)
outputs = model.generate(input_ids, eos_token_id=tokenizer.eos_token_id, early_stopping=True)
print(tokenizer.decode(outputs[0][1:]))
# print result: '<extra_id_0>, как он стал генералом</s>'  ("<extra_id_0>, how he became a general")
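The three examples above differ only in the denoiser prefix; a small helper (generate_with_prefix is a name introduced here, not part of the original card) avoids the repetition:

def generate_with_prefix(prefix, text, max_new_tokens=50):
    # Prepend a denoiser prefix such as <LM> or <SC1> and generate a completion
    input_ids = torch.tensor([tokenizer.encode(prefix + text)]).to(device)
    outputs = model.generate(input_ids,
                             eos_token_id=tokenizer.eos_token_id,
                             max_new_tokens=max_new_tokens,
                             early_stopping=True)
    return tokenizer.decode(outputs[0][1:])  # drop the leading decoder pad token

# Same infilling query as above, routed through the helper
print(generate_with_prefix('<SC5>', 'Принялся Кутузов рассказывать свою историю <extra_id_0>.'))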
Citation
@misc{zmitrovich2023family,
    title={A Family of Pretrained Transformer Language Models for Russian},
    author={Dmitry Zmitrovich and Alexander Abramov and Andrey Kalmykov and Maria Tikhonova and Ekaterina Taktasheva and Danil Astafurov and Mark Baushenko and Artem Snegirev and Tatiana Shavrina and Sergey Markov and Vladislav Mikhailov and Alena Fenogenova},
    year={2023},
    eprint={2309.10931},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}