EleutherAI / pile-t5-xxl

Model's Last Updated: April 17 2024
text2text-generation

Pile-T5 XXL is an encoder-decoder model trained on the Pile using the T5x library. The model was trained for 2 million steps, or roughly 2 trillion tokens, using a masked language modeling (MLM) objective similar to that of the original T5 model. The Hugging Face (HF) version of Pile-T5 XXL borrows UMT5's model implementation, because Pile-T5 uses the scalable model implementation from T5x, and it uses LlamaTokenizer.

Model Details
Hyperparameter      Value
n parameters        11135426560
n encoder layers    24
n decoder layers    24
d model             10240
d emb               4096
n heads             64
d head              64
n vocab             32128
Sequence Length     512

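
As a rough guide to hardware requirements, the parameter count in the table can be turned into a weight-memory estimate. The sketch below is only an illustration: it assumes 4 bytes per parameter for float32 and 2 bytes for bfloat16, and it counts the weights alone, not activations, optimizer state, or framework overhead. The helper names are hypothetical.

```python
# Rough weight-memory estimate from the parameter count in the table.
N_PARAMETERS = 11_135_426_560  # "n parameters" from the table

# Standard storage widths; which dtype you load in is your own choice.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_bytes(n_params: int, dtype: str) -> int:
    """Bytes needed to store the weights alone in the given dtype."""
    return n_params * BYTES_PER_PARAM[dtype]


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


for dtype in BYTES_PER_PARAM:
    size = to_gib(weight_bytes(N_PARAMETERS, dtype))
    print(f"{dtype}: {size:.1f} GiB")
```

Under these assumptions the weights occupy roughly 41.5 GiB in float32 and about half that in bfloat16.
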
Uses and limitations
Intended use

Pile-T5 was developed primarily for research purposes. It learns an inner representation of the English language that can be used to extract features useful for downstream tasks.

In addition to scientific uses, you may also further fine-tune and adapt Pile-T5 for deployment, as long as your use is in accordance with the Apache 2.0 license. This model works with the Transformers library. If you decide to use pre-trained Pile-T5 as a basis for your fine-tuned model, please note that you need to conduct your own risk and bias assessment.

Out-of-scope use

Pile-T5 is not intended for deployment as-is. It is not a product and cannot be used for human-facing interactions without supervision.

Pile-T5 has not been fine-tuned for the downstream tasks for which language models are commonly deployed, such as writing genre prose or powering commercial chatbots. This means Pile-T5 will likely not respond to a given prompt the way products such as ChatGPT do. This is because, unlike Pile-T5, ChatGPT was fine-tuned using methods such as Reinforcement Learning from Human Feedback (RLHF) to better “understand” human instructions and dialogue.

This model is English-language only, and thus cannot be used for translation or generating text in other languages.

Limitations and biases

The core functionality of Pile-T5 is to take a string of text that has been partially replaced with mask tokens and predict a sequence of tokens that would replace those mask tokens. Remember that the statistically most likely sequence of tokens need not result in the most “accurate” text. Never rely on Pile-T5 to produce factually accurate output.
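
To make the masked-span format concrete, here is a minimal sketch of how a span-corrupted input and its target can be built from a token list. It assumes T5-style sentinel names (`<extra_id_0>`, `<extra_id_1>`, ...); whether the Pile-T5 tokenizer uses exactly these strings should be checked against its vocabulary. The function `span_corrupt` is hypothetical and not part of any library.

```python
def span_corrupt(tokens, spans):
    """Replace each (start, end) span with a sentinel; return (inputs, targets).

    `spans` must be sorted, non-overlapping, half-open index ranges.
    Sentinel naming follows the T5 convention (an assumption for Pile-T5).
    """
    inputs, targets = [], []
    cursor = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inputs.extend(tokens[cursor:start])  # keep the text before the span
        inputs.append(sentinel)              # the span collapses to one sentinel
        targets.append(sentinel)             # target repeats the sentinel...
        targets.extend(tokens[start:end])    # ...followed by the hidden tokens
        cursor = end
    inputs.extend(tokens[cursor:])
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel ends the target
    return inputs, targets


tokens = "Thank you for inviting me to your party last week".split()
inputs, targets = span_corrupt(tokens, [(2, 4), (8, 9)])
print(" ".join(inputs))   # Thank you <extra_id_0> me to your party <extra_id_1> week
print(" ".join(targets))  # <extra_id_0> for inviting <extra_id_1> last <extra_id_2>
```

The model sees the first string and is trained to produce the second, which is why its raw output is a sequence of sentinel-delimited fragments rather than a fluent continuation.
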

This model was trained on the Pile, a dataset known to contain profanity and texts that are lewd or otherwise offensive. See Section 6 of the Pile paper for a discussion of documented biases with regard to gender, religion, and race. Pile-T5 may produce socially unacceptable or undesirable text, even if the prompt itself does not include anything explicitly offensive.

We recommend curating the outputs of this model before presenting them to a human reader. Please inform your audience that you are using artificially generated text.

How to use

Pile-T5 can be loaded using the AutoModelForSeq2SeqLM functionality:

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Download (or load from the local cache) the tokenizer and model weights.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pile-t5-xxl")
model = AutoModelForSeq2SeqLM.from_pretrained("EleutherAI/pile-t5-xxl")

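
Once loaded, the model can be asked to fill in masked spans with `generate`. The following is a sketch rather than an official recipe: the helper `fill_masks` is hypothetical, the `<extra_id_0>` sentinel string is assumed from the T5 convention, and `max_new_tokens=32` is an arbitrary choice. Calling it downloads the full checkpoint, so it needs substantial memory and disk space.

```python
def fill_masks(text, model_name="EleutherAI/pile-t5-xxl", max_new_tokens=32):
    """Generate replacements for sentinel-marked spans in `text`."""
    # Imported lazily so that defining this helper stays cheap; loading the
    # library and the checkpoint is the expensive part.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep special tokens so the sentinels that delimit each predicted
    # span remain visible in the decoded string.
    return tokenizer.decode(output_ids[0], skip_special_tokens=False)
```

A call such as `fill_masks("The Pile is a large <extra_id_0> for training language models.")` would return the predicted span text; because the model is not fine-tuned, the output should be treated as a raw prediction, not a polished answer.
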
Training
Training dataset

The Pile is an 825 GiB general-purpose dataset in English. It was created by EleutherAI specifically for training large language models. It contains texts from 22 diverse sources, roughly broken down into five categories: academic writing (e.g. arXiv), internet (e.g. CommonCrawl), prose (e.g. Project Gutenberg), dialogue (e.g. YouTube subtitles), and miscellaneous (e.g. GitHub, Enron Emails). See the Pile paper for a breakdown of all data sources, methodology, and a discussion of ethical implications. Consult the datasheet for more detailed documentation about the Pile and its component datasets. The Pile can be downloaded from the official website or from a community mirror.

The Pile was deduplicated before being used to train Pile-T5.

Training procedure

Pile-T5 was trained with a batch size of approximately 1M tokens (2048 sequences of 512 tokens each), for a total of 2,000,000 steps. Pile-T5 was trained with the span-corruption objective.
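
The figures above can be checked with a few lines of arithmetic. This sketch uses only the numbers stated in this section and counts sequence positions per batch, so it is an upper bound that ignores any padding.

```python
SEQUENCES_PER_BATCH = 2048
TOKENS_PER_SEQUENCE = 512
TRAINING_STEPS = 2_000_000

# One optimizer step processes one batch.
tokens_per_step = SEQUENCES_PER_BATCH * TOKENS_PER_SEQUENCE
total_tokens = tokens_per_step * TRAINING_STEPS

print(f"tokens per step: {tokens_per_step:,}")  # 1,048,576 (about 1M)
print(f"total tokens:    {total_tokens:,}")     # 2,097,152,000,000 (about 2T)
```

This matches the "roughly 2 trillion tokens" quoted for the full training run.
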

Training checkpoints

Intermediate checkpoints for Pile-T5 are accessible within this repository. There are 200 checkpoints in total, spaced 10,000 steps apart. For T5x-native checkpoints that can be used for finetuning with the T5x library, refer to here.
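
The checkpoint schedule can be enumerated as follows. This is a sketch under one assumption not stated in the card: that the first checkpoint falls at step 10,000 and the last at the final step, which is consistent with 200 checkpoints covering 2,000,000 steps. It does not guess how checkpoints are named in the repository. `tokens_seen` is a hypothetical helper that reuses the batch size given under "Training procedure".

```python
CHECKPOINT_INTERVAL = 10_000
NUM_CHECKPOINTS = 200
TOKENS_PER_STEP = 2048 * 512  # batch size from "Training procedure"

# Assumed schedule: step 10,000, 20,000, ..., up to the final step.
checkpoint_steps = [CHECKPOINT_INTERVAL * i for i in range(1, NUM_CHECKPOINTS + 1)]


def tokens_seen(step: int) -> int:
    """Approximate number of training tokens processed by `step`."""
    return step * TOKENS_PER_STEP


print(checkpoint_steps[0], checkpoint_steps[-1])  # 10000 2000000
print(f"{tokens_seen(checkpoint_steps[0]):,}")    # 10,485,760,000
```

Mapping a step to tokens seen is useful when comparing intermediate checkpoints against models trained with a different batch size.
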

The training loss (in tfevent format) and validation perplexity (in jsonl) can be found here.

Evaluations

Pile-T5 XXL was evaluated on SuperGLUE and CodeXGLUE. A Flan-finetuned version was evaluated on Flan Held In tasks, MMLU, and BBH. Results can be seen in the blog post (https://blog.eleuther.ai/pile-t5/).

BibTeX
@misc{2024PileT5,
  author  = {Lintang Sutawika and Aran Komatsuzaki and Colin Raffel},
  title   = {Pile-T5},
  year    = {2024},
  url     = {https://blog.eleuther.ai/pile-t5/},
  note    = {Blog post},
}
