google / byt5-base

huggingface.co
Total runs: 62.5K
24-hour runs: 1.2K
7-day runs: 815
30-day runs: 11.9K
Model's Last Updated: January 25 2023
text2text-generation

Introduction of byt5-base

Model Details of byt5-base

ByT5 - Base

ByT5 is a tokenizer-free version of Google's T5 and generally follows the architecture of mT5.

ByT5 was pre-trained only on mC4, without any supervised training, using an average span mask of 20 UTF-8 characters. The model therefore has to be fine-tuned before it is usable on a downstream task.
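To make the span-mask objective more concrete, here is a minimal, simplified sketch of span corruption on raw bytes. It is an illustration under stated assumptions, not the actual pre-training pipeline: the function name `corrupt_spans` is hypothetical, sentinels are written as placeholder strings such as `<S0>` rather than real vocabulary ids, and the spans are fixed and short for readability (pre-training samples spans averaging 20 bytes).

```python
def corrupt_spans(data, spans):
    """Split bytes into (inputs, targets) by masking the given (start, length) spans.

    Sentinels are placeholder strings like "<S0>"; real sentinel ids are
    model-specific and are not reproduced here.
    """
    inputs, targets = [], []
    cursor = 0
    for n, (start, length) in enumerate(sorted(spans)):
        sentinel = f"<S{n}>"
        inputs.extend(data[cursor:start])           # keep the unmasked bytes
        inputs.append(sentinel)                     # mark where a span was removed
        targets.append(sentinel)                    # the same marker opens the span in the target
        targets.extend(data[start:start + length])  # bytes the model has to predict
        cursor = start + length
    inputs.extend(data[cursor:])                    # trailing unmasked bytes
    return inputs, targets


# Mask "he" and "wor" in b"hello world" (non-overlapping spans).
inputs, targets = corrupt_spans(b"hello world", [(0, 2), (6, 3)])
print(inputs)   # ['<S0>', 108, 108, 111, 32, '<S1>', 108, 100]
print(targets)  # ['<S0>', 104, 101, '<S1>', 119, 111, 114]
```

The encoder sees the text with each masked span collapsed into one sentinel, and the decoder is trained to emit the missing bytes after the matching sentinel. Because this objective is purely self-supervised, the checkpoint needs fine-tuning before it is useful on a downstream task.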

ByT5 works especially well on noisy text data; for example, google/byt5-base significantly outperforms mt5-base on TweetQA.

Paper: ByT5: Towards a token-free future with pre-trained byte-to-byte models

Authors: Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel

Example Inference

ByT5 works on raw UTF-8 bytes and can be used without a tokenizer:

from transformers import T5ForConditionalGeneration
import torch

model = T5ForConditionalGeneration.from_pretrained('google/byt5-base')

input_ids = torch.tensor([list("Life is like a box of chocolates.".encode("utf-8"))]) + 3  # add 3 for special tokens
labels = torch.tensor([list("La vie est comme une boîte de chocolat.".encode("utf-8"))]) + 3  # add 3 for special tokens

loss = model(input_ids, labels=labels).loss # forward pass
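
The `+ 3` offset above reserves the lowest ids for special tokens. The following pure-Python sketch spells out that mapping, assuming ids 0, 1, and 2 are pad, end-of-sequence, and unknown, as in the ByT5 tokenizer shipped with transformers. The helper names `encode_bytes` and `decode_bytes` are hypothetical and are not library functions.

```python
# Assumed special-token layout: 0 = pad, 1 = end-of-sequence, 2 = unknown.
PAD_ID, EOS_ID, UNK_ID = 0, 1, 2
OFFSET = 3  # a byte with value b is stored as id b + OFFSET


def encode_bytes(text, add_eos=False):
    """Map a string to ByT5-style ids: UTF-8 bytes shifted by OFFSET."""
    ids = [b + OFFSET for b in text.encode("utf-8")]
    if add_eos:
        ids.append(EOS_ID)
    return ids


def decode_bytes(ids):
    """Invert encode_bytes, skipping any id outside the 256 byte ids."""
    raw = bytes(i - OFFSET for i in ids if OFFSET <= i < OFFSET + 256)
    return raw.decode("utf-8", errors="ignore")


ids = encode_bytes("boîte", add_eos=True)
print(len(ids))           # 7: "î" takes two UTF-8 bytes, plus one EOS id
print(decode_bytes(ids))  # boîte
```

The manual snippet above does not append an end-of-sequence id, whereas the tokenizer class is expected to add one, so the two approaches may produce sequences that differ by that final id.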

For batched inference and training, however, it is recommended to use a tokenizer class for padding:

from transformers import T5ForConditionalGeneration, AutoTokenizer

model = T5ForConditionalGeneration.from_pretrained('google/byt5-base')
tokenizer = AutoTokenizer.from_pretrained('google/byt5-base')

model_inputs = tokenizer(["Life is like a box of chocolates.", "Today is Monday."], padding="longest", return_tensors="pt")
labels = tokenizer(["La vie est comme une boîte de chocolat.", "Aujourd'hui c'est lundi."], padding="longest", return_tensors="pt").input_ids

loss = model(**model_inputs, labels=labels).loss # forward pass
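
As a rough illustration of what padding to the longest sequence involves, the sketch below pads toy id lists by hand and builds the matching attention mask. It assumes the pad id is 0; `pad_batch` and `mask_label_padding` are hypothetical helpers, not part of transformers. Replacing pad ids in the labels with -100 is a common practice rather than something the snippet above does: the cross-entropy loss used by the T5 model classes in transformers ignores label positions set to -100.

```python
PAD_ID = 0           # assumed pad id of the ByT5 tokenizer
IGNORE_INDEX = -100  # label value ignored by the loss in transformers


def pad_batch(sequences, pad_id=PAD_ID):
    """Right-pad id lists to the longest one and build an attention mask."""
    longest = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (longest - len(seq)) for seq in sequences]
    attention_mask = [[1] * len(seq) + [0] * (longest - len(seq)) for seq in sequences]
    return input_ids, attention_mask


def mask_label_padding(labels, pad_id=PAD_ID):
    """Replace pad ids in labels so padded positions do not contribute to the loss."""
    return [[IGNORE_INDEX if tok == pad_id else tok for tok in row] for row in labels]


batch = [[75, 108, 105, 1], [87, 1]]  # toy id lists, each ending in EOS id 1
input_ids, attention_mask = pad_batch(batch)
print(input_ids)                      # [[75, 108, 105, 1], [87, 1, 0, 0]]
print(attention_mask)                 # [[1, 1, 1, 1], [1, 1, 0, 0]]
print(mask_label_padding(input_ids))  # [[75, 108, 105, 1], [87, 1, -100, -100]]
```

With real tensors, the same label masking could be applied to the padded `labels` before the forward pass, so that padding in the shorter target does not affect the loss.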

Abstract

Most widely-used pre-trained language models operate on sequences of tokens corresponding to word or subword units. Encoding text as a sequence of tokens requires a tokenizer, which is typically created as an independent artifact from the model. Token-free models that instead operate directly on raw text (bytes or characters) have many benefits: they can process text in any language out of the box, they are more robust to noise, and they minimize technical debt by removing complex and error-prone text preprocessing pipelines. Since byte or character sequences are longer than token sequences, past work on token-free models has often introduced new model architectures designed to amortize the cost of operating directly on raw text. In this paper, we show that a standard Transformer architecture can be used with minimal modifications to process byte sequences. We carefully characterize the trade-offs in terms of parameter count, training FLOPs, and inference speed, and show that byte-level models are competitive with their token-level counterparts. We also demonstrate that byte-level models are significantly more robust to noise and perform better on tasks that are sensitive to spelling and pronunciation. As part of our contribution, we release a new set of pre-trained byte-level Transformer models based on the T5 architecture, as well as all code and data used in our experiments.


More Information About byt5-base huggingface.co Model

License of byt5-base (Apache 2.0):

https://choosealicense.com/licenses/apache-2.0

byt5-base huggingface.co

byt5-base is hosted on huggingface.co, where the google byt5-base model can be used directly. huggingface.co supports a free trial of the byt5-base model and also offers paid use. The model can be called through an API from Node.js, Python, or HTTP.

byt5-base huggingface.co Url

https://huggingface.co/google/byt5-base


Provider of byt5-base huggingface.co

google