distilroberta-base huggingface.co api & distilbert distilroberta-base github AI Model

Introduction of distilroberta-base

Model Details of distilroberta-base

Model Card for DistilRoBERTa base

Model Details
Uses
Bias, Risks, and Limitations
Training Details
Evaluation
Environmental Impact
Citation
How to Get Started With the Model

Model Details

Model Description

This model is a distilled version of the RoBERTa-base model. It follows the same training procedure as DistilBERT. The code for the distillation process can be found here. This model is case-sensitive: it distinguishes between english and English.

The model has 6 layers, a hidden dimension of 768, and 12 attention heads, for a total of 82M parameters (compared to 125M parameters for RoBERTa-base). On average, DistilRoBERTa is twice as fast as RoBERTa-base.
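
The parameter counts above can be roughly reproduced from the architecture. The sketch below is an illustration, not the authors' own accounting. The vocabulary size (50,265), feed-forward width (3,072), and 514 position embeddings are assumptions taken from the standard RoBERTa-base configuration and are not stated in this card. The count covers the encoder and pooler only, without a language-modeling head, and the function name `encoder_params` is hypothetical.

```python
def encoder_params(num_layers, hidden=768, vocab=50265, ffn=3072, positions=514):
    """Approximate parameter count of a RoBERTa-style encoder (no LM head)."""
    layer_norm = 2 * hidden  # scale and bias
    # Token, position, and a single token-type embedding, followed by LayerNorm.
    embeddings = vocab * hidden + positions * hidden + hidden + layer_norm
    # Query, key, value, and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Two dense layers: hidden -> ffn and ffn -> hidden.
    feed_forward = (hidden * ffn + ffn) + (ffn * hidden + hidden)
    # Each layer applies LayerNorm after attention and after the feed-forward block.
    per_layer = attention + feed_forward + 2 * layer_norm
    pooler = hidden * hidden + hidden
    return embeddings + num_layers * per_layer + pooler


distil = encoder_params(num_layers=6)    # DistilRoBERTa: 6 layers
teacher = encoder_params(num_layers=12)  # RoBERTa-base: 12 layers (assumed)

print(f"DistilRoBERTa: {distil / 1e6:.1f}M parameters")
print(f"RoBERTa-base:  {teacher / 1e6:.1f}M parameters")
print(f"size ratio:    {distil / teacher:.2f}")
```

Under these assumptions the estimates land close to the 82M and 125M figures quoted above. Halving the number of layers removes only about a third of the parameters because the embedding matrix is the same size in both models.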

We encourage users of this model card to check out the RoBERTa-base model card to learn more about usage, limitations and potential biases.

Developed by: Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf (Hugging Face)
Model type: Transformer-based language model
Language(s) (NLP): English
License: Apache 2.0
Related Models: RoBERTa-base model card
Resources for more information:
- GitHub Repository
- Associated Paper

Uses

Direct Use and Downstream Use

You can use the raw model for masked language modeling, but it's mostly intended to be fine-tuned on a downstream task. See the model hub to look for fine-tuned versions on a task that interests you.

Note that this model is primarily aimed at being fine-tuned on tasks that use the whole sentence (potentially masked) to make decisions, such as sequence classification, token classification or question answering. For tasks such as text generation, you should look at a model like GPT2.

Out of Scope Use

The model should not be used to intentionally create hostile or alienating environments for people. The model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope for its abilities.

Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups. For example:

>>> from transformers import pipeline
>>> unmasker = pipeline('fill-mask', model='distilroberta-base')
>>> unmasker("The man worked as a <mask>.")
[{'score': 0.1237526461482048,
  'sequence': 'The man worked as a waiter.',
  'token': 38233,
  'token_str': ' waiter'},
 {'score': 0.08968018740415573,
  'sequence': 'The man worked as a waitress.',
  'token': 35698,
  'token_str': ' waitress'},
 {'score': 0.08387645334005356,
  'sequence': 'The man worked as a bartender.',
  'token': 33080,
  'token_str': ' bartender'},
 {'score': 0.061059024184942245,
  'sequence': 'The man worked as a mechanic.',
  'token': 25682,
  'token_str': ' mechanic'},
 {'score': 0.03804653510451317,
  'sequence': 'The man worked as a courier.',
  'token': 37171,
  'token_str': ' courier'}]
  
>>> unmasker("The woman worked as a <mask>.")
[{'score': 0.23149248957633972,
  'sequence': 'The woman worked as a waitress.',
  'token': 35698,
  'token_str': ' waitress'},
 {'score': 0.07563332468271255,
  'sequence': 'The woman worked as a waiter.',
  'token': 38233,
  'token_str': ' waiter'},
 {'score': 0.06983394920825958,
  'sequence': 'The woman worked as a bartender.',
  'token': 33080,
  'token_str': ' bartender'},
 {'score': 0.05411609262228012,
  'sequence': 'The woman worked as a nurse.',
  'token': 9008,
  'token_str': ' nurse'},
 {'score': 0.04995106905698776,
  'sequence': 'The woman worked as a maid.',
  'token': 29754,
  'token_str': ' maid'}]
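
One way to make the contrast between the two prompts explicit is to compare the predicted occupations directly. The sketch below hard-codes the scores from the two outputs above (rounded to four decimals) rather than calling the model, and the helper name `compare_predictions` is hypothetical.

```python
# Top-5 scores copied from the two outputs above, rounded to 4 decimals.
man = {"waiter": 0.1238, "waitress": 0.0897, "bartender": 0.0839,
       "mechanic": 0.0611, "courier": 0.0380}
woman = {"waitress": 0.2315, "waiter": 0.0756, "bartender": 0.0698,
         "nurse": 0.0541, "maid": 0.0500}


def compare_predictions(a, b):
    """Return occupations unique to each prompt and score gaps for shared ones."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    # Positive gap: the occupation scores higher for prompt b than for prompt a.
    gaps = {job: round(b[job] - a[job], 4) for job in sorted(set(a) & set(b))}
    return only_a, only_b, gaps


only_man, only_woman, gaps = compare_predictions(man, woman)
print("only for 'man':  ", only_man)
print("only for 'woman':", only_woman)
print("score gap (woman - man):", gaps)
```

With these numbers, "mechanic" and "courier" appear only for "man", "nurse" and "maid" appear only for "woman", and "waitress" scores about 0.14 higher for "woman".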

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Training Details

DistilRoBERTa was pre-trained on OpenWebTextCorpus, a reproduction of OpenAI's WebText dataset (roughly 4 times less training data than was used for the teacher RoBERTa). See the roberta-base model card for further details on training.

Evaluation

When fine-tuned on downstream tasks, this model achieves the following results (see GitHub Repo):

GLUE test results:

Task	MNLI	QQP	QNLI	SST-2	CoLA	STS-B	MRPC	RTE
Score	84.0	89.4	90.8	92.5	59.3	88.3	86.6	67.9
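
For a quick single-number summary, the table can be averaged across tasks. This is only a rough sketch: the card does not say which metric each task reports, and GLUE tasks commonly use different metrics, so the unweighted mean below is an illustration rather than an official GLUE score.

```python
from statistics import mean

# Scores copied from the GLUE table above.
glue = {"MNLI": 84.0, "QQP": 89.4, "QNLI": 90.8, "SST-2": 92.5,
        "CoLA": 59.3, "STS-B": 88.3, "MRPC": 86.6, "RTE": 67.9}

average = mean(glue.values())
weakest = min(glue, key=glue.get)
strongest = max(glue, key=glue.get)

print(f"unweighted average: {average:.2f}")
print(f"lowest:  {weakest} ({glue[weakest]})")
print(f"highest: {strongest} ({glue[strongest]})")
```

The spread between the weakest task (CoLA) and the strongest (SST-2) is large, so the per-task numbers are more informative than the mean.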

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

Hardware Type: More information needed
Hours used: More information needed
Cloud Provider: More information needed
Compute Region: More information needed
Carbon Emitted: More information needed

Citation

@article{Sanh2019DistilBERTAD,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Victor Sanh and Lysandre Debut and Julien Chaumond and Thomas Wolf},
  journal={ArXiv},
  year={2019},
  volume={abs/1910.01108}
}

APA

Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.

How to Get Started With the Model

You can use the model directly with a pipeline for masked language modeling:

>>> from transformers import pipeline
>>> unmasker = pipeline('fill-mask', model='distilroberta-base')
>>> unmasker("Hello I'm a <mask> model.")
[{'score': 0.04673689603805542,
  'sequence': "Hello I'm a business model.",
  'token': 265,
  'token_str': ' business'},
 {'score': 0.03846118599176407,
  'sequence': "Hello I'm a freelance model.",
  'token': 18150,
  'token_str': ' freelance'},
 {'score': 0.03308931365609169,
  'sequence': "Hello I'm a fashion model.",
  'token': 2734,
  'token_str': ' fashion'},
 {'score': 0.03018997237086296,
  'sequence': "Hello I'm a role model.",
  'token': 774,
  'token_str': ' role'},
 {'score': 0.02111748233437538,
  'sequence': "Hello I'm a Playboy model.",
  'token': 24526,
  'token_str': ' Playboy'}]
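
The pipeline returns a list of dictionaries, so the predictions can be post-processed with plain Python. The sketch below works on the first two entries of the output above, copied by hand, instead of calling the model. The helper name `summarize` and its `min_score` parameter are hypothetical.

```python
# First two entries of the output above, copied verbatim.
results = [
    {"score": 0.04673689603805542, "sequence": "Hello I'm a business model.",
     "token": 265, "token_str": " business"},
    {"score": 0.03846118599176407, "sequence": "Hello I'm a freelance model.",
     "token": 18150, "token_str": " freelance"},
]


def summarize(results, min_score=0.0):
    """Return (word, score) pairs, best first, dropping fills below min_score."""
    kept = [r for r in results if r["score"] >= min_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    # token_str carries a leading space in the output above, so strip it.
    return [(r["token_str"].strip(), round(r["score"], 4)) for r in kept]


print(summarize(results))
print(summarize(results, min_score=0.04))
```

Stripping `token_str` matters when the predicted word is compared against a word list, because the leading space would otherwise make the match fail.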

Runs of distilbert distilroberta-base on huggingface.co

Total runs: 2.8M
24-hour runs: 17.7K
3-day runs: 5.5K
7-day runs: -8.6K
30-day runs: 384.0K

More Information About distilroberta-base huggingface.co Model

For the distilroberta-base license, see:

https://choosealicense.com/licenses/apache-2.0

distilroberta-base huggingface.co

distilroberta-base is hosted on huggingface.co, where the distilbert distilroberta-base model can be used directly. huggingface.co offers a free trial of the model as well as paid use. The model can be called through an API from Node.js, Python, and HTTP.

distilroberta-base huggingface.co Url

https://huggingface.co/distilbert/distilroberta-base

huggingface.co

distilbert/distilbert-base-uncased

Total runs: 14.1M

Run Growth: 1.5M

Growth Rate: 10.98%

Updated: May 6, 2024

huggingface.co

distilbert/distilbert-base-uncased-finetuned-sst-2-english

Total runs: 9.4M

Run Growth: 3.2M

Growth Rate: 34.38%

Updated: December 20, 2023

huggingface.co

distilbert/distilgpt2

Total runs: 3.5M

Run Growth: 453.1K

Growth Rate: 12.85%

Updated: February 19, 2024

huggingface.co

distilbert/distilbert-base-multilingual-cased

Total runs: 473.8K

Run Growth: -120.1K

Growth Rate: -25.35%

Updated: May 6, 2024

huggingface.co

distilbert/distilbert-base-cased

Total runs: 441.3K

Run Growth: -875.5K

Growth Rate: -198.39%

Updated: May 6, 2024

huggingface.co

distilbert/distilbert-base-cased-distilled-squad

Total runs: 371.1K

Run Growth: 155.3K

Growth Rate: 41.85%

Updated: May 6, 2024

huggingface.co

distilbert/distilbert-base-uncased-distilled-squad

Total runs: 158.8K

Run Growth: 25.0K

Growth Rate: 14.54%

Updated: May 6, 2024

huggingface.co

distilbert/distilbert-base-german-cased

Total runs: 29.2K

Run Growth: -3.4K

Growth Rate: -11.80%

Updated: May 6, 2024
