distilbert / distilbert-base-uncased-distilled-squad

huggingface.co
Total runs: 335.3K
24-hour runs: 0
7-day runs: 69.4K
30-day runs: 161.5K
Model's Last Updated: May 06 2024
question-answering

DistilBERT base uncased distilled SQuAD

Model Details

Model Description: The DistilBERT model was proposed in the blog post Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT, and the paper DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. DistilBERT is a small, fast, cheap and light Transformer model trained by distilling BERT base. It has 40% fewer parameters than bert-base-uncased and runs 60% faster while preserving over 95% of BERT's performance as measured on the GLUE language understanding benchmark.
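As a quick check of the parameter claim, the arithmetic can be sketched as follows. The parameter counts (about 110 million for BERT base and 66 million for DistilBERT) are taken from Sanh et al. (2019), not from this model card, and are rounded.

```python
# Parameter counts in millions, as reported in Sanh et al. (2019).
# These figures are an assumption of this sketch; they do not appear in the model card.
BERT_BASE_PARAMS_M = 110
DISTILBERT_PARAMS_M = 66

# Fraction of parameters removed by distillation
reduction = 1 - DISTILBERT_PARAMS_M / BERT_BASE_PARAMS_M

print(f"Parameter reduction: {reduction:.0%}")  # Parameter reduction: 40%
```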

This model is a fine-tuned checkpoint of DistilBERT-base-uncased, fine-tuned using (a second step of) knowledge distillation on SQuAD v1.1.

  • Developed by: Hugging Face
  • Model Type: Transformer-based language model
  • Language(s): English
  • License: Apache 2.0
  • Related Models: DistilBERT-base-uncased
  • Resources for more information:
    • See this repository for more about Distil* (a class of compressed models including this model)
    • See Sanh et al. (2019) for more information about knowledge distillation and the training procedure
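
To give a feel for what knowledge distillation optimizes, here is a minimal sketch of a soft-target loss: the student is penalized by the cross-entropy between the teacher's and the student's temperature-softened output distributions. This is a generic illustration in plain Python, not the training code used for this checkpoint; the function names, the temperature of 2.0, and the toy logits are all assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher and student distributions.

    The temperature value is a hypothetical choice for this sketch.
    """
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))


# Toy logits (hypothetical): one student mimics the teacher, the other does not.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]
far_student = [0.5, 1.0, 4.0]

close_loss = distillation_loss(close_student, teacher)
far_loss = distillation_loss(far_student, teacher)
print(close_loss < far_loss)  # True: the student that mimics the teacher has the lower loss
```

For question answering, a loss of this kind would be applied to the start-position and end-position logits separately; see Sanh et al. (2019) for the procedure actually used.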
How to Get Started with the Model

Use the code below to get started with the model.

>>> from transformers import pipeline
>>> question_answerer = pipeline("question-answering", model='distilbert-base-uncased-distilled-squad')

>>> context = r"""
... Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a
... question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune
... a model on a SQuAD task, you may leverage the examples/pytorch/question-answering/run_squad.py script.
... """

>>> result = question_answerer(question="What is a good example of a question answering dataset?", context=context)
>>> print(
... f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}"
... )

Answer: 'SQuAD dataset', score: 0.4704, start: 147, end: 160
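
The start and end values in this output are character offsets into the context string, so slicing the context with them recovers the answer. The sketch below checks this without loading the model: the result dictionary is copied from the output above, and the context is rebuilt with explicit newlines and single spaces between words so that the offsets are unambiguous.

```python
# Context rebuilt line by line; the leading "\n" mirrors the triple-quoted string above.
context = (
    "\n"
    "Extractive Question Answering is the task of extracting an answer from a text given a question. An example of a\n"
    "question answering dataset is the SQuAD dataset, which is entirely based on that task. If you would like to fine-tune\n"
    "a model on a SQuAD task, you may leverage the examples/pytorch/question-answering/run_squad.py script.\n"
)

# Values copied from the printed pipeline output above (no model call is made here).
result = {"answer": "SQuAD dataset", "score": 0.4704, "start": 147, "end": 160}

# "start" is inclusive and "end" is exclusive, matching Python slice semantics.
span = context[result["start"]:result["end"]]
print(span)  # SQuAD dataset
```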

Here is how to use this model in PyTorch:

from transformers import DistilBertTokenizer, DistilBertForQuestionAnswering
import torch
tokenizer = DistilBertTokenizer.from_pretrained('distilbert-base-uncased-distilled-squad')
model = DistilBertForQuestionAnswering.from_pretrained('distilbert-base-uncased-distilled-squad')

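question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

# Encode the question and the context together as a single input pair
inputs = tokenizer(question, text, return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)

# Most likely start and end token positions of the answer span
answer_start_index = torch.argmax(outputs.start_logits)
answer_end_index = torch.argmax(outputs.end_logits)

# Slice out the answer tokens (end index inclusive) and decode them back to text
predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
tokenizer.decode(predict_answer_tokens)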

And in TensorFlow:

from transformers import DistilBertTokenizer, TFDistilBertForQuestionAnswering
import tensorflow as tf

tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased-distilled-squad")
model = TFDistilBertForQuestionAnswering.from_pretrained("distilbert-base-uncased-distilled-squad")

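question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

# Encode the question and the context together as a single input pair
inputs = tokenizer(question, text, return_tensors="tf")
outputs = model(**inputs)

# Most likely start and end token positions of the answer span
answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])

# Slice out the answer tokens (end index inclusive) and decode them back to text
predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
tokenizer.decode(predict_answer_tokens)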
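
Both snippets above take the argmax of the start and end logits independently, which can produce an end position that precedes the start position and therefore an empty answer. A common alternative is to score every valid (start, end) pair jointly. The sketch below shows that idea in plain Python on toy logits; the function name, the max_answer_len default, and the logit values are hypothetical, and this is not necessarily how the pipeline selects spans.

```python
def best_span(start_logits, end_logits, max_answer_len=15):
    """Return ((start, end), score) maximizing start_logits[start] + end_logits[end]
    subject to start <= end < start + max_answer_len.

    max_answer_len is an assumed default for this sketch.
    """
    best = None
    best_score = float("-inf")
    for start, start_logit in enumerate(start_logits):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_logit + end_logits[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best, best_score


# Toy logits (hypothetical) where the independent argmaxes disagree.
start_logits = [0.1, 0.2, 5.0, 0.3, 4.0]
end_logits = [0.0, 6.0, 0.1, 5.5, 0.2]

# Independent argmax: the end index (1) comes before the start index (2).
naive = (start_logits.index(max(start_logits)), end_logits.index(max(end_logits)))
print(naive)  # (2, 1)

# Joint search only considers spans whose end is at or after the start.
span, score = best_span(start_logits, end_logits)
print(span, score)  # (2, 3) 10.5
```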
Uses

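This model can be used for extractive question answering: given a question and a context passage, it returns the span of the context that it scores as the most likely answer.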

Misuse and Out-of-scope Use

The model should not be used to intentionally create hostile or alienating environments for people. In addition, the model was not trained to produce factual or true representations of people or events, so using it to generate such content is out of scope.

Risks, Limitations and Biases

CONTENT WARNING: Readers should be aware that language generated by this model can be disturbing or offensive to some and can propagate historical and current stereotypes.

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model can include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups. For example:

>>> from transformers import pipeline
>>> question_answerer = pipeline("question-answering", model='distilbert-base-uncased-distilled-squad')

>>> context = r"""
... Alice is sitting on the bench. Bob is sitting next to her.
... """

>>> result = question_answerer(question="Who is the CEO?", context=context)
>>> print(
... f"Answer: '{result['answer']}', score: {round(result['score'], 4)}, start: {result['start']}, end: {result['end']}"
... )

Answer: 'Bob', score: 0.4183, start: 32, end: 35

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
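
One practical consequence of the example above is that the confidence score alone is a weak signal for unsupported answers. The sketch below compares the two scores printed in this card: the "Bob" answer, which the context does not support, scores almost as high as the "SQuAD dataset" answer, which the context states directly. The threshold values are hypothetical and only illustrate how narrow the usable range would be.

```python
# Scores copied from the two pipeline outputs shown in this card (no model call is made here).
grounded = {"answer": "SQuAD dataset", "score": 0.4704}  # answer is stated in its context
ungrounded = {"answer": "Bob", "score": 0.4183}          # context never mentions a CEO


def accepts(result, threshold):
    """Keep an answer only if its score reaches the threshold."""
    return result["score"] >= threshold


gap = grounded["score"] - ungrounded["score"]
print(f"score gap: {gap:.4f}")  # score gap: 0.0521

# Hypothetical thresholds: only a cutoff inside the narrow gap separates the two answers.
for threshold in (0.30, 0.45, 0.60):
    print(threshold, accepts(grounded, threshold), accepts(ungrounded, threshold))
```

Two examples are far too few to tune a threshold on; the point is only that a score filter should not be treated as a safeguard against biased or unsupported predictions.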

Training
Training Data

The distilbert-base-uncased model card describes its training data as:

DistilBERT pretrained on the same data as BERT, which is BookCorpus, a dataset consisting of 11,038 unpublished books and English Wikipedia (excluding lists, tables and headers).

To learn more about the SQuAD v1.1 dataset, see the SQuAD v1.1 data card.

Training Procedure
Preprocessing

See the distilbert-base-uncased model card for further details.

Pretraining

See the distilbert-base-uncased model card for further details.

Evaluation

As discussed in the model repository:

This model reaches an F1 score of 86.9 on the SQuAD v1.1 dev set (for comparison, the BERT bert-base-uncased version reaches an F1 score of 88.5).
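
For readers unfamiliar with the metric, SQuAD F1 is a token-overlap score between the predicted and reference answers, computed after light text normalization. The sketch below follows the commonly used SQuAD v1.1 evaluation logic for a single prediction; it is an illustration rather than the official script, and the example strings are hypothetical.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, strip punctuation, drop English articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def f1_score(prediction, ground_truth):
    """Token-level F1 between one predicted answer and one reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(ground_truth).split()
    common = Counter(pred_tokens) & Counter(truth_tokens)  # multiset intersection
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


exact = f1_score("SQuAD dataset", "the SQuAD dataset")  # articles are ignored
partial = f1_score("SQuAD dataset", "SQuAD")            # one extra predicted token
print(exact)              # 1.0
print(round(partial, 4))  # 0.6667
```

The dataset-level figure is obtained by taking, for each question, the best F1 over its reference answers, then averaging over all questions and scaling to 0-100.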

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019). We present the hardware type and hours used based on the associated paper. Note that these details are just for training DistilBERT, not including the fine-tuning with SQuAD.

  • Hardware Type: 8 × 16GB V100 GPUs
  • Hours used: 90 hours
  • Cloud Provider: Unknown
  • Compute Region: Unknown
  • Carbon Emitted: Unknown
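
The figures listed above fix the compute budget at 8 GPUs for 90 hours. Estimating emissions from that, in the spirit of the calculator mentioned above, also requires the power draw per GPU and the carbon intensity of the electricity grid, neither of which is reported. The sketch below keeps those as explicit parameters; the function names and the example values passed for them are illustrative assumptions, not data about this model.

```python
# Reported in this card
GPU_COUNT = 8
TRAINING_HOURS = 90


def gpu_hours(gpu_count, hours):
    """Total accelerator time in GPU-hours."""
    return gpu_count * hours


def estimate_co2_kg(gpu_count, hours, gpu_power_kw, carbon_intensity_kg_per_kwh, pue=1.0):
    """Rough estimate: energy used (kWh) times grid carbon intensity (kg CO2eq per kWh).

    gpu_power_kw, carbon_intensity_kg_per_kwh and pue (datacenter overhead factor)
    must be supplied by the caller; the card lists provider and region as unknown.
    """
    energy_kwh = gpu_hours(gpu_count, hours) * gpu_power_kw * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


total_gpu_hours = gpu_hours(GPU_COUNT, TRAINING_HOURS)
print(total_gpu_hours)  # 720

# Purely illustrative inputs (NOT from the model card or the paper):
# 0.3 kW per GPU and 0.4 kg CO2eq per kWh.
illustrative_kg = estimate_co2_kg(
    GPU_COUNT, TRAINING_HOURS, gpu_power_kw=0.3, carbon_intensity_kg_per_kwh=0.4
)
print(f"{illustrative_kg:.1f} kg CO2eq under the illustrative assumptions")
```

Only the 720 GPU-hours follow from the card; any emissions number depends entirely on the assumed power and grid values.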
Technical Specifications

See the associated paper for details on the modeling architecture, objective, compute infrastructure, and training details.

Citation Information
@inproceedings{sanh2019distilbert,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},
  booktitle={NeurIPS EMC^2 Workshop},
  year={2019}
}

APA:

  • Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
Model Card Authors

This model card was written by the Hugging Face team.


More Information About distilbert-base-uncased-distilled-squad huggingface.co Model

License details for distilbert-base-uncased-distilled-squad:

https://choosealicense.com/licenses/apache-2.0

distilbert-base-uncased-distilled-squad huggingface.co Url

https://huggingface.co/distilbert/distilbert-base-uncased-distilled-squad
