castorini / afriberta_small

huggingface.co
Total runs: 230
24-hour runs: -1
7-day runs: 91
30-day runs: 66
Model's Last Updated: June 15 2022
fill-mask

Model Details of afriberta_small

language:

  • om
  • am
  • rw
  • rn
  • ha
  • ig
  • pcm
  • so
  • sw
  • ti
  • yo
  • multilingual

afriberta_small

Model description

AfriBERTa small is a pretrained multilingual language model with around 97 million parameters. It has 4 layers, 6 attention heads, 768 hidden units, and a feed-forward size of 3072. The model was pretrained on 11 African languages: Afaan Oromoo (also called Oromo), Amharic, Gahuza (a mixed language containing Kinyarwanda and Kirundi, which are counted separately), Hausa, Igbo, Nigerian Pidgin, Somali, Swahili, Tigrinya, and Yorùbá. It has been shown to achieve competitive downstream performance on text classification and named entity recognition in several African languages, including some it was not pretrained on.
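To make the stated dimensions concrete, here is a rough back-of-the-envelope sketch. It assumes a standard Transformer encoder layer (query, key, value, and output projections with biases, a two-layer feed-forward block, and two layer norms); that layout is an assumption for illustration, not something the model card specifies, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderShape:
    """Hyperparameters as stated in the model description."""
    num_layers: int = 4
    num_heads: int = 6
    hidden_size: int = 768
    ffn_size: int = 3072


def head_dim(shape: EncoderShape) -> int:
    # Each attention head works on an equal slice of the hidden vector.
    assert shape.hidden_size % shape.num_heads == 0
    return shape.hidden_size // shape.num_heads


def params_per_layer(shape: EncoderShape) -> int:
    # ASSUMPTION: standard encoder layer with biases everywhere.
    h, f = shape.hidden_size, shape.ffn_size
    attention = 4 * (h * h + h)            # Q, K, V, and output projections
    feed_forward = (h * f + f) + (f * h + h)
    layer_norms = 2 * (2 * h)              # scale and bias, two norms
    return attention + feed_forward + layer_norms


shape = EncoderShape()
print("dimension per head:", head_dim(shape))
print("parameters per layer:", params_per_layer(shape))
print("parameters in all layers:", shape.num_layers * params_per_layer(shape))
```

Under this assumption the four encoder layers account for only part of the roughly 97 million parameters; the rest would sit outside the layers, for example in the embeddings.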

Intended uses & limitations
How to use

You can use this model with Transformers for any downstream task. For example, to fine-tune this model on a token classification task, load it as follows:

>>> from transformers import AutoTokenizer, AutoModelForTokenClassification
>>> model = AutoModelForTokenClassification.from_pretrained("castorini/afriberta_small")
>>> tokenizer = AutoTokenizer.from_pretrained("castorini/afriberta_small")
>>> # The maximum length must be set manually because the tokenizer is an imported, pretrained SentencePiece model, which Hugging Face did not fully support at the time of writing.
>>> tokenizer.model_max_length = 512
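When fine-tuning for token classification, word-level labels usually have to be aligned with the subword pieces the tokenizer produces. The following is a minimal sketch of one common approach, not the authors' procedure. It assumes you already have a list of word indices per token, such as the one a fast tokenizer's `word_ids()` returns; the helper name `align_labels` and the example data are hypothetical.

```python
from typing import List, Optional

# Label value that PyTorch's CrossEntropyLoss ignores by default.
IGNORE_INDEX = -100


def align_labels(
    word_ids: List[Optional[int]],
    word_labels: List[int],
    label_all_pieces: bool = False,
) -> List[int]:
    """Expand word-level labels to one label per subword token.

    word_ids holds, for each token, the index of the word it came from,
    or None for special tokens. By default only the first piece of each
    word keeps its label; later pieces are masked out of the loss.
    """
    aligned: List[int] = []
    previous: Optional[int] = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(IGNORE_INDEX)            # special token
        elif word_id != previous:
            aligned.append(word_labels[word_id])    # first piece of a word
        elif label_all_pieces:
            aligned.append(word_labels[word_id])    # repeat label on later pieces
        else:
            aligned.append(IGNORE_INDEX)            # mask later pieces
        previous = word_id
    return aligned


# Hypothetical example: three words, the second split into two pieces.
example_word_ids = [None, 0, 1, 1, 2, None]
example_labels = [0, 1, 0]
print(align_labels(example_word_ids, example_labels))
```

Masking every piece after the first keeps the loss at one prediction per word; passing `label_all_pieces=True` instead repeats the label on every piece.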
Limitations and bias
  • This model may be limited by its training data, which comes mainly from news articles covering a specific span of time. It may therefore not generalize well.
  • This model was trained on very little data (less than 1 GB), so it may not have seen enough text to learn very complex linguistic relations.
Training data

The model was trained on an aggregation of datasets from the BBC news website and Common Crawl.

Training procedure

For information on training procedures, please refer to the AfriBERTa paper or repository.

BibTeX entry and citation info
@inproceedings{ogueji-etal-2021-small,
    title = "Small Data? No Problem! Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages",
    author = "Ogueji, Kelechi  and
      Zhu, Yuxin  and
      Lin, Jimmy",
    booktitle = "Proceedings of the 1st Workshop on Multilingual Representation Learning",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.mrl-1.11",
    pages = "116--126",
}

afriberta_small URL on huggingface.co

https://huggingface.co/castorini/afriberta_small
