facebook / mms-lid-126

huggingface.co
Total runs: 3.2K
24-hour runs: 4
7-day runs: 300
30-day runs: 2.2K
Model's Last Updated: June 13 2023
audio-classification

Massively Multilingual Speech (MMS) - Finetuned LID

This checkpoint is fine-tuned for speech language identification (LID) and is part of Facebook's Massively Multilingual Speech project. It is based on the Wav2Vec2 architecture and maps raw audio input to a probability distribution over 126 output classes, each representing a language. The checkpoint has 1 billion parameters and was fine-tuned from facebook/mms-1b on 126 languages.

Example

This MMS checkpoint can be used with Transformers to identify the spoken language of an audio clip. It can recognize the 126 languages listed under Supported Languages.

Let's look at a simple example.

First, we install transformers and some other libraries:

pip install torch accelerate torchaudio datasets
pip install --upgrade transformers

Note: To use MMS you need transformers >= 4.30. If version 4.30 is not yet available on PyPI, install transformers from source:

pip install git+https://github.com/huggingface/transformers.git
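
If you want to confirm the version requirement from Python before loading the model, a minimal sketch follows. The helper names (`parse_major_minor`, `meets_minimum`, `installed_transformers_ok`) are hypothetical and not part of transformers; only the standard-library `importlib.metadata` lookup is used.

```python
import re
from importlib import metadata

MIN_VERSION = (4, 30)  # minimum version stated in the note above


def parse_major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as '4.30.2' or '4.31.0.dev0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple[int, int] = MIN_VERSION) -> bool:
    """Return True if the version is at least the required (major, minor)."""
    return parse_major_minor(version) >= minimum


def installed_transformers_ok() -> bool:
    """Return False if transformers is missing or older than the minimum."""
    try:
        return meets_minimum(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return False


if __name__ == "__main__":
    print("transformers >= 4.30:", installed_transformers_ok())
```

A source install typically reports a development version string such as `4.31.0.dev0`; the sketch only compares the major and minor components, so such strings pass the check.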

Next, we load a couple of audio samples via datasets. Make sure that the audio data is sampled at 16,000 Hz (16 kHz).

from datasets import load_dataset, Audio

# English
stream_data = load_dataset("mozilla-foundation/common_voice_13_0", "en", split="test", streaming=True)
stream_data = stream_data.cast_column("audio", Audio(sampling_rate=16000))
en_sample = next(iter(stream_data))["audio"]["array"]

# Arabic
stream_data = load_dataset("mozilla-foundation/common_voice_13_0", "ar", split="test", streaming=True)
stream_data = stream_data.cast_column("audio", Audio(sampling_rate=16000))
ar_sample = next(iter(stream_data))["audio"]["array"]
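
For audio that lives in a local file rather than a dataset, the same sampling-rate requirement applies. The following is a rough sketch, assuming a mono 16-bit PCM WAV file and using only the Python standard library; `read_wav_mono_16bit` and `needs_resampling` are hypothetical helper names, and other formats or bit depths would need a proper audio library.

```python
import array
import sys
import wave

TARGET_RATE = 16_000  # Hz, the sampling rate the model expects


def read_wav_mono_16bit(path: str) -> tuple[list[float], int]:
    """Read a mono 16-bit PCM WAV file; return (samples in [-1, 1), sample rate)."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit PCM")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    pcm = array.array("h")
    pcm.frombytes(frames)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    # Scale signed 16-bit integers to floats
    return [sample / 32768.0 for sample in pcm], rate


def needs_resampling(rate: int, target: int = TARGET_RATE) -> bool:
    """Return True if the audio must be resampled before use."""
    return rate != target
```

If `needs_resampling` returns True, resample the audio first (for example by casting the column with `Audio(sampling_rate=16000)` as shown above) rather than passing it to the processor unchanged.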

Next, we load the model and processor:

from transformers import Wav2Vec2ForSequenceClassification, AutoFeatureExtractor
import torch

model_id = "facebook/mms-lid-126"

processor = AutoFeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id)

Now we process the audio data and pass it to the model to classify the language, just as we would for other Wav2Vec2 audio classification models such as ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition:

# English
inputs = processor(en_sample, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs).logits

lang_id = torch.argmax(outputs, dim=-1)[0].item()
detected_lang = model.config.id2label[lang_id]
# 'eng'

# Arabic
inputs = processor(ar_sample, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs).logits

lang_id = torch.argmax(outputs, dim=-1)[0].item()
detected_lang = model.config.id2label[lang_id]
# 'ara'
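
The argmax above keeps only the single most likely language. Since the model produces a distribution over all classes, it can be useful to inspect the top few candidates with their probabilities. The sketch below works on plain Python lists so it runs without the model; `softmax` and `top_k_languages` are hypothetical helpers, and the toy logits and labels are made up for illustration, not real model output.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_languages(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy values for illustration only; not real model output.
toy_logits = [2.0, 0.5, -1.0]
toy_labels = {0: "eng", 1: "ara", 2: "fra"}
print(top_k_languages(toy_logits, toy_labels, k=2))
```

With the real model, the equivalent call would be `top_k_languages(outputs[0].tolist(), model.config.id2label)`, assuming `outputs` holds the logits for a single audio sample as in the example above.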

To see all the supported languages of a checkpoint, you can print out the language ids as follows:

# the label mapping is stored on the model config, not on the feature extractor
model.config.id2label.values()
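
To check programmatically whether a given language is covered before running inference, one option is to look the code up in the label mapping. This is a small sketch with hypothetical helper names; the toy mapping stands in for `model.config.id2label`, the mapping used in the classification example above.

```python
def supported_codes(id2label: dict[int, str]) -> set[str]:
    """Collect the language codes from an id -> label mapping."""
    return set(id2label.values())


def is_supported(code: str, id2label: dict[int, str]) -> bool:
    """Return True if the ISO 639-3 code appears among the labels."""
    return code.strip().lower() in supported_codes(id2label)


# Toy mapping for illustration; the real one has 126 entries.
toy_id2label = {0: "ara", 1: "cmn", 2: "eng"}
print(is_supported("ENG", toy_id2label))
print(is_supported("xyz", toy_id2label))
```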

For more details about the architecture, please have a look at the official docs.

Supported Languages

This model supports 126 languages. Click the following to toggle all supported languages of this checkpoint, listed by ISO 639-3 code. You can find more details about the languages and their ISO 639-3 codes in the MMS Language Coverage Overview.

Click to toggle
  • ara
  • cmn
  • eng
  • spa
  • fra
  • mlg
  • swe
  • por
  • vie
  • ful
  • sun
  • asm
  • ben
  • zlm
  • kor
  • ind
  • hin
  • tuk
  • urd
  • aze
  • slv
  • mon
  • hau
  • tel
  • swh
  • bod
  • rus
  • tur
  • heb
  • mar
  • som
  • tgl
  • tat
  • tha
  • cat
  • ron
  • mal
  • bel
  • pol
  • yor
  • nld
  • bul
  • hat
  • afr
  • isl
  • amh
  • tam
  • hun
  • hrv
  • lit
  • cym
  • fas
  • mkd
  • ell
  • bos
  • deu
  • sqi
  • jav
  • nob
  • uzb
  • snd
  • lat
  • nya
  • grn
  • mya
  • orm
  • lin
  • hye
  • yue
  • pan
  • jpn
  • kaz
  • npi
  • kat
  • guj
  • kan
  • tgk
  • ukr
  • ces
  • lav
  • bak
  • khm
  • fao
  • glg
  • ltz
  • lao
  • mlt
  • sin
  • sna
  • ita
  • srp
  • mri
  • nno
  • pus
  • eus
  • ory
  • lug
  • bre
  • luo
  • slk
  • fin
  • dan
  • yid
  • est
  • ceb
  • war
  • san
  • kir
  • oci
  • wol
  • haw
  • kam
  • umb
  • xho
  • epo
  • zul
  • ibo
  • abk
  • ckb
  • nso
  • gle
  • kea
  • ast
  • sco
  • glv
  • ina

Model details
  • Developed by: Vineel Pratap et al.

  • Model type: Multilingual speech language identification (audio classification) model

  • Language(s): 126 languages, see supported languages

  • License: CC-BY-NC 4.0

  • Num parameters: 1 billion

  • Audio sampling rate: 16,000 Hz (16 kHz)

  • Cite as:

    @article{pratap2023mms,
      title={Scaling Speech Technology to 1,000+ Languages},
      author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
      journal={arXiv},
      year={2023}
    }

Additional Links


Model page: https://huggingface.co/facebook/mms-lid-126

License text: https://choosealicense.com/licenses/cc-by-nc-4.0