Massively Multilingual Speech (MMS) - Finetuned LID
This checkpoint is a model fine-tuned for speech language identification (LID) and is part of Facebook's Massively Multilingual Speech (MMS) project.
This checkpoint is based on the Wav2Vec2 architecture and classifies raw audio input into a probability distribution over 126 output classes (each class representing a language).
The checkpoint consists of 1 billion parameters and has been fine-tuned from facebook/mms-1b on 126 languages.
Note: To use MMS you need at least transformers >= 4.30 installed. If version 4.30 is not yet available on PyPI, install transformers from source, for example with pip install git+https://github.com/huggingface/transformers.git.
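The version requirement in the note can also be checked programmatically before loading the model. The following is a minimal sketch, not part of the official example: the helper names `parse_version`, `meets_minimum`, and `transformers_is_recent_enough` are hypothetical, and the parsing deliberately ignores suffixes such as `.dev0`.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            # Stop at the first non-numeric component, e.g. "dev0".
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="4.30"):
    """True if the installed version is at least the minimum (hypothetical helper)."""
    return parse_version(installed) >= parse_version(minimum)


def transformers_is_recent_enough(minimum="4.30"):
    """Check the locally installed transformers package, if there is one."""
    try:
        return meets_minimum(version("transformers"), minimum)
    except PackageNotFoundError:
        return False
```

If `transformers_is_recent_enough()` returns `False`, upgrade or install transformers before continuing.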
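from transformers import Wav2Vec2ForSequenceClassification, AutoFeatureExtractor
import torch

model_id = "facebook/mms-lid-126"

# The feature extractor converts raw waveforms into model inputs
processor = AutoFeatureExtractor.from_pretrained(model_id)
# The fine-tuned classification model with one output class per language
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id)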
Now we process the audio data and pass it to the model to classify the spoken language, just as we usually do for Wav2Vec2 audio classification models such as ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition.
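The classification step can be sketched as follows. This is a minimal sketch under stated assumptions rather than the official example: the function names `rank_languages` and `identify_language` are hypothetical, the input is assumed to be a one-dimensional mono waveform sampled at 16 kHz (the rate Wav2Vec2 feature extractors expect), and the heavy imports are kept inside the function so that the ranking helper can be used on its own.

```python
import math


def rank_languages(logits, id2label, top_k=3):
    """Turn raw logits into (language, probability) pairs, most likely first."""
    # Numerically stable softmax: subtract the maximum before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:top_k]]


def identify_language(waveform, model_id="facebook/mms-lid-126",
                      sampling_rate=16_000, top_k=3):
    """Classify one mono waveform (assumed 16 kHz) with the MMS LID checkpoint."""
    # Imported lazily: loading torch and a 1B-parameter model is expensive.
    import torch
    from transformers import AutoFeatureExtractor, Wav2Vec2ForSequenceClassification

    processor = AutoFeatureExtractor.from_pretrained(model_id)
    model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id)

    inputs = processor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, number of languages)

    # model.config.id2label maps class indices to ISO 639-3 codes.
    return rank_languages(logits[0].tolist(), model.config.id2label, top_k)
```

Calling `identify_language(waveform)` with a NumPy array or a list of floats returns the `top_k` most probable language codes with their probabilities. For repeated calls, it is cheaper to load the model and feature extractor once and reuse them.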
To see all the supported languages of a checkpoint, you can print out the language ids as follows:
model.config.id2label.values()
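The label mapping can also be used to check whether a particular language is covered before running inference. The sketch below assumes `id2label` is a plain dictionary from class index to ISO 639-3 code, as in `model.config.id2label`; the helper names `supported_codes` and `is_supported` are hypothetical.

```python
def supported_codes(id2label):
    """Return the ISO 639-3 codes of a checkpoint in alphabetical order."""
    return sorted(id2label.values())


def is_supported(id2label, code):
    """True if the given ISO 639-3 code is one of the output classes."""
    return code.lower() in set(id2label.values())


# Toy mapping for illustration only; the real checkpoint has 126 entries.
toy_id2label = {0: "ara", 1: "cmn", 2: "eng"}
print(supported_codes(toy_id2label))
print(is_supported(toy_id2label, "eng"))
```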
For more details about the architecture, please have a look at the official docs.
Supported Languages
This model supports 126 languages. The supported languages of this checkpoint are listed below by ISO 639-3 code.
You can find more details about the languages and their ISO 639-3 codes in the MMS Language Coverage Overview.
ara
cmn
eng
spa
fra
mlg
swe
por
vie
ful
sun
asm
ben
zlm
kor
ind
hin
tuk
urd
aze
slv
mon
hau
tel
swh
bod
rus
tur
heb
mar
som
tgl
tat
tha
cat
ron
mal
bel
pol
yor
nld
bul
hat
afr
isl
amh
tam
hun
hrv
lit
cym
fas
mkd
ell
bos
deu
sqi
jav
nob
uzb
snd
lat
nya
grn
mya
orm
lin
hye
yue
pan
jpn
kaz
npi
kat
guj
kan
tgk
ukr
ces
lav
bak
khm
fao
glg
ltz
lao
mlt
sin
sna
ita
srp
mri
nno
pus
eus
ory
lug
bre
luo
slk
fin
dan
yid
est
ceb
war
san
kir
oci
wol
haw
kam
umb
xho
epo
zul
ibo
abk
ckb
nso
gle
kea
ast
sco
glv
ina
Model details
Developed by: Vineel Pratap et al.
Model type: Multilingual speech language identification (LID) model
@article{pratap2023mms,
title={Scaling Speech Technology to 1,000+ Languages},
author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
journal={arXiv},
year={2023}
}