Massively Multilingual Speech (MMS) - Finetuned LID
This checkpoint is a model fine-tuned for speech language identification (LID) and is part of Facebook's Massively Multilingual Speech project. It is based on the Wav2Vec2 architecture and classifies raw audio input into a probability distribution over 512 output classes, each class representing a language. The checkpoint consists of 1 billion parameters and has been fine-tuned from facebook/mms-1b on 512 languages.
Note: In order to use MMS you need to have at least transformers >= 4.30 installed. If the 4.30 version is not yet available on PyPI, make sure to install transformers from source.
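The exact install command is not included in this extract; a common way to install transformers from source is directly from the GitHub repository:

```bash
# Install the latest transformers directly from the main branch on GitHub
pip install git+https://github.com/huggingface/transformers.git
```

With transformers installed, load the model and the feature extractor: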
```python
from transformers import Wav2Vec2ForSequenceClassification, AutoFeatureExtractor
import torch

model_id = "facebook/mms-lid-512"

processor = AutoFeatureExtractor.from_pretrained(model_id)
model = Wav2Vec2ForSequenceClassification.from_pretrained(model_id)
```
Now we process the audio data and pass it to the model, which classifies it into a language, just as we usually do for Wav2Vec2 audio classification models such as ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition.
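A minimal sketch of that step is shown below. It assumes a local 16 kHz recording named `sample.wav` and uses `torchaudio` for loading; both the file name and the loading library are illustrative assumptions, not part of the original example:

```python
import torch
import torchaudio

# Load an audio file (assumed to exist locally) and resample to the model's expected rate
waveform, sample_rate = torchaudio.load("sample.wav")
if sample_rate != processor.sampling_rate:
    waveform = torchaudio.functional.resample(waveform, sample_rate, processor.sampling_rate)

# Convert to the mono 1-D float array the feature extractor expects
inputs = processor(waveform.mean(dim=0).numpy(), sampling_rate=processor.sampling_rate, return_tensors="pt")

# Forward pass: logits over the 512 language classes
with torch.no_grad():
    logits = model(**inputs).logits

# The most probable class index maps to an ISO 639-3 language code
lang_id = torch.argmax(logits, dim=-1)[0].item()
print(model.config.id2label[lang_id])
```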
To see all the languages supported by a checkpoint, you can print out the language ids as follows:

```python
model.config.id2label.values()
```
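Building on that, here is a quick way to check whether a particular ISO 639-3 code (for example `swe` for Swedish) is among the checkpoint's labels; this is a small convenience sketch, not part of the original card:

```python
# Set of ISO 639-3 codes the classifier can output
supported_languages = set(model.config.id2label.values())

print("swe" in supported_languages)   # True if Swedish is one of the output classes
print(len(supported_languages))       # 512
```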
For more details about the architecture, please have a look at the official docs.
Supported Languages
This model supports 512 languages. The full list of languages supported by this checkpoint, given as ISO 639-3 codes, is shown below. You can find more details about the languages and their ISO 639-3 codes in the MMS Language Coverage Overview.
ara
cmn
eng
spa
fra
mlg
swe
por
vie
ful
sun
asm
ben
zlm
kor
ind
hin
tuk
urd
aze
slv
mon
hau
tel
swh
bod
rus
tur
heb
mar
som
tgl
tat
tha
cat
ron
mal
bel
pol
yor
nld
bul
hat
afr
isl
amh
tam
hun
hrv
lit
cym
fas
mkd
ell
bos
deu
sqi
jav
kmr
nob
uzb
snd
lat
nya
grn
mya
orm
lin
hye
yue
pan
jpn
kaz
npi
kik
kat
guj
kan
tgk
ukr
ces
lav
bak
khm
cak
fao
glg
ltz
xog
lao
mlt
sin
aka
sna
che
mam
ita
quc
srp
mri
tuv
nno
pus
eus
kbp
ory
lug
bre
luo
nhx
slk
ewe
fin
rif
dan
yid
yao
mos
quh
hne
xon
new
quy
est
dyu
ttq
bam
pse
uig
sck
ngl
tso
mup
dga
seh
lis
wal
ctg
bfz
bxk
ceb
kru
war
khg
bbc
thl
vmw
zne
sid
tpi
nym
bgq
bfy
hlb
teo
fon
kfx
bfa
mag
ayr
any
mnk
adx
ava
hyw
san
kek
chv
kri
btx
nhy
dnj
lon
men
ium
nga
nsu
prk
kir
bom
run
hwc
mnw
ubl
kin
rkt
xmm
iba
gux
ses
wsg
tir
gbm
mai
nyy
nan
nyn
gog
ngu
hoc
nyf
sus
bcc
hak
grt
suk
nij
kaa
bem
rmy
nus
ach
awa
dip
rim
nhe
pcm
kde
tem
quz
bba
kbr
taj
dik
dgo
bgc
xnr
kac
laj
dag
ktb
mgh
shn
oci
zyb
alz
wol
guw
nia
bci
sba
kab
nnb
ilo
mfe
xpe
bcl
haw
mad
ljp
gmv
nyo
kxm
nod
sag
sas
myx
sgw
mak
kfy
jam
lgg
nhi
mey
sgj
hay
pam
heh
nhw
yua
shi
mrw
hil
pag
cce
npl
ace
kam
min
pko
toi
ncj
umb
hno
ban
syl
bxg
nse
xho
mkw
nch
mas
bum
mww
epo
tzm
zul
lrc
ibo
abk
azz
guz
ksw
lus
ckb
mer
pov
rhg
knc
tum
nso
bho
ndc
ijc
qug
lub
srr
mni
zza
dje
tiv
gle
lua
swk
ada
lic
skr
mfa
bto
unr
hdy
kea
glk
ast
nup
sat
ktu
bhb
sgc
dks
ncl
emk
urh
tsc
idu
igb
its
kng
kmb
tsn
bin
gom
ven
sef
sco
trp
glv
haq
kha
rmn
sot
sou
gno
igl
efi
nde
rki
kjg
fan
wci
bjn
pmy
bqi
ina
hni
the
nuz
ajg
ymm
fmu
nyk
snk
esg
thq
pht
wes
pnb
phr
mui
tkt
bug
mrr
kas
zgb
lir
vah
ssw
iii
brx
rwr
kmc
dib
pcc
zyn
hea
hms
thr
wbr
bfb
wtm
blk
dhd
swv
zzj
niq
mtr
gju
kjp
haz
shy
nbl
aii
sjp
bns
brh
msi
tsg
tcy
kbl
noe
tyz
ahr
aar
wuu
kbd
bca
pwr
hsn
kua
tdd
bgp
abs
zlj
ebo
bra
nhp
tts
zyj
lmn
cqd
dcc
cjk
bfr
bew
arg
drs
chw
bej
bjj
ibb
tig
nut
jax
tdg
nlv
pch
fvr
mlq
kfr
nhn
tji
hoj
cpx
cdo
bgn
btm
trf
daq
max
nba
mut
hnd
ryu
abr
sop
odk
nap
gbr
czh
vls
gdx
yaf
sdh
anw
ttj
nhg
cgg
ifm
mdh
scn
lki
luz
stv
kmz
nds
mtq
knn
mnp
bar
mzn
gsw
fry
Model details
Developed by:
Vineel Pratap et al.
Model type:
Multi-Lingual Automatic Speech Recognition model
```bibtex
@article{pratap2023mms,
  title={Scaling Speech Technology to 1,000+ Languages},
  author={Vineel Pratap and Andros Tjandra and Bowen Shi and Paden Tomasello and Arun Babu and Sayani Kundu and Ali Elkahky and Zhaoheng Ni and Apoorv Vyas and Maryam Fazel-Zarandi and Alexei Baevski and Yossi Adi and Xiaohui Zhang and Wei-Ning Hsu and Alexis Conneau and Michael Auli},
  journal={arXiv},
  year={2023}
}
```