IMPORTANT NOTE: The hunspell typo fixer is not enabled on the website, which returns raw CTC+LM results. Hunspell reranking is only available in the eval.py decoding script. For best results, please use the code in that file when running the model locally for inference.
IMPORTANT NOTE: Evaluating this model requires apt install libhunspell-dev and a pip install of hunspell, in addition to pip installs of pypi-kenlm and pyctcdecode (see install_requirements.sh). In addition, the chunking length and stride were optimized for the model as 12s and 2s respectively (see eval.sh).
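To make the 12s/2s setting concrete, here is a minimal sketch of how a long recording could be cut into overlapping windows. It assumes the stride applies on both sides of each chunk (as the transformers speech recognition pipeline does when given a single stride value). The function name chunk_windows is hypothetical, and this is an illustration of the idea rather than the code used by eval.sh.

```python
def chunk_windows(total_s, chunk_s=12.0, stride_s=2.0):
    """Return (start, end) windows in seconds covering a recording of total_s seconds.

    Assumption: stride_s seconds of context overlap on each side of a chunk,
    so consecutive windows advance by chunk_s - 2 * stride_s.
    """
    step = chunk_s - 2 * stride_s
    if step <= 0:
        raise ValueError("chunk length must exceed twice the stride")
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_s, total_s)
        windows.append((start, end))
        if end >= total_s:
            break
        start += step
    return windows


# A 30-second recording with the 12s chunk / 2s stride setting.
windows = chunk_windows(30.0)
print(windows)
```

With these values each window advances by 8 seconds, so neighbouring windows share 4 seconds of audio; the overlapping edges give the model context that can be discarded when the chunk transcripts are stitched together.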
QUICK REMARK: The "Robust Speech Event" set does not contain cleaned transcription text, so its WER/CER are vastly over-estimated. For instance, 2014 in the dev set is left as a number but will be recognized as tweeduizend veertien, which is penalized as multiple word errors even though the recognition is correct (2014 is counted as missing, and tweeduizend and veertien as wrongly inserted; a minimum-edit alignment scores this as one substitution plus one insertion). Other normalization problems in the dev set include single quotes around some words, which then fail to match the correct, unquoted word, and the removal of some spoken words from the final transcript (ja, etc.). As a result, our real error rate on the dev set is significantly lower than reported.
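The effect of unnormalized references can be reproduced with a small word-level edit distance. The sketch below is a generic minimum-edit WER computation, not the scoring code used for the reported numbers, and the sentence "in 2014" is a made-up example. Because a minimum-edit alignment merges one deletion and one insertion into a single substitution, it reports 2 errors for the 2014 case; either way, a correct transcription is scored as wrong.

```python
def word_edit_distance(reference, hypothesis):
    """Minimum number of word substitutions, insertions and deletions."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the distance between the reference prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by the reference length."""
    return word_edit_distance(reference, hypothesis) / len(reference.split())


hypothesis = "in tweeduizend veertien"            # what the model outputs
raw_reference = "in 2014"                         # unnormalized reference
normalized_reference = "in tweeduizend veertien"  # number spelled out

print(wer(raw_reference, hypothesis))         # 1.0
print(wer(normalized_reference, hypothesis))  # 0.0
```

Against the raw reference, the two-word sentence gets a 100% word error rate, while the same hypothesis scores 0% once the number is spelled out.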
We would like to thank OVH for providing us with a V100S GPU.
Model description
The model takes 16 kHz audio input, and uses a Wav2Vec2ForCTC decoder with 48 letters to output the letter-transcription probabilities per frame.
To improve accuracy, a beam-search decoder based on pyctcdecode is then used; it reranks the most promising alignments based on a 5-gram language model trained on the Open Subtitles Dutch corpus.
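For intuition about what the beam search improves upon, the simplest way to turn per-frame letter predictions into text is greedy CTC decoding: take the most likely symbol in each frame, merge consecutive repeats, and drop the blank symbol. The sketch below shows only that baseline. The toy vocabulary and the blank id are assumptions made for illustration and do not reflect the model's actual 48-letter vocabulary.

```python
from itertools import groupby


def greedy_ctc_decode(frame_ids, id_to_char, blank_id=0):
    """Collapse a sequence of per-frame symbol ids into text (greedy CTC)."""
    # Step 1: merge runs of the same symbol into one occurrence.
    collapsed = [token for token, _ in groupby(frame_ids)]
    # Step 2: drop the blank symbol, which only separates repeated letters.
    return "".join(id_to_char[token] for token in collapsed if token != blank_id)


# Hypothetical toy vocabulary: id 0 is the CTC blank.
TOY_VOCAB = {0: "", 1: "j", 2: "a", 3: "l"}

print(greedy_ctc_decode([0, 1, 1, 0, 2, 2, 2, 0], TOY_VOCAB))  # ja
print(greedy_ctc_decode([3, 3, 0, 3], TOY_VOCAB))              # ll
```

Greedy decoding commits to one symbol per frame, whereas a beam search keeps several candidate transcriptions alive so that a language model can prefer the one that forms plausible words.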
To further deal with typos, hunspell is used to propose alternative spellings for words that are not among the unigrams of the language model. These alternatives are then reranked using the language model described above and a penalty proportional to the Levenshtein edit distance between the alternative and the recognized word. For example, this makes it possible to correct collegas into collega's or gogol into google.
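The reranking idea can be sketched as follows. This is not the code from eval.py: the suggestion list stands in for what a spell checker might return, the log-probabilities are invented toy values rather than scores from the 5-gram model, and the penalty weight of 1.0 is an arbitrary assumption.

```python
def levenshtein(a, b):
    """Character-level edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def rerank(word, alternatives, lm_logprob, unigrams, penalty=1.0):
    """Keep known words; otherwise pick the best-scoring spelling alternative."""
    if word in unigrams:
        return word
    candidates = [word] + [alt for alt in alternatives if alt != word]

    def score(candidate):
        # Language model score minus a penalty growing with the edit distance.
        return lm_logprob(candidate) - penalty * levenshtein(word, candidate)

    return max(candidates, key=score)


# Toy stand-ins (assumptions): invented log-probabilities and suggestions.
TOY_LM = {"collega's": -4.0, "collega": -4.5, "college": -5.0}
OOV_LOGPROB = -12.0


def toy_lm_logprob(word):
    return TOY_LM.get(word, OOV_LOGPROB)


suggestions = ["collega's", "collega", "college"]
best = rerank("collegas", suggestions, toy_lm_logprob, unigrams=set(TOY_LM))
print(best)  # collega's
```

The edit-distance penalty keeps the correction close to what was actually recognized, so a frequent but distant word does not override a near match.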
Intended uses & limitations
This model can be used to transcribe spoken Dutch or Flemish to text (without punctuation).