FremyCompany/xls-r-2b-nl-v2_lm-5gram-os2_hunspell (AI model on huggingface.co)

Introduction of xls-r-2b-nl-v2_lm-5gram-os2_hunspell

Model Details of xls-r-2b-nl-v2_lm-5gram-os2_hunspell

XLS-R-based CTC model with 5-gram language model from Open Subtitles

This model is a version of facebook/wav2vec2-xls-r-2b-22-to-16 fine-tuned mainly on the CGN dataset, as well as on the NL subset of the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 dataset (see details below), to which a large 5-gram language model based on the Open Subtitles Dutch corpus is added. The model achieves the following results on the evaluation set (of Common Voice 8.0):

WER: 0.03931
CER: 0.01224

IMPORTANT NOTE: The hunspell typo fixer is not enabled on the website, which returns raw CTC+LM results. Hunspell reranking is only available in the eval.py decoding script. For best results, use the code in that file when running the model locally for inference.

IMPORTANT NOTE: Evaluating this model requires apt install libhunspell-dev and a pip install of hunspell, in addition to pip installs of pypi-kenlm and pyctcdecode (see install_requirements.sh). The chunk length and stride were tuned for this model to 12 s and 2 s respectively (see eval.sh).
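To make the 12 s / 2 s setting concrete, here is a minimal sketch of how chunked inference with strides can tile a long recording. It assumes the stride applies to both sides of each chunk, as with the chunk_length_s and stride_length_s arguments of the transformers speech recognition pipeline; the function name chunk_windows is hypothetical, and eval.sh remains the reference for the actual settings.

```python
def chunk_windows(total_s, chunk_s=12.0, stride_s=2.0):
    """Return (start, end, keep_start, keep_end) tuples, all in seconds.

    Each chunk spans [start, end]; only [keep_start, keep_end] is kept,
    so the overlapping strides only serve as acoustic context.
    Assumption: the stride is applied on both sides of every chunk,
    except at the very beginning and very end of the recording.
    """
    step = chunk_s - 2 * stride_s
    if step <= 0:
        raise ValueError("chunk length must exceed twice the stride")
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_s, total_s)
        is_first = start == 0.0
        is_last = start + chunk_s >= total_s
        keep_start = start if is_first else start + stride_s
        keep_end = end if is_last else end - stride_s
        windows.append((start, end, keep_start, keep_end))
        if is_last:
            break
        start += step
    return windows


# A 30-second recording is covered by four overlapping chunks
# whose kept regions tile the audio without gaps.
for window in chunk_windows(30.0):
    print(window)
```

With these values, each new chunk starts 8 s after the previous one, and every kept region begins where the previous one ended.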

QUICK REMARK: The "Robust Speech Event" set does not contain cleaned transcription text, so its WER/CER are vastly overestimated. For instance, 2014 in the dev set is left as a number but is recognized as tweeduizend veertien, which counts as 3 mistakes (2014 missing, and both tweeduizend and veertien wrongly inserted). Other normalization problems in the dev set include single quotes around some words, which then count as mismatches even though the word itself is correct, and the removal of some spoken words from the final transcript (ja, etc.). As a result, our real error rate on the dev set is significantly lower than reported.
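The effect of unnormalized references on WER can be illustrated with a small, self-contained sketch. It uses a plain word-level edit distance, which may not match the exact scoring in eval.py; the helper strip_quotes and the example sentence are hypothetical and are not part of the model's evaluation code.

```python
import re


def word_errors(reference, hypothesis):
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: edit distance divided by the reference length."""
    return word_errors(reference, hypothesis) / len(reference.split())


def strip_quotes(text):
    """Hypothetical normalization: drop single quotes at word boundaries.

    Apostrophes inside a word (collega's) are kept. This toy rule would
    also strip a legitimate leading apostrophe, so it is only a sketch.
    """
    return re.sub(r"(?<!\w)'|'(?!\w)", "", text)


reference = "hij zei 'ja' tegen mij"   # made-up reference with quoted word
hypothesis = "hij zei ja tegen mij"    # made-up model output

print(wer(reference, hypothesis))                # 0.2
print(wer(strip_quotes(reference), hypothesis))  # 0.0
```

The prediction is correct in both cases; only the reference formatting changes the score.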

You can compare the predictions with the targets on the validation dev set yourself, for example using this diffing tool.

WE DO SPEECH RECOGNITION: Hello reader! If you are considering using this (or another) model in production, but would benefit from a model fine-tuned specifically for your use case (using text and/or labelled speech), feel free to contact our team. This model was developed during the Robust Speech Recognition challenge event by François REMY (twitter) and Geoffroy VANDERREYDT.

We would like to thank OVH for providing us with a V100S GPU.

Model description

The model takes 16kHz sound input, and uses a Wav2Vec2ForCTC decoder with 48 letters to output the letter-transcription probabilities per frame.
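As a rough illustration of what per-frame letter probabilities mean, the sketch below applies greedy CTC decoding: take the most likely symbol in each frame, collapse consecutive repeats, then drop the blank symbol. This is a simplification, since the model itself is decoded with a beam search; the three-symbol alphabet and the probabilities are made up, whereas the real vocabulary has 48 entries.

```python
BLANK = "_"  # hypothetical blank symbol used for this toy alphabet


def ctc_greedy_decode(frame_probs, alphabet):
    """Greedy CTC decoding over a list of per-frame probability vectors."""
    # Most likely symbol in each frame.
    best = [alphabet[max(range(len(p)), key=p.__getitem__)] for p in frame_probs]
    decoded = []
    previous = None
    for symbol in best:
        # Keep a symbol only when it differs from the previous frame
        # and is not the blank.
        if symbol != previous and symbol != BLANK:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)


alphabet = [BLANK, "j", "a"]
frames = [
    [0.10, 0.80, 0.10],  # j
    [0.20, 0.70, 0.10],  # j (repeat, collapsed)
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.80],  # a
    [0.10, 0.20, 0.70],  # a (repeat, collapsed)
]
print(ctc_greedy_decode(frames, alphabet))  # ja
```

A blank between two identical letters keeps them separate, which is how doubled letters survive the collapsing step.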

To improve accuracy, a beam-search decoder based on pyctcdecode is then used; it reranks the most promising alignments based on a 5-gram language model trained on the Open Subtitles Dutch corpus.

To further deal with typos, hunspell is used to propose alternative spellings for words that are not among the unigrams of the language model. These alternatives are then reranked using the language model described above, together with a penalty proportional to the Levenshtein edit distance between the alternative and the recognized word. This makes it possible, for example, to correct collegas into collega's or gogol into google.
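The reranking idea can be sketched as follows, under several stated assumptions: the alternatives are passed in as a plain list (in the real pipeline they come from hunspell), the language model is replaced by a toy table of made-up unigram log-probabilities instead of the 5-gram model, and the penalty weight is arbitrary. The names TOY_LM, OOV_LOGPROB, EDIT_PENALTY and rerank are hypothetical; see eval.py for the actual implementation.

```python
def levenshtein(a, b):
    """Character-level Levenshtein edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


# Toy stand-in for the language model: made-up log-probabilities.
TOY_LM = {"collega's": -6.0, "collega": -8.0}
OOV_LOGPROB = -12.0   # assumed score for words unknown to the toy model
EDIT_PENALTY = 2.0    # assumed penalty per edit operation


def rerank(word, alternatives):
    """Pick the candidate with the best score.

    Score = language-model log-probability minus a penalty proportional
    to the edit distance from the recognized word.
    """
    candidates = [word] + [alt for alt in alternatives if alt != word]

    def score(candidate):
        lm_score = TOY_LM.get(candidate, OOV_LOGPROB)
        return lm_score - EDIT_PENALTY * levenshtein(word, candidate)

    return max(candidates, key=score)


print(rerank("collegas", ["collega's", "collega"]))  # collega's
```

The recognized word always stays in the candidate list with a distance of zero, so a correction only wins when the language model gain outweighs the edit penalty.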

Intended uses & limitations

This model can be used to transcribe Dutch or Flemish speech to text (without punctuation).

Training and evaluation data

The model was:

initialized with the 2B parameter model from Facebook.
trained for 5 epochs (6000 iterations of batch size 32) on the cv8/nl dataset.
trained for 1 epoch (36000 iterations of batch size 32) on the cgn dataset.
trained for 5 epochs (6000 iterations of batch size 32) on the cv8/nl dataset.
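The iteration counts above imply a rough number of utterances seen per epoch. The short calculation below is only a back-of-the-envelope sketch: it assumes every iteration processes exactly one batch of 32 utterances, with no gradient accumulation and no partial batches, which the source does not confirm.

```python
BATCH_SIZE = 32

# (dataset, iterations, epochs) for each training stage listed above.
STAGES = [
    ("cv8/nl", 6000, 5),
    ("cgn", 36000, 1),
    ("cv8/nl", 6000, 5),
]


def samples_per_epoch(iterations, epochs, batch_size=BATCH_SIZE):
    """Approximate utterances seen per epoch under the stated assumptions."""
    return iterations * batch_size // epochs


for dataset, iterations, epochs in STAGES:
    print(dataset, samples_per_epoch(iterations, epochs))
# cv8/nl 38400
# cgn 1152000
# cv8/nl 38400
```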

Framework versions

Transformers 4.16.0
Pytorch 1.10.2+cu102
Datasets 1.18.3
Tokenizers 0.11.0

Model URL on huggingface.co

https://huggingface.co/FremyCompany/xls-r-2b-nl-v2_lm-5gram-os2_hunspell

Other models from FremyCompany on huggingface.co

FremyCompany/BioLORD-2023-C: total runs 42.5K; run growth 19.1K; growth rate 44.86%; updated January 09 2025
FremyCompany/BioLORD-2023: total runs 26.6K; run growth -29.7K; growth rate -111.86%; updated January 09 2025
FremyCompany/BioLORD-2023-M: total runs 737; run growth -9.3K; growth rate -1262.14%; updated January 09 2025
FremyCompany/BioLORD-2023-S: total runs 511; run growth 467; growth rate 91.39%; updated February 28 2024
FremyCompany/BioLORD-2023-M-Dutch-InContext-v1: total runs 273; run growth 236; growth rate 86.45%; updated June 24 2024
FremyCompany/olm-bert-oscar-nl-step4: total runs 141; run growth 131; growth rate 92.91%; updated March 08 2023
FremyCompany/gbert-base-nl-oscar19: total runs 135; run growth 132; growth rate 99.25%; updated May 28 2023
FremyCompany/roberta-large-nl-oscar2301: total runs 135; run growth 133; growth rate 98.52%; updated May 21 2023
FremyCompany/rl-bert-oscar-nl-step4: total runs 133; run growth 101; growth rate 75.94%; updated March 22 2023
FremyCompany/rl-bert-oscar-nl-step1: total runs 132; run growth 129; growth rate 99.23%; updated April 17 2023
FremyCompany/olm-bert-oscar-nl-step2: total runs 131; run growth 126; growth rate 98.44%; updated April 17 2023
FremyCompany/opus-mt-nl-en-healthcare: total runs 121; run growth 93; growth rate 76.86%; updated June 13 2023
FremyCompany/olm-bert-oscar-nl-step1: total runs 117; run growth 114; growth rate 99.13%; updated April 17 2023
FremyCompany/camembert-base-nl-oscar19: total runs 116; run growth 112; growth rate 97.39%; updated May 28 2023
FremyCompany/rl-bert-oscar-nl-step2: total runs 109; run growth 106; growth rate 99.07%; updated April 17 2023
FremyCompany/BioLORD-STAMB2-v1: total runs 75; run growth -612; growth rate -816.00%; updated November 27 2023
FremyCompany/BioLORD-STAMB2-v1-STS2: total runs 61; run growth 48; growth rate 78.69%; updated August 03 2023
FremyCompany/stsb_ossts_roberta-large-nl-oscar23: total runs 53; run growth -8; growth rate -15.09%; updated October 17 2023
FremyCompany/roberta-large-nl-oscar23: total runs 29; run growth 15; growth rate 48.39%; updated December 05 2023
FremyCompany/xls-r-2b-nl-v2_lm-5gram-os: total runs 21; run growth 17; growth rate 80.95%; updated March 23 2022
FremyCompany/xls-r-nl-v1-cv8-lm: total runs 19; run growth 9; growth rate 47.37%; updated November 18 2023
FremyCompany/roberta-base-nl-oscar23: total runs 16; run growth 4; growth rate 25.00%; updated November 30 2023
FremyCompany/BioLORD-PMB: total runs 8; run growth 5; growth rate 62.50%; updated August 21 2023
