Introduction of electra-base-turkish-cased-discriminator
Model Details of electra-base-turkish-cased-discriminator
🤗 + 📚 dbmdz Turkish ELECTRA model
In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State
Library open sources a cased ELECTRA base model for Turkish 🎉
Turkish ELECTRA model
We release a base ELECTRA model for Turkish that was trained on the same data as BERTurk.
ELECTRA is a new method for self-supervised language representation learning. It can be used to
pre-train transformer networks using relatively little compute. ELECTRA models are trained to
distinguish "real" input tokens vs "fake" input tokens generated by another neural network, similar to
the discriminator of a GAN.
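The "real" vs "fake" training target described above can be sketched in a few lines of plain Python. This is only an illustration of how per-token discriminator labels are derived, not the training code used for this model. The function name `discriminator_labels` and the example sentence are hypothetical, and the corrupted sequence is written by hand in place of a real generator network.

```python
from typing import List


def discriminator_labels(original: List[str], corrupted: List[str]) -> List[int]:
    """Return 1 for tokens that were replaced ("fake") and 0 for unchanged ones ("real").

    A position where the generator happens to reproduce the original token
    counts as "real", which is why a plain comparison is enough here.
    """
    if len(original) != len(corrupted):
        raise ValueError("original and corrupted sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Hypothetical example: the first token was swapped by a (hand-written) generator.
original = ["Ankara", "Türkiye'nin", "başkentidir", "."]
corrupted = ["İstanbul", "Türkiye'nin", "başkentidir", "."]

labels = discriminator_labels(original, corrupted)
print(labels)  # [1, 0, 0, 0]
```

The discriminator is trained to predict these labels for every input position, so it learns from all tokens rather than only from a masked subset.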
The current version of the model is trained on a filtered and sentence
segmented version of the Turkish OSCAR corpus, a recent Wikipedia dump,
various OPUS corpora and a special corpus provided by Kemal Oflazer.
The final training corpus has a size of 35GB and 4,404,976,662 tokens.
Thanks to Google's TensorFlow Research Cloud (TFRC) we could train a cased model
on a TPU v3-8 for 1M steps.
Model weights
Transformers-compatible weights for both PyTorch and TensorFlow are available.
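A minimal sketch of loading the PyTorch weights with the Transformers library is shown below. It assumes that the `transformers` package and PyTorch are installed, and that the Hub identifier is the organisation name joined with the model name from this page (`dbmdz/electra-base-turkish-cased-discriminator`). The helper name `load_discriminator` is hypothetical.

```python
# Assumed Hub identifier: organisation "dbmdz" plus the model name from this page.
MODEL_ID = "dbmdz/electra-base-turkish-cased-discriminator"


def load_discriminator(model_id: str = MODEL_ID):
    """Download (or read from the local cache) the tokenizer and the PyTorch model."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model
```

Calling `tokenizer, model = load_discriminator()` needs network access on the first run. After that, `model(**tokenizer("Merhaba dünya", return_tensors="pt"))` should return contextual token representations that can feed a downstream task head.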
For questions about our ELECTRA models, just open an issue here 🤗
Acknowledgments
Thanks to Kemal Oflazer for providing us additional large corpora for Turkish.
Many thanks to Reyyan Yeniterzi for providing us the Turkish NER dataset for
evaluation.
Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC).
Thanks for providing access to the TFRC ❤️
Thanks to the generous support from the Hugging Face team,
it is possible to download both cased and uncased models from their S3 storage 🤗