dbmdz / electra-base-turkish-cased-discriminator

huggingface.co
Total runs: 237
24-hour runs: 0
7-day runs: -3
30-day runs: -79
Model's Last Updated: October 28 2024

Model Details of electra-base-turkish-cased-discriminator

🤗 + 📚 dbmdz Turkish ELECTRA model

In this repository the MDZ Digital Library team (dbmdz) at the Bavarian State Library open sources a cased ELECTRA base model for Turkish 🎉

Turkish ELECTRA model

We release a base ELECTRA model for Turkish that was trained on the same data as BERTurk.

ELECTRA is a new method for self-supervised language representation learning. It can be used to pre-train transformer networks using relatively little compute. ELECTRA models are trained to distinguish "real" input tokens vs "fake" input tokens generated by another neural network, similar to the discriminator of a GAN.
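
The "real" vs "fake" objective described above can be illustrated with a toy sketch of how per-token labels are derived. This is not the training code used for this model: the function name `replaced_token_labels` and the word-level example sentence are hypothetical, and real ELECTRA training operates on subword tokens produced by a generator network.

```python
from typing import List


def replaced_token_labels(original: List[str], corrupted: List[str]) -> List[int]:
    """Return 1 where the corrupted token differs from the original, else 0.

    A position where the generator happens to reproduce the original token
    counts as "real" (label 0), as in the ELECTRA setup.
    """
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Hypothetical example: a generator swapped the third token.
original = ["Ankara", "Türkiye'nin", "başkentidir", "."]
corrupted = ["Ankara", "Türkiye'nin", "şehridir", "."]

labels = replaced_token_labels(original, corrupted)
print(list(zip(corrupted, labels)))
```

The discriminator is trained to predict these labels for every position of the corrupted sequence.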

More details about ELECTRA can be found in the ICLR paper or in the official ELECTRA repository on GitHub.

Stats

The current version of the model is trained on a filtered and sentence segmented version of the Turkish OSCAR corpus, a recent Wikipedia dump, various OPUS corpora and a special corpus provided by Kemal Oflazer.

The final training corpus has a size of 35GB and 4,404,976,662 tokens.

Thanks to Google's TensorFlow Research Cloud (TFRC) we could train a cased model on a TPU v3-8 for 1M steps.

Model weights

Transformers compatible weights for both PyTorch and TensorFlow are available.

Model: dbmdz/electra-base-turkish-cased-discriminator
Downloads: config.json, pytorch_model.bin, vocab.txt
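
As a rough sketch, direct download links for the files listed above can be built as follows. The helper name `hub_file_url` is hypothetical, and the code assumes the files live on the `main` revision of the repository and that the hub serves them under its `resolve` URL layout.

```python
REPO_ID = "dbmdz/electra-base-turkish-cased-discriminator"
FILES = ["config.json", "pytorch_model.bin", "vocab.txt"]


def hub_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct download URL for one file (assumed hub URL layout)."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


# Map each listed file name to its assumed download URL.
urls = {name: hub_file_url(REPO_ID, name) for name in FILES}
for name, url in urls.items():
    print(name, "->", url)
```

In most cases there is no need to fetch these files by hand, since `from_pretrained` downloads and caches them automatically.
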
Usage

With Transformers >= 2.8 our ELECTRA base cased model can be loaded like:

from transformers import AutoModel, AutoTokenizer

# AutoModelWithLMHead is deprecated in newer Transformers releases;
# AutoModel loads the discriminator encoder without a task-specific head.
tokenizer = AutoTokenizer.from_pretrained("dbmdz/electra-base-turkish-cased-discriminator")
model = AutoModel.from_pretrained("dbmdz/electra-base-turkish-cased-discriminator")
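
To use the model as a discriminator, one option is the `ElectraForPreTraining` class from Transformers, which adds the replaced-token-detection head. The following is a minimal sketch, not the authors' evaluation code. It assumes a recent Transformers release with PyTorch installed and that the checkpoint ships the discriminator head weights. The helper names `flag_replaced` and `detect_replaced_tokens` are hypothetical.

```python
from typing import List, Sequence

MODEL_NAME = "dbmdz/electra-base-turkish-cased-discriminator"


def flag_replaced(tokens: Sequence[str], logits: Sequence[float]) -> List[str]:
    """Return the tokens whose discriminator logit is positive (predicted "fake")."""
    return [tok for tok, logit in zip(tokens, logits) if logit > 0]


def detect_replaced_tokens(sentence: str) -> List[str]:
    """Run the discriminator on one sentence and list tokens it flags as replaced."""
    # Imports are local so that flag_replaced works without these packages.
    import torch
    from transformers import AutoTokenizer, ElectraForPreTraining

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = ElectraForPreTraining.from_pretrained(MODEL_NAME)
    model.eval()

    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        # First output holds per-token logits; take the single batch item.
        logits = model(**inputs)[0][0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return flag_replaced(tokens, logits.tolist())


# Example call (downloads the model on first use):
# print(detect_replaced_tokens("Ankara Türkiye'nin başkentidir."))
```

A positive logit means the head considers the token more likely replaced than original. The zero threshold is the usual convention and can be adjusted.
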
Results

For results on PoS tagging or NER tasks, please refer to this repository.

Huggingface model hub

All models are available on the Huggingface model hub.

Contact (Bugs, Feedback, Contribution and more)

For questions about our ELECTRA models just open an issue here 🤗

Acknowledgments

Thanks to Kemal Oflazer for providing us additional large corpora for Turkish. Many thanks to Reyyan Yeniterzi for providing us the Turkish NER dataset for evaluation.

Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC). Thanks for providing access to the TFRC ❤️

Thanks to the generous support from the Hugging Face team, it is possible to download both cased and uncased models from their S3 storage 🤗

License

MIT (https://choosealicense.com/licenses/mit)

Model page

https://huggingface.co/dbmdz/electra-base-turkish-cased-discriminator
