IndicNER is a model trained to identify named entities in sentences written in Indian languages. It is fine-tuned on millions of sentences across the 11 Indian languages listed below, and benchmarked on a human-annotated test set and several other publicly available Indian NER datasets.
The 11 languages covered by IndicNER are: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, Telugu.
Training Corpus
Our model was trained on a dataset which we mined from the existing Samanantar Corpus. We used a bert-base-multilingual-uncased model as the starting point and then fine-tuned it on this NER dataset.
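Fine-tuning a subword model such as bert-base-multilingual-uncased for NER requires mapping word-level tags onto subword tokens. The sketch below shows one common convention: label the first subword of each word and mask the rest. It is not confirmed to be the procedure used for IndicNER. The function name `align_labels` and the example label ids are hypothetical, and `-100` is used because it is the default index ignored by PyTorch's cross-entropy loss.

```python
from typing import List, Optional

# Label id skipped by PyTorch's CrossEntropyLoss by default (ignore_index=-100).
IGNORE_INDEX = -100


def align_labels(word_ids: List[Optional[int]], word_labels: List[int]) -> List[int]:
    """Map word-level label ids onto subword tokens.

    `word_ids` has one entry per subword token: the index of the word the
    token came from, or None for special tokens such as [CLS] and [SEP].
    Only the first subword of each word keeps the word's label; special
    tokens and continuation subwords are masked with IGNORE_INDEX.
    """
    aligned = []
    previous = None
    for word_id in word_ids:
        if word_id is None:
            aligned.append(IGNORE_INDEX)            # special token
        elif word_id != previous:
            aligned.append(word_labels[word_id])    # first subword of a word
        else:
            aligned.append(IGNORE_INDEX)            # continuation subword
        previous = word_id
    return aligned


# Hypothetical example: three words, the first and third split into two subwords.
example_word_ids = [None, 0, 0, 1, 2, 2, None]
example_labels = [1, 0, 3]
print(align_labels(example_word_ids, example_labels))
```

With a fast Hugging Face tokenizer, a list like `example_word_ids` can be obtained from the `word_ids()` method of the encoding returned when tokenizing with `is_split_into_words=True`.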
Downloads
Download the model from this Hugging Face repo.
Update 20 Dec 2022: We released a new paper documenting IndicNER and Naamapadam. We have a different model reported in the paper. We will update the repo here soon with this model.
Usage
You can use this Colab notebook for examples of using IndicNER, or for fine-tuning a pre-trained model on the Naamapadam dataset to build your own NER models.
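For a quick local check outside the notebook, a minimal inference sketch with the `transformers` library might look like the following. It assumes the model is published under the Hub id `ai4bharat/IndicNER` and that it loads as a standard token-classification checkpoint with a fast tokenizer; verify both against the repo page. The helper names `word_level_tags` and `tag_sentence` are illustrative and not part of IndicNER.

```python
from typing import Dict, List, Optional, Tuple

# Assumed Hub id for this repo; confirm it on the model page before use.
MODEL_ID = "ai4bharat/IndicNER"


def word_level_tags(
    word_ids: List[Optional[int]],
    token_label_ids: List[int],
    id2label: Dict[int, str],
) -> Dict[int, str]:
    """Reduce per-subword predictions to one tag per word.

    Keeps the prediction of the first subword of each word and skips
    special tokens (word id None) and continuation subwords.
    """
    tags: Dict[int, str] = {}
    previous = None
    for word_id, label_id in zip(word_ids, token_label_ids):
        if word_id is not None and word_id != previous:
            tags[word_id] = id2label[label_id]
        previous = word_id
    return tags


def tag_sentence(sentence: str) -> List[Tuple[str, Optional[str]]]:
    """Download the model and tag a whitespace-tokenized sentence."""
    # Imported here so the helper above works without these packages installed.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForTokenClassification.from_pretrained(MODEL_ID)
    model.eval()

    words = sentence.split()
    encoded = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoded).logits

    token_label_ids = logits.argmax(dim=-1)[0].tolist()
    tags = word_level_tags(encoded.word_ids(0), token_label_ids, model.config.id2label)
    # A word the tokenizer dropped entirely gets None instead of a tag.
    return [(word, tags.get(index)) for index, word in enumerate(words)]
```

Calling `tag_sentence(...)` with a sentence in one of the supported languages returns a list of `(word, tag)` pairs. The tag names come from the `id2label` mapping stored in the model's configuration.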
Citing
If you are using IndicNER, please cite the following article:
@misc{mhaske2022naamapadam,
doi = {10.48550/ARXIV.2212.10168},
url = {https://arxiv.org/abs/2212.10168},
author = {Mhaske, Arnav and Kedia, Harshit and Doddapaneni, Sumanth and Khapra, Mitesh M. and Kumar, Pratyush and Murthy, Rudra and Kunchukuttan, Anoop},
title = {Naamapadam: A Large-Scale Named Entity Annotated Data for Indic Languages},
publisher = {arXiv},
year = {2022},
copyright = {arXiv.org perpetual, non-exclusive license}
}
We would like to hear from you if:
You are using our resources. Please let us know how you are putting these resources to use.
You have any feedback on these resources.
License
The IndicNER code (and models) are released under the MIT License.