ai4bharat / indictrans2-indic-indic-dist-320M

huggingface.co
Total runs: 732
24-hour runs: -17
7-day runs: -2.0K
30-day runs: -7.2K
Model's Last Updated: January 18, 2025
translation

Model Details of indictrans2-indic-indic-dist-320M

IndicTrans2

This is the model card for the IndicTrans2 Indic-Indic Distilled 320M variant, which was adapted after stitching together the Indic-En Distilled 200M and En-Indic Distilled 200M variants.

Please refer to the blog for further details on model training, data and metrics.

Usage Instructions

Please refer to the GitHub repository for a detailed description of how to use the HF-compatible IndicTrans2 models for inference.

import torch
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
)
from IndicTransTokenizer import IndicProcessor


model_name = "ai4bharat/indictrans2-indic-indic-dist-320M"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

model = AutoModelForSeq2SeqLM.from_pretrained(model_name, trust_remote_code=True)

ip = IndicProcessor(inference=True)

input_sentences = [
    "जब मैं छोटा था, मैं हर रोज़ पार्क जाता था।",
    "हमने पिछले सप्ताह एक नई फिल्म देखी जो कि बहुत प्रेरणादायक थी।",
    "अगर तुम मुझे उस समय पास मिलते, तो हम बाहर खाना खाने चलते।",
    "मेरे मित्र ने मुझे उसके जन्मदिन की पार्टी में बुलाया है, और मैं उसे एक तोहफा दूंगा।",
]

src_lang, tgt_lang = "hin_Deva", "tam_Taml"

batch = ip.preprocess_batch(
    input_sentences,
    src_lang=src_lang,
    tgt_lang=tgt_lang,
)

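DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Move the model to the same device as the inputs; otherwise generate() fails on GPU
model = model.to(DEVICE)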

# Tokenize the sentences and generate input encodings
inputs = tokenizer(
    batch,
    truncation=True,
    padding="longest",
    return_tensors="pt",
    return_attention_mask=True,
).to(DEVICE)

# Generate translations using the model
with torch.no_grad():
    generated_tokens = model.generate(
        **inputs,
        use_cache=True,
        min_length=0,
        max_length=256,
        num_beams=5,
        num_return_sequences=1,
    )

# Decode the generated tokens into text
with tokenizer.as_target_tokenizer():
    generated_tokens = tokenizer.batch_decode(
        generated_tokens.detach().cpu().tolist(),
        skip_special_tokens=True,
        clean_up_tokenization_spaces=True,
    )

# Postprocess the translations, including entity replacement
translations = ip.postprocess_batch(generated_tokens, lang=tgt_lang)

for input_sentence, translation in zip(input_sentences, translations):
    print(f"{src_lang}: {input_sentence}")
    print(f"{tgt_lang}: {translation}")
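The example above translates four sentences in a single call. For longer inputs it may help to split the sentences into fixed-size batches so that padding and memory use stay bounded. The following is a minimal sketch of that idea, not part of the IndicTrans2 API: `chunked` and `translate_in_batches` are hypothetical helper names, and `translate_batch` stands for any function that wraps the preprocess, tokenize, generate, decode and postprocess steps shown above.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def translate_in_batches(
    sentences: Sequence[str],
    translate_batch: Callable[[List[str]], List[str]],
    batch_size: int = 8,  # assumption: tune to the available memory
) -> List[str]:
    """Apply `translate_batch` to each chunk and keep the results in input order."""
    translations: List[str] = []
    for batch in chunked(sentences, batch_size):
        outputs = translate_batch(batch)
        # Guard against a wrapper that drops or duplicates sentences
        if len(outputs) != len(batch):
            raise ValueError("translate_batch must return one output per input")
        translations.extend(outputs)
    return translations
```

With a wrapper around the steps above, a call such as `translate_in_batches(input_sentences, my_translate_fn, batch_size=4)` would return the translations in the same order as the inputs; `my_translate_fn` is a placeholder for that wrapper.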

Note: IndicTrans2 is now compatible with AutoTokenizer; however, you still need to use IndicProcessor from IndicTransTokenizer to preprocess the sentences before tokenization.
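The usage example passes language tags such as `hin_Deva` and `tam_Taml` to the preprocessor. Judging only from those two tags, each one appears to combine a three-letter language code with a four-letter script code. The sketch below checks that shape before any model call; `split_lang_tag` is a hypothetical helper, the pattern is an assumption inferred from the example, and it does not verify that the model actually supports a given language.

```python
import re
from typing import Tuple

# Assumed shape: three lowercase letters, underscore, capitalised four-letter script
_TAG_PATTERN = re.compile(r"[a-z]{3}_[A-Z][a-z]{3}")


def split_lang_tag(tag: str) -> Tuple[str, str]:
    """Split a tag like 'hin_Deva' into (language, script), or raise ValueError."""
    if not _TAG_PATTERN.fullmatch(tag):
        raise ValueError(f"unexpected language tag format: {tag!r}")
    language, script = tag.split("_")
    return language, script


for tag in ("hin_Deva", "tam_Taml"):
    language, script = split_lang_tag(tag)
    print(f"{tag}: language={language}, script={script}")
```

A check like this only catches typos such as `hi_Deva` or `hin-Deva` early; the list of supported tags should still be taken from the IndicTrans2 documentation.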

Citation

If you use our work, please cite:

@article{gala2023indictrans,
title={IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages},
author={Jay Gala and Pranjal A Chitale and A K Raghavan and Varun Gumma and Sumanth Doddapaneni and Aswanth Kumar M and Janki Atul Nawale and Anupama Sujatha and Ratish Puduppully and Vivek Raghavan and Pratyush Kumar and Mitesh M Khapra and Raj Dabre and Anoop Kunchukuttan},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2023},
url={https://openreview.net/forum?id=vfT4YuzAYA},
note={}
}

More Information About indictrans2-indic-indic-dist-320M

License: MIT (https://choosealicense.com/licenses/mit)

indictrans2-indic-indic-dist-320M URL on huggingface.co

https://huggingface.co/ai4bharat/indictrans2-indic-indic-dist-320M

Provider of indictrans2-indic-indic-dist-320M

ai4bharat (organization)
