Salesforce / codet5p-220m-bimodal

Model's Last Updated: July 19 2023
feature-extraction

CodeT5+ 220M Bimodal Models

Model description

CodeT5+ is a new family of open code large language models with an encoder-decoder architecture that can flexibly operate in different modes (i.e., encoder-only, decoder-only, and encoder-decoder) to support a wide range of code understanding and generation tasks. It is introduced in the paper:

CodeT5+: Open Code Large Language Models for Code Understanding and Generation by Yue Wang*, Hung Le*, Akhilesh Deepak Gotmare, Nghi D.Q. Bui, Junnan Li, Steven C.H. Hoi (* indicates equal contribution).

Compared to the original CodeT5 family (base: 220M, large: 770M), CodeT5+ is pretrained with a diverse set of pretraining tasks including span denoising, causal language modeling, contrastive learning, and text-code matching to learn rich representations from both unimodal code data and bimodal code-text data. Additionally, it employs a simple yet effective compute-efficient pretraining method to initialize the model components with frozen off-the-shelf LLMs such as CodeGen to efficiently scale up the model (i.e., 2B, 6B, 16B), and adopts a "shallow encoder and deep decoder" architecture. Furthermore, it is instruction-tuned to align with natural language instructions (see our InstructCodeT5+ 16B) following Code Alpaca.
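To make the span denoising task more concrete, the following is a minimal sketch of how a T5-style corrupted input and its target might be built from a token list. It is an illustration only, not the CodeT5+ training code: the function name span_denoise, the fixed span positions, and the <extra_id_N> sentinel naming (the usual T5 convention) are assumptions, and real pretraining samples spans at random.

```python
def span_denoise(tokens, spans):
    """Mask the given spans with sentinels and build the matching target.

    tokens: list of string tokens.
    spans:  sorted, non-overlapping (start, end) pairs, end exclusive.
    Returns (source, target) token lists.
    """
    source, target = [], []
    cursor = 0
    for sentinel_id, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{sentinel_id}>"  # assumed T5-style sentinel name
        source.extend(tokens[cursor:start])     # keep the text before the span
        source.append(sentinel)                 # replace the span with a sentinel
        target.append(sentinel)                 # target announces the sentinel...
        target.extend(tokens[start:end])        # ...followed by the masked tokens
        cursor = end
    source.extend(tokens[cursor:])              # keep the text after the last span
    return source, target


tokens = "def add ( a , b ) : return a + b".split()
# Hypothetical spans chosen by hand for the example.
source, target = span_denoise(tokens, [(1, 2), (9, 12)])
print(" ".join(source))
print(" ".join(target))
```

The model receives the source sequence and is trained to generate the target sequence, so it has to recover the masked code from its surrounding context.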

How to use

This model can be easily loaded using the AutoModel functionality and employs the CodeT5 tokenizer with three special tokens added ([ENC], [TDEC], [CDEC]). This checkpoint consists of a CodeT5+ 220M model, a projection layer, and an itm_head layer for text-code matching.
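As a rough illustration of how projected embeddings can be used for text-code retrieval, the sketch below ranks code snippets against a text query by cosine similarity. It does not call the checkpoint: the vectors are hand-written toy values, the helper names are hypothetical, and cosine similarity is assumed here because it is a common choice for contrastively trained encoders. The itm_head, by contrast, is a matching classifier and is not modeled in this sketch.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def rank_by_similarity(query, candidates):
    """Return (name, cosine similarity) pairs, best match first."""
    q = l2_normalize(query)
    scored = []
    for name, vec in candidates.items():
        c = l2_normalize(vec)
        scored.append((name, sum(a * b for a, b in zip(q, c))))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy stand-ins for a projected text embedding and projected code embeddings.
text_embedding = [1.0, 0.0, 1.0]
code_embeddings = {
    "snippet_a": [2.0, 0.0, 2.0],  # same direction as the query
    "snippet_b": [0.0, 3.0, 0.0],  # orthogonal to the query
    "snippet_c": [1.0, 1.0, 0.0],  # partially aligned with the query
}

ranking = rank_by_similarity(text_embedding, code_embeddings)
best_match = ranking[0][0]
print(ranking)
```

In practice the vectors would come from the encoder and projection layer of the loaded model rather than from hand-written lists.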

from transformers import AutoModel, AutoTokenizer

checkpoint = "Salesforce/codet5p-220m-bimodal"
device = "cuda"  # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModel.from_pretrained(checkpoint, trust_remote_code=True).to(device)

Pretraining data

This checkpoint is trained on the stricter permissive subset of the deduplicated version of the github-code dataset. The data is preprocessed by retaining only permissively licensed code ("mit", "apache-2", "bsd-3-clause", "bsd-2-clause", "cc0-1.0", "unlicense", "isc"). Supported languages (9 in total) are as follows: c, c++, c-sharp, go, java, javascript, php, python, ruby.
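The filtering described above can be sketched as a simple predicate over dataset records. This is an illustrative reconstruction, not the authors' preprocessing script: the record layout and the field names "license", "language", and "path" are assumptions, while the license and language identifiers are copied from the lists above.

```python
# License and language identifiers as listed in the model card.
PERMISSIVE_LICENSES = {
    "mit", "apache-2", "bsd-3-clause", "bsd-2-clause",
    "cc0-1.0", "unlicense", "isc",
}
SUPPORTED_LANGUAGES = {
    "c", "c++", "c-sharp", "go", "java",
    "javascript", "php", "python", "ruby",
}


def keep_record(record):
    """Keep a record only if both its license and language are allowed."""
    return (
        record["license"].lower() in PERMISSIVE_LICENSES
        and record["language"].lower() in SUPPORTED_LANGUAGES
    )


# Hypothetical records; the field names are assumed for this example.
records = [
    {"path": "a.py", "language": "python", "license": "mit"},
    {"path": "b.rs", "language": "rust", "license": "mit"},        # unsupported language
    {"path": "c.go", "language": "go", "license": "gpl-3.0"},      # non-permissive license
    {"path": "d.rb", "language": "Ruby", "license": "ISC"},        # case-insensitive match
]

kept = [r for r in records if keep_record(r)]
print([r["path"] for r in kept])
```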

Training procedure

This checkpoint is first trained on unimodal code data in first-stage pretraining and then on bimodal text-code pair data using the proposed mixture of pretraining tasks. Please refer to the paper for more details.

Evaluation results

Please refer to the paper and the official GitHub repo for more details.

BibTeX entry and citation info
@article{wang2023codet5plus,
  title={CodeT5+: Open Code Large Language Models for Code Understanding and Generation},
  author={Wang, Yue and Le, Hung and Gotmare, Akhilesh Deepak and Bui, Nghi D.Q. and Li, Junnan and Hoi, Steven C. H.},
  journal={arXiv preprint},
  year={2023}
}

License

BSD 3-Clause: https://choosealicense.com/licenses/bsd-3-clause

Model page on huggingface.co

https://huggingface.co/Salesforce/codet5p-220m-bimodal

Provider of codet5p-220m-bimodal huggingface.co

Salesforce