CodeT5+
CodeT5+ is a new family of open code large language models with an encoder-decoder architecture that can flexibly operate in different modes (i.e., encoder-only, decoder-only, and encoder-decoder) to support a wide range of code understanding and generation tasks.
It is introduced in the paper: CodeT5+: Open Code Large Language Models for Code Understanding and Generation by Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D.Q. Bui, Junnan Li, and Steven C. H. Hoi.
Compared to the original CodeT5 family (base: 220M, large: 770M), CodeT5+ is pretrained with a diverse set of pretraining tasks including span denoising, causal language modeling, contrastive learning, and text-code matching to learn rich representations from both unimodal code data and bimodal code-text data.
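To make the contrastive learning task concrete, here is a generic InfoNCE-style sketch (not the paper's exact implementation): paired text and code embeddings in a batch are pulled together while unpaired combinations are pushed apart.

import torch
import torch.nn.functional as F

def text_code_contrastive_loss(text_emb, code_emb, temperature=0.07):
    # Normalize so dot products become cosine similarities
    text_emb = F.normalize(text_emb, dim=-1)
    code_emb = F.normalize(code_emb, dim=-1)
    # (B, B) similarity matrix; diagonal entries are the true text-code pairs
    logits = text_emb @ code_emb.t() / temperature
    labels = torch.arange(logits.size(0), device=logits.device)
    # Symmetric cross-entropy over text-to-code and code-to-text retrieval
    return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2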
Additionally, it employs a simple yet effective compute-efficient pretraining method to initialize the model components with frozen off-the-shelf LLMs such as CodeGen to efficiently scale up the model (i.e., 2B, 6B, 16B), and adopts a "shallow encoder and deep decoder" architecture. Furthermore, it is instruction-tuned to align with natural language instructions (see our InstructCodeT5+ 16B) following Code Alpaca.
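A minimal sketch of the compute-efficient initialization described above, assuming a generic PyTorch encoder-decoder whose decoder comes from an off-the-shelf LLM (module and parameter names here are placeholders, not the paper's code): the deep decoder stays frozen while the shallow encoder and the cross-attention layers remain trainable.

def freeze_decoder_keep_cross_attention(model):
    # Freeze the deep, pretrained decoder ...
    for name, param in model.decoder.named_parameters():
        # ... except the cross-attention layers that connect it to the
        # newly attached shallow encoder (the name pattern is hypothetical)
        param.requires_grad = "cross_attention" in name
    # The shallow encoder remains fully trainable
    for param in model.encoder.parameters():
        param.requires_grad = True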
How to use
This model can be easily loaded using the T5ForConditionalGeneration functionality and employs the same tokenizer as the original CodeT5.
from transformers import T5ForConditionalGeneration, AutoTokenizer
checkpoint = "Salesforce/codet5p-220m"
device = "cuda"# for GPU usage or "cpu" for CPU usage
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint).to(device)
inputs = tokenizer.encode("def print_hello_world():<extra_id_0>", return_tensors="pt").to(device)
outputs = model.generate(inputs, max_length=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# ==> print "Hello World"
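Because the architecture can also operate encoder-only, the loaded model's encoder can be used on its own, for example to embed code snippets. Below is a minimal sketch reusing the model and tokenizer from above; mean pooling is an illustrative choice, not something the model card prescribes.

import torch

code_snippet = "def add(a, b): return a + b"
enc = tokenizer(code_snippet, return_tensors="pt").to(device)
with torch.no_grad():
    hidden = model.encoder(**enc).last_hidden_state  # (1, seq_len, d_model)
embedding = hidden.mean(dim=1)  # simple mean pooling over tokens
print(embedding.shape)  # e.g. torch.Size([1, 768]) for this 220m checkpoint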
Pretraining data
This checkpoint is trained on the stricter permissive subset of the deduplicated version of the github-code dataset.
The data is preprocessed by retaining only permissively licensed code ("mit", "apache-2", "bsd-3-clause", "bsd-2-clause", "cc0-1.0", "unlicense", "isc").
Supported languages (9 in total) are as follows: c, c++, c-sharp, go, java, javascript, php, python, ruby.
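A hypothetical sketch of that license filtering with the datasets library; the dataset id, the "license" column, and the exact license tags below are assumptions based on the public github-code dataset and should be verified. Streaming avoids a full download.

from datasets import load_dataset

# License tags as listed above; check them against the dataset's "license" field
PERMISSIVE = {"mit", "apache-2.0", "bsd-3-clause", "bsd-2-clause",
              "cc0-1.0", "unlicense", "isc"}

ds = load_dataset("codeparrot/github-code", split="train", streaming=True)
ds = ds.filter(lambda ex: ex["license"] in PERMISSIVE)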
Training procedure
This checkpoint is trained on unimodal code data in the first-stage pretraining, with a diverse set of pretraining tasks including span denoising and two variants of causal language modeling.
Please refer to the paper for more details.
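As a toy illustration of the span denoising objective (a simplified sketch, not the paper's preprocessing code): random token spans in the source are collapsed into sentinel tokens such as <extra_id_0>, and the target reconstructs the masked spans. This is also why the usage example above prompts the model with <extra_id_0>.

import random

def span_denoise(tokens, mask_rate=0.15, mean_span=3):
    # Replace roughly mask_rate of the tokens with sentinels, in spans of ~mean_span
    source, target, sid, i = [], [], 0, 0
    while i < len(tokens):
        if random.random() < mask_rate / mean_span:
            source.append(f"<extra_id_{sid}>")
            target.append(f"<extra_id_{sid}>")
            target.extend(tokens[i:i + mean_span])  # the span to reconstruct
            sid += 1
            i += mean_span
        else:
            source.append(tokens[i])
            i += 1
    return " ".join(source), " ".join(target)

source, target = span_denoise("def print_hello_world ( ) : print ( 'Hello World' )".split())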
Evaluation results
CodeT5+ models have been comprehensively evaluated on a wide range of code understanding and generation tasks in various settings:
zero-shot
,
finetuning
, and
instruction-tuning
.
Specifically, CodeT5+ yields substantial performance gains on many downstream tasks compared to SoTA baselines, e.g., 8 text-to-code retrieval tasks (+3.2 avg. MRR), 2 line-level code completion tasks (+2.1 avg. Exact Match), and 2 retrieval-augmented code generation tasks (+5.8 avg. BLEU-4).
In 2 math programming tasks on MathQA-Python and GSM8K-Python, CodeT5+ models of below billion-parameter sizes significantly outperform many LLMs of up to 137B parameters.
Particularly, in the zero-shot text-to-code generation task on the HumanEval benchmark, InstructCodeT5+ 16B sets new SoTA results of 35.0% pass@1 and 54.5% pass@10 against other open code LLMs, even surpassing the closed-source OpenAI code-cushman-001 model.
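For context, pass@k on HumanEval is typically computed with the standard unbiased estimator (this note is background, not from the model card): for each problem, n samples are generated, c of them pass the unit tests, and the per-problem scores are averaged as

\text{pass@}k = \mathbb{E}_{\text{problems}}\left[ 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}} \right]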
Please refer to the paper for more details.
BibTeX entry and citation info
@article{wang2023codet5plus,
title={CodeT5+: Open Code Large Language Models for Code Understanding and Generation},
author={Wang, Yue and Le, Hung and Gotmare, Akhilesh Deepak and Bui, Nghi D.Q. and Li, Junnan and Hoi, Steven C. H.},
journal={arXiv preprint},
year={2023}
}