import torch
from unixcoder import UniXcoder
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = UniXcoder("microsoft/unixcoder-base")
model.to(device)
In the following, we give zero-shot examples for several tasks under different modes, including code search (encoder-only), code completion (decoder-only), function name prediction (encoder-decoder), API recommendation (encoder-decoder), and code summarization (encoder-decoder).
3. Encoder-only Mode
For encoder-only mode, we give an example of code search.
1) Code and NL Embeddings
Here, we give an example of obtaining code fragment embeddings from UniXcoder.
# Encode maximum function
func = "def f(a,b): if a>b: return a else return b"
tokens_ids = model.tokenize([func],max_length=512,mode="<encoder-only>")
source_ids = torch.tensor(tokens_ids).to(device)
tokens_embeddings,max_func_embedding = model(source_ids)
# Encode minimum function
func = "def f(a,b): if a<b: return a else return b"
tokens_ids = model.tokenize([func],max_length=512,mode="<encoder-only>")
source_ids = torch.tensor(tokens_ids).to(device)
tokens_embeddings,min_func_embedding = model(source_ids)
# Encode NL
nl = "return maximum value"
tokens_ids = model.tokenize([nl],max_length=512,mode="<encoder-only>")
source_ids = torch.tensor(tokens_ids).to(device)
tokens_embeddings,nl_embedding = model(source_ids)
print(max_func_embedding.shape)
print(max_func_embedding)
Now, we calculate the cosine similarity between the NL query and the two functions. Although the two functions differ by only one operator (< and >), UniXcoder can distinguish them.
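The similarity computation itself is not shown here. As a minimal sketch, assuming each embedding has shape (1, hidden_size) and has been converted to a flat list of floats (for example with `nl_embedding[0].tolist()`), cosine similarity could be computed as follows. The helper name `cosine_similarity` and the toy vectors are illustrative stand-ins, not UniXcoder outputs.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy 3-dimensional stand-ins for the real embeddings (hypothetical values).
nl_vec = [1.0, 0.0, 1.0]
max_func_vec = [1.0, 0.1, 0.9]
min_func_vec = [0.2, 1.0, 0.1]

max_score = cosine_similarity(nl_vec, max_func_vec)
min_score = cosine_similarity(nl_vec, min_func_vec)
print("NL vs. max function:", round(max_score, 3))
print("NL vs. min function:", round(min_score, 3))
```

With the tensors from the example, one would pass `nl_embedding[0].tolist()` together with `max_func_embedding[0].tolist()` or `min_func_embedding[0].tolist()`. A higher score for the maximum function would indicate that the model separates the two.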
For decoder-only mode, we give an example of code completion.
context = """
def f(data,file_path):
    # write json data into file_path in python language
"""
tokens_ids = model.tokenize([context],max_length=512,mode="<decoder-only>")
source_ids = torch.tensor(tokens_ids).to(device)
prediction_ids = model.generate(source_ids, decoder_only=True, beam_size=3, max_length=128)
predictions = model.decode(prediction_ids)
print(context+predictions[0][0])
def f(data,file_path):
    # write json data into file_path in python language
    data = json.dumps(data)
    with open(file_path, 'w') as f:
        f.write(data)
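The call to `print` in this example shows only the top beam. Since `predictions[0]` holds one candidate string per beam, all candidates can be inspected side by side. The sketch below uses a hard-coded `predictions` list as a stand-in for `model.decode(prediction_ids)`, so it runs without the model; the helper `list_completions` and the candidate strings are hypothetical.

```python
def list_completions(context, predictions, input_index=0):
    """Return context + candidate for every beam of one input."""
    return [context + candidate for candidate in predictions[input_index]]


context = "def f(data,file_path):\n    # write json data into file_path\n"

# Stand-in for model.decode(prediction_ids):
# one inner list per input, one string per beam (hypothetical values).
predictions = [[
    "    data = json.dumps(data)\n",
    "    json.dump(data, open(file_path, 'w'))\n",
    "    pass\n",
]]

completions = list_completions(context, predictions)
for rank, completion in enumerate(completions, start=1):
    print(f"--- beam {rank} ---")
    print(completion)
```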
4. Encoder-Decoder Mode
For encoder-decoder mode, we give three examples: function name prediction, API recommendation, and code summarization.
1) Function Name Prediction
context = """
def <mask0>(data,file_path):
    data = json.dumps(data)
    with open(file_path, 'w') as f:
        f.write(data)
"""
tokens_ids = model.tokenize([context],max_length=512,mode="<encoder-decoder>")
source_ids = torch.tensor(tokens_ids).to(device)
prediction_ids = model.generate(source_ids, decoder_only=False, beam_size=3, max_length=128)
predictions = model.decode(prediction_ids)
print([x.replace("<mask0>","").strip() for x in predictions[0]])
['write_json', 'write_file', 'to_json']
2) API Recommendation
context = """
def write_json(data,file_path):
    data = <mask0>(data)
    with open(file_path, 'w') as f:
        f.write(data)
"""
tokens_ids = model.tokenize([context],max_length=512,mode="<encoder-decoder>")
source_ids = torch.tensor(tokens_ids).to(device)
prediction_ids = model.generate(source_ids, decoder_only=False, beam_size=3, max_length=128)
predictions = model.decode(prediction_ids)
print([x.replace("<mask0>","").strip() for x in predictions[0]])
['json.dumps', 'json.loads', 'str']
3) Code Summarization
context = """
# <mask0>
def write_json(data,file_path):
    data = json.dumps(data)
    with open(file_path, 'w') as f:
        f.write(data)
"""
tokens_ids = model.tokenize([context],max_length=512,mode="<encoder-decoder>")
source_ids = torch.tensor(tokens_ids).to(device)
prediction_ids = model.generate(source_ids, decoder_only=False, beam_size=3, max_length=128)
predictions = model.decode(prediction_ids)
print([x.replace("<mask0>","").strip() for x in predictions[0]])
['Write JSON to file', 'Write json to file', 'Write a json file']
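All three encoder-decoder examples share the same post-processing step: remove the `<mask0>` token from each beam and strip whitespace. As a small sketch, that step could be factored into a helper, with a second helper that substitutes a chosen candidate back into the masked context. Both helper names (`clean_predictions`, `fill_mask`) and the raw beam strings are hypothetical; the raw strings only imitate what `model.decode` might return.

```python
def clean_predictions(beams, mask="<mask0>"):
    """Strip the mask token and surrounding whitespace from each candidate."""
    return [text.replace(mask, "").strip() for text in beams]


def fill_mask(context, candidate, mask="<mask0>"):
    """Substitute one cleaned candidate for the first mask in the context."""
    return context.replace(mask, candidate, 1)


context = "def <mask0>(data,file_path):\n    pass\n"

# Hypothetical raw decoder output for one input (one string per beam).
raw_beams = ["<mask0>write_json", "<mask0>write_file ", " <mask0>to_json"]

names = clean_predictions(raw_beams)
filled = fill_mask(context, names[0])
print(names)
print(filled)
```

In the examples above, `raw_beams` would correspond to `predictions[0]`.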
Reference
If you use this code or UniXcoder, please consider citing us.
@article{guo2022unixcoder,
title={UniXcoder: Unified Cross-Modal Pre-training for Code Representation},
author={Guo, Daya and Lu, Shuai and Duan, Nan and Wang, Yanlin and Zhou, Ming and Yin, Jian},
journal={arXiv preprint arXiv:2203.03850},
year={2022}
}