Using this model becomes easy when you have text2vec installed:
pip install -U text2vec
Then you can use the model like this:
from text2vec import SentenceModel
sentences = ['如何更换花呗绑定银行卡', '花呗更改绑定银行卡']
model = SentenceModel('shibing624/text2vec-base-chinese-sentence')
embeddings = model.encode(sentences)
print(embeddings)
Usage (HuggingFace Transformers)
Without text2vec, you can use the model like this:
First, pass your input through the transformer model, then apply the right pooling operation on top of the contextualized word embeddings.
Install transformers:
pip install transformers
Then load model and predict:
from transformers import BertTokenizer, BertModel
import torch
# Mean Pooling - Take attention mask into account for correct averaging
def mean_pooling(model_output, attention_mask):
    token_embeddings = model_output[0]  # First element of model_output contains all token embeddings
    input_mask_expanded = attention_mask.unsqueeze(-1).expand(token_embeddings.size()).float()
    return torch.sum(token_embeddings * input_mask_expanded, 1) / torch.clamp(input_mask_expanded.sum(1), min=1e-9)
# Load model from HuggingFace Hub
tokenizer = BertTokenizer.from_pretrained('shibing624/text2vec-base-chinese-sentence')
model = BertModel.from_pretrained('shibing624/text2vec-base-chinese-sentence')
sentences = ['如何更换花呗绑定银行卡', '花呗更改绑定银行卡']
# Tokenize sentences
encoded_input = tokenizer(sentences, padding=True, truncation=True, return_tensors='pt')
# Compute token embeddings
with torch.no_grad():
    model_output = model(**encoded_input)
# Perform pooling. In this case, mean pooling.
sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
print("Sentence embeddings:")
print(sentence_embeddings)
Usage (sentence-transformers)
sentence-transformers is a popular library to compute dense vector representations for sentences.
Install sentence-transformers:
pip install -U sentence-transformers
Then load model and predict:
from sentence_transformers import SentenceTransformer
m = SentenceTransformer("shibing624/text2vec-base-chinese-sentence")
sentences = ['如何更换花呗绑定银行卡', '花呗更改绑定银行卡']
sentence_embeddings = m.encode(sentences)
print("Sentence embeddings:")
print(sentence_embeddings)
Our model is intended to be used as a sentence and short paragraph encoder. Given an input text, it outputs a vector that captures
the semantic information. The sentence vector may be used for information retrieval, clustering, or sentence similarity tasks.
By default, input text longer than 256 word pieces is truncated.
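For the sentence similarity use case, embeddings are typically compared with cosine similarity. The following is a minimal sketch in plain Python. The `cosine_similarity` helper and the toy three-dimensional vectors are illustrative assumptions, not part of text2vec; real embeddings returned by `model.encode` are much longer vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Toy stand-ins for sentence embeddings (hypothetical values).
emb_a = [1.0, 2.0, 3.0]
emb_b = [2.0, 4.0, 6.0]   # same direction as emb_a
emb_c = [3.0, 0.0, -1.0]  # orthogonal to emb_a

print(cosine_similarity(emb_a, emb_b))  # close to 1: same direction
print(cosine_similarity(emb_a, emb_c))  # close to 0: unrelated direction
```

With real model output, each row of the embeddings array can be passed to the helper after converting it to a list, or the same formula can be applied with a vectorized library.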
Training procedure
Pre-training
We use the pretrained nghuyong/ernie-3.0-base-zh model.
Please refer to the model card for more detailed information about the pre-training procedure.
Fine-tuning
We fine-tune the model using a contrastive objective. Formally, we compute the cosine similarity for each
possible sentence pair in the batch.
We then apply a rank loss that compares the scores of true pairs against those of false pairs.
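The text above does not spell out the exact form of the rank loss. As a rough illustration, the sketch below assumes a CoSENT-style formulation, which penalizes every case where a false pair scores higher than a true pair. The function name `rank_loss`, the `scale` value of 20, and the toy scores are assumptions made for this example and are not taken from the training code.

```python
import math


def rank_loss(cos_scores, labels, scale=20.0):
    """CoSENT-style rank loss over scored sentence pairs (assumed form).

    cos_scores: cosine similarity of each sentence pair
    labels:     1 for a true (similar) pair, 0 for a false pair
    scale:      multiplier on score differences (assumed value)
    """
    # The leading 0.0 contributes exp(0) = 1, giving the "1 +" inside the log.
    terms = [0.0]
    for s_pos, y_pos in zip(cos_scores, labels):
        for s_neg, y_neg in zip(cos_scores, labels):
            if y_pos > y_neg:
                # Positive when the false pair outscores the true pair.
                terms.append(scale * (s_neg - s_pos))
    # Numerically stable log-sum-exp.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


# Toy scores: one true pair and one false pair (hypothetical values).
well_ranked = rank_loss([0.9, 0.1], [1, 0])  # true pair scores higher
mis_ranked = rank_loss([0.1, 0.9], [1, 0])   # false pair scores higher

print(well_ranked, mis_ranked)
```

The loss stays near zero when true pairs are ranked above false pairs and grows roughly linearly with the size of the violation otherwise.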
If you find this model helpful, feel free to cite:
@software{text2vec,
author = {Ming Xu},
title = {text2vec: A Tool for Text to Vector},
year = {2023},
url = {https://github.com/shibing624/text2vec},
}