michaelfeil / ct2fast-all-MiniLM-L12-v2

huggingface.co
Total runs: 4
24-hour runs: -2
7-day runs: -2
30-day runs: -1
Model last updated: October 13, 2023

Model Details of ct2fast-all-MiniLM-L12-v2

pipeline_tag: sentence-similarity
tags:

  • ctranslate2
  • int8
  • float16
  • sentence-transformers
  • feature-extraction
  • sentence-similarity

language: en
license: apache-2.0
datasets:
  • s2orc
  • flax-sentence-embeddings/stackexchange_xml
  • MS Marco
  • gooaq
  • yahoo_answers_topics
  • code_search_net
  • search_qa
  • eli5
  • snli
  • multi_nli
  • wikihow
  • natural_questions
  • trivia_qa
  • embedding-data/sentence-compression
  • embedding-data/flickr30k-captions
  • embedding-data/altlex
  • embedding-data/simple-wiki
  • embedding-data/QQP
  • embedding-data/SPECTER
  • embedding-data/PAQ_pairs
  • embedding-data/WikiAnswers

# Fast-Inference with CTranslate2

Speed up inference and reduce memory use by 2x-4x with int8 inference in C++ on CPU or GPU.

This is a quantized version of sentence-transformers/all-MiniLM-L12-v2.
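To see where a 2x-4x figure can come from, the sketch below quantizes a random float32 matrix to int8 with a single symmetric scale and compares storage sizes against float32 and float16. It is a generic illustration using NumPy only: the matrix shape, the per-tensor scale, and the helper names `quantize_int8` and `dequantize` are assumptions made for this example, not a description of how CTranslate2 quantizes weights internally.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization: map [-max|w|, max|w|] onto [-127, 127]."""
    scale = float(np.abs(weights).max()) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float32 values."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w_fp32 = rng.standard_normal((384, 384)).astype(np.float32)  # hypothetical weight matrix
w_fp16 = w_fp32.astype(np.float16)
w_int8, scale = quantize_int8(w_fp32)

ratio_vs_fp32 = w_fp32.nbytes / w_int8.nbytes  # 4 bytes per value vs 1 byte
ratio_vs_fp16 = w_fp16.nbytes / w_int8.nbytes  # 2 bytes per value vs 1 byte
max_error = float(np.abs(dequantize(w_int8, scale) - w_fp32).max())

print("int8 vs float32 size ratio:", ratio_vs_fp32)
print("int8 vs float16 size ratio:", ratio_vs_fp16)
print("largest round-trip error:", max_error, "(about half of scale =", scale, ")")
```

The storage saving depends on the baseline: one byte per value instead of four (float32) or two (float16). The price is a rounding error of roughly half a quantization step per weight.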

pip install "hf-hub-ctranslate2>=2.12.0" "ctranslate2>=3.17.1"

# from transformers import AutoTokenizer
model_name = "michaelfeil/ct2fast-all-MiniLM-L12-v2"
model_name_orig = "sentence-transformers/all-MiniLM-L12-v2"

from hf_hub_ctranslate2 import EncoderCT2fromHfHub
model = EncoderCT2fromHfHub(
    # load in int8 on CUDA
    model_name_or_path=model_name,
    device="cuda",
    compute_type="int8_float16",
)
outputs = model.generate(
    text=["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
    max_length=64,
)  # perform downstream tasks on outputs
outputs["pooler_output"]
outputs["last_hidden_state"]
outputs["attention_mask"]

# Alternative: use the SentenceTransformer mix-in
# for end-to-end sentence embedding generation
# (loads the original model, not this CT2fast-HF repo)

from hf_hub_ctranslate2 import CT2SentenceTransformer
model = CT2SentenceTransformer(
    model_name_orig, compute_type="int8_float16", device="cuda"
)
embeddings = model.encode(
    ["I like soccer", "I like tennis", "The eiffel tower is in Paris"],
    batch_size=32,
    convert_to_numpy=True,
    normalize_embeddings=True,
)
print(embeddings.shape, embeddings)
scores = (embeddings @ embeddings.T) * 100

# Hint: you can also host this code behind a REST API
# via github.com/michaelfeil/infinity
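The first snippet leaves "downstream tasks on outputs" open. One common choice for models in the all-MiniLM family is masked mean pooling over `last_hidden_state`, followed by L2 normalization, after which the matrix product used for `scores` equals cosine similarity scaled by 100. The sketch below shows this on random stand-in arrays with NumPy. The shapes, the random data, and the helper names `mean_pool` and `l2_normalize` are assumptions for illustration, and it assumes the model outputs have already been converted to NumPy arrays of shape (batch, tokens, hidden) and (batch, tokens).

```python
import numpy as np

def mean_pool(last_hidden_state: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sentence, ignoring padded positions (mask == 0)."""
    mask = attention_mask[..., None].astype(last_hidden_state.dtype)  # (batch, tokens, 1)
    summed = (last_hidden_state * mask).sum(axis=1)                   # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                    # avoid division by zero
    return summed / counts

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products become cosine similarities."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)

# Random stand-ins for model outputs: 3 sentences, 6 token slots, hidden size 384.
rng = np.random.default_rng(0)
last_hidden_state = rng.standard_normal((3, 6, 384)).astype(np.float32)
attention_mask = np.array([
    [1, 1, 1, 1, 0, 0],  # last two slots are padding
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],
])

embeddings = l2_normalize(mean_pool(last_hidden_state, attention_mask))
scores = (embeddings @ embeddings.T) * 100  # cosine similarity scaled to [-100, 100]

print(embeddings.shape)
print(np.round(scores, 1))
```

Because the rows are unit length, each diagonal entry of `scores` is 100 (a sentence compared with itself), and the off-diagonal entries rank how similar each pair of sentences is.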

The checkpoint is compatible with ctranslate2>=3.17.1 and hf-hub-ctranslate2>=2.12.0. Recommended settings:

  • compute_type=int8_float16 for device="cuda"
  • compute_type=int8 for device="cpu"
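The device-to-compute-type pairing above can be captured in a small helper so that the same script runs on either backend. This is a sketch only: `pick_compute_type` and `RECOMMENDED_COMPUTE_TYPE` are hypothetical names introduced here, not part of hf-hub-ctranslate2 or CTranslate2, and the mapping contains just the two combinations listed above.

```python
# Pairings taken from the list above; other combinations are not covered here.
RECOMMENDED_COMPUTE_TYPE = {
    "cuda": "int8_float16",
    "cpu": "int8",
}

def pick_compute_type(device: str) -> str:
    """Return the compute_type recommended for `device` ("cuda" or "cpu")."""
    try:
        return RECOMMENDED_COMPUTE_TYPE[device]
    except KeyError:
        raise ValueError(f"no recommended compute_type for device {device!r}") from None

for device in ("cuda", "cpu"):
    print(device, "->", pick_compute_type(device))
```

The result could then be passed as `compute_type=pick_compute_type(device)` alongside `device=device` when constructing `EncoderCT2fromHfHub` or `CT2SentenceTransformer`.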

Converted on 2023-10-13 using

LLama-2 -> removed <pad> token.

Licence and other remarks:

This is just a quantized version. Licence conditions are intended to be identical to those of the original Hugging Face repo.

Original description

URL of ct2fast-all-MiniLM-L12-v2 on huggingface.co

https://huggingface.co/michaelfeil/ct2fast-all-MiniLM-L12-v2
