Exscientia / IgBert

Total runs: 18.0K
24-hour runs: 566
3-day runs: 1.3K
7-day runs: 2.6K
30-day runs: 9.4K
Last updated: June 20 2024
Task: fill-mask

IgBert

IgBert is a model pretrained on protein and antibody sequences using a masked language modeling (MLM) objective. It was introduced in the paper Large scale paired antibody language models.

The model is fine-tuned from IgBert-unpaired using paired antibody sequences from the Observed Antibody Space.
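The MLM objective can be illustrated with a minimal pure-Python sketch. The 15% masking rate below is the standard BERT choice, assumed here for illustration (the real procedure also uses an 80/10/10 mask/random/keep split, omitted for brevity):

```python
import random

def mask_residues(sequence, mask_rate=0.15, seed=0):
    """Replace a fraction of residue tokens with [MASK], as in BERT-style
    masked language modelling (simplified: no 80/10/10 split)."""
    rng = random.Random(seed)
    tokens = list(sequence)
    n_mask = max(1, int(len(tokens) * mask_rate))
    positions = rng.sample(range(len(tokens)), n_mask)
    labels = {i: tokens[i] for i in positions}  # residues the model must predict
    for i in positions:
        tokens[i] = "[MASK]"
    return " ".join(tokens), labels

masked, labels = mask_residues("EVQLVESGGGLVQPGG")
```

During pretraining, the model is trained to recover the residues stored in `labels` from the surrounding unmasked context.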

Use

The model and tokeniser can be loaded using the transformers library:

from transformers import BertModel, BertTokenizer

tokeniser = BertTokenizer.from_pretrained("Exscientia/IgBert", do_lower_case=False)
model = BertModel.from_pretrained("Exscientia/IgBert", add_pooling_layer=False)

The tokeniser is used to prepare batch inputs:

# heavy chain sequences
sequences_heavy = [
    "VQLAQSGSELRKPGASVKVSCDTSGHSFTSNAIHWVRQAPGQGLEWMGWINTDTGTPTYAQGFTGRFVFSLDTSARTAYLQISSLKADDTAVFYCARERDYSDYFFDYWGQGTLVTVSS",
    "QVQLVESGGGVVQPGRSLRLSCAASGFTFSNYAMYWVRQAPGKGLEWVAVISYDGSNKYYADSVKGRFTISRDNSKNTLYLQMNSLRTEDTAVYYCASGSDYGDYLLVYWGQGTLVTVSS"
]

# light chain sequences
sequences_light = [
    "EVVMTQSPASLSVSPGERATLSCRARASLGISTDLAWYQQRPGQAPRLLIYGASTRATGIPARFSGSGSGTEFTLTISSLQSEDSAVYYCQQYSNWPLTFGGGTKVEIK",
    "ALTQPASVSGSPGQSITISCTGTSSDVGGYNYVSWYQQHPGKAPKLMIYDVSKRPSGVSNRFSGSKSGNTASLTISGLQSEDEADYYCNSLTSISTWVFGGGTKLTVL"
]

# The tokeniser expects input of the form ["V Q ... S S [SEP] E V ... I K", ...]
paired_sequences = []
for sequence_heavy, sequence_light in zip(sequences_heavy, sequences_light):
    paired_sequences.append(' '.join(sequence_heavy)+' [SEP] '+' '.join(sequence_light))

tokens = tokeniser.batch_encode_plus(
    paired_sequences,
    add_special_tokens=True,
    padding=True,
    return_tensors="pt",
    return_special_tokens_mask=True
)

Note that the tokeniser adds a [CLS] token at the beginning of each paired sequence, a [SEP] token at the end of each paired sequence, and pads shorter sequences using the [PAD] token. For example, a batch containing the sequences V Q L [SEP] E V V and Q V [SEP] A L will be tokenised to [CLS] V Q L [SEP] E V V [SEP] and [CLS] Q V [SEP] A L [SEP] [PAD] [PAD].
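This special-token layout can be reproduced with a small pure-Python sketch (illustrative only; the real tokeniser also maps each token to a vocabulary id):

```python
def layout_tokens(paired_sequences):
    """Mimic the tokeniser's special-token layout: [CLS] at the start,
    [SEP] at the end, [PAD] up to the longest sequence in the batch.
    (The inner [SEP] between chains is already present in the input.)"""
    tokenised = [["[CLS]"] + seq.split() + ["[SEP]"] for seq in paired_sequences]
    max_len = max(len(t) for t in tokenised)
    return [t + ["[PAD]"] * (max_len - len(t)) for t in tokenised]

batch = layout_tokens(["V Q L [SEP] E V V", "Q V [SEP] A L"])
```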

Sequence embeddings are generated by feeding the tokens through the model:

output = model(
    input_ids=tokens['input_ids'], 
    attention_mask=tokens['attention_mask']
)

residue_embeddings = output.last_hidden_state

To obtain a sequence-level representation, the residue embeddings can be averaged like so:

import torch

# mask special tokens before summing over embeddings
residue_embeddings[tokens["special_tokens_mask"] == 1] = 0
sequence_embeddings_sum = residue_embeddings.sum(1)

# average embedding by dividing sum by sequence lengths
sequence_lengths = torch.sum(tokens["special_tokens_mask"] == 0, dim=1)
sequence_embeddings = sequence_embeddings_sum / sequence_lengths.unsqueeze(1)
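The masked averaging above can be checked against a plain-Python sketch with toy numbers (no torch required):

```python
def masked_mean(embeddings, special_mask):
    """Average residue embeddings per sequence, skipping special tokens
    (positions where the mask is 1), as in the torch code above."""
    means = []
    for emb, mask in zip(embeddings, special_mask):
        kept = [e for e, m in zip(emb, mask) if m == 0]
        dim = len(emb[0])
        means.append([sum(v[d] for v in kept) / len(kept) for d in range(dim)])
    return means

# one sequence of 4 positions with 2-dim embeddings;
# positions 0 and 3 are special tokens and are excluded from the mean
emb = [[[9.0, 9.0], [1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]]
mask = [[1, 0, 0, 1]]
means = masked_mean(emb, mask)  # [[2.0, 3.0]]
```

Dividing by the number of non-special positions (rather than the padded length) is what makes the average correct for batches with padding.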

For sequence-level fine-tuning, the model can be loaded with a pooling head by setting add_pooling_layer=True and using output.pooler_output in the downstream task.
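For reference, BERT's pooling head is just a dense layer followed by tanh, applied to the final hidden state at the [CLS] position; a minimal sketch with toy weights (not the trained pooler):

```python
import math

def pooler(cls_hidden, weight, bias):
    """Sketch of BERT's pooling head: pooler_output = tanh(W @ h_cls + b),
    where h_cls is the final hidden state at the [CLS] position."""
    return [
        math.tanh(sum(w * h for w, h in zip(row, cls_hidden)) + b)
        for row, b in zip(weight, bias)
    ]

# toy 2-dim example with identity weights and zero bias
out = pooler([0.5, -0.25], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With add_pooling_layer=True, transformers applies the pretrained version of this transformation for you and returns it as output.pooler_output.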


License

IgBert is released under the MIT license:

IgBert on huggingface.co

IgBert is an open-source model hosted on huggingface.co, where it can be tried online for free and called through the API (e.g. from Python, Node.js or plain HTTP):

https://huggingface.co/Exscientia/IgBert

Provider of IgBert

Exscientia
