Model card for vit_base_patch32_224.augreg_in21k_ft_in1k
A Vision Transformer (ViT) image classification model. Pretrained on ImageNet-21k and fine-tuned on ImageNet-1k (with additional augmentation and regularization) in JAX by the paper authors, then ported to PyTorch by Ross Wightman.
Model Details
Model Type:
Image classification / feature backbone
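The model name itself encodes most of these details. The sketch below splits it into the architecture and the pretrained tag, assuming the `architecture.pretrained_tag` naming pattern seen in this card; `parse_timm_name` is a hypothetical helper written for illustration, not part of timm.

```python
import re


def parse_timm_name(name: str) -> dict:
    """Split an 'architecture.pretrained_tag' style name into readable parts.

    Hypothetical helper for illustration only; it handles plain
    'vit_<variant>_patch<P>_<S>' architecture names and nothing else.
    """
    arch, _, tag = name.partition(".")
    match = re.fullmatch(
        r"vit_(?P<variant>[a-z]+)_patch(?P<patch>\d+)_(?P<img>\d+)", arch
    )
    if match is None:
        raise ValueError(f"not a plain ViT architecture name: {arch!r}")
    return {
        "architecture": arch,                  # e.g. 'vit_base_patch32_224'
        "pretrained_tag": tag,                 # e.g. 'augreg_in21k_ft_in1k'
        "variant": match["variant"],           # model size, e.g. 'base'
        "patch_size": int(match["patch"]),     # side of each square patch, in pixels
        "image_size": int(match["img"]),       # expected input resolution, in pixels
        "tag_parts": tag.split("_"),           # recipe, pretraining set, 'ft', fine-tuning set
    }


info = parse_timm_name("vit_base_patch32_224.augreg_in21k_ft_in1k")
print(info)
```

Read this way, the name says: ViT-Base with 32x32 patches on 224x224 inputs, trained with the AugReg recipe on ImageNet-21k and fine-tuned (`ft`) on ImageNet-1k.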
Image Classification
from urllib.request import urlopen
from PIL import Image
import timm
import torch  # needed for torch.topk below
img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))
model = timm.create_model('vit_base_patch32_224.augreg_in21k_ft_in1k', pretrained=True)
model = model.eval()
# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
output = model(transforms(img).unsqueeze(0)) # unsqueeze single image into batch of 1
top5_probabilities, top5_class_indices = torch.topk(output.softmax(dim=1) * 100, k=5)
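To make the last line of the snippet above concrete, here is a minimal sketch of the same softmax-then-top-k step in plain Python, with no PyTorch required. The logits are made-up values for seven imaginary classes, and `softmax` and `topk_percent` are illustrative helpers, not timm or PyTorch functions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topk_percent(logits, k=5):
    """Return (percentages, indices) of the k most probable classes, highest first."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [probs[i] * 100 for i in order], order


toy_logits = [0.1, 2.0, -1.0, 3.5, 0.0, 1.2, -0.5]  # made-up scores, not model output
percentages, indices = topk_percent(toy_logits, k=5)
print(indices)                            # class indices, most likely first
print([round(p, 1) for p in percentages])  # matching confidences in percent
```

With the real model, `top5_class_indices` plays the role of `indices` and refers to positions in the ImageNet-1k label set, while `top5_probabilities` holds the matching percentages.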
Image Embeddings
from urllib.request import urlopen
from PIL import Image
import timm
img = Image.open(urlopen(
    'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))
model = timm.create_model(
    'vit_base_patch32_224.augreg_in21k_ft_in1k',
    pretrained=True,
    num_classes=0,  # remove classifier nn.Linear
)
model = model.eval()
# get model specific transforms (normalization, resize)
data_config = timm.data.resolve_model_data_config(model)
transforms = timm.data.create_transform(**data_config, is_training=False)
output = model(transforms(img).unsqueeze(0))  # output is (batch_size, num_features) shaped tensor
# or equivalently (without needing to set num_classes=0)
output = model.forward_features(transforms(img).unsqueeze(0))
# output is unpooled, a (1, 50, 768) shaped tensor
output = model.forward_head(output, pre_logits=True)
# output is a (1, num_features) shaped tensor
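The `(1, 50, 768)` shape of the unpooled output follows from the patch arithmetic. The short sketch below reproduces it, assuming one class token is prepended to the patch tokens and that 768 is the ViT-Base embedding width; `vit_token_count` is a hypothetical helper for illustration.

```python
def vit_token_count(image_size: int, patch_size: int, num_prefix_tokens: int = 1) -> int:
    """Count the tokens a ViT sees: one per non-overlapping patch, plus prefix tokens.

    Assumes a square image split into square patches, with a single class
    token by default (an assumption stated in the surrounding text).
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be a multiple of patch_size")
    patches_per_side = image_size // patch_size  # 224 // 32 = 7
    return patches_per_side ** 2 + num_prefix_tokens  # 7 * 7 + 1 = 50


tokens = vit_token_count(image_size=224, patch_size=32)
embed_dim = 768  # assumed ViT-Base width; matches the last dimension above
print((1, tokens, embed_dim))  # (1, 50, 768)
```

Calling `forward_head(..., pre_logits=True)` then reduces the 50 tokens to a single vector per image, which is why the final output is `(1, num_features)` with `num_features` equal to 768 here.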
Model Comparison
Explore the dataset and runtime metrics of this model in timm model results.
Citation
@article{steiner2021augreg,
title={How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers},
author={Steiner, Andreas and Kolesnikov, Alexander and Zhai, Xiaohua and Wightman, Ross and Uszkoreit, Jakob and Beyer, Lucas},
journal={arXiv preprint arXiv:2106.10270},
year={2021}
}
@article{dosovitskiy2020vit,
title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
journal={ICLR},
year={2021}
}
@misc{rw2019timm,
author = {Ross Wightman},
title = {PyTorch Image Models},
year = {2019},
publisher = {GitHub},
journal = {GitHub repository},
doi = {10.5281/zenodo.4414861},
howpublished = {\url{https://github.com/huggingface/pytorch-image-models}}
}