google / owlvit-large-patch14

huggingface.co
Model's Last Updated: December 12 2023
zero-shot-object-detection

Model Card: OWL-ViT

Model Details

OWL-ViT (short for Vision Transformer for Open-World Localization) was proposed in "Simple Open-Vocabulary Object Detection with Vision Transformers" by Matthias Minderer, Alexey Gritsenko, Austin Stone, Maxim Neumann, Dirk Weissenborn, Alexey Dosovitskiy, Aravindh Mahendran, Anurag Arnab, Mostafa Dehghani, Zhuoran Shen, Xiao Wang, Xiaohua Zhai, Thomas Kipf, and Neil Houlsby. OWL-ViT is a zero-shot, text-conditioned object detection model that can query an image with one or more text queries.

OWL-ViT uses CLIP as its multi-modal backbone, with a ViT-like Transformer to extract visual features and a causal language model to extract text features. To use CLIP for detection, OWL-ViT removes the final token pooling layer of the vision model and attaches a lightweight classification head and box head to each transformer output token. Open-vocabulary classification is enabled by replacing the fixed classification layer weights with the class-name embeddings obtained from the text model. The authors first train CLIP from scratch, then fine-tune it end-to-end together with the classification and box heads on standard detection datasets using a bipartite matching loss. At inference time, one or more text queries per image can be used to perform zero-shot, text-conditioned object detection.
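The open-vocabulary classification step described above can be illustrated with a minimal, dependency-free sketch. It assumes that each image token and each text query has already been mapped to an embedding of the same size, and it scores every token against every query by cosine similarity. The function names and the toy 2-D vectors are hypothetical, and the sketch omits any learned projection or scaling that the real heads may apply.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)

def open_vocab_logits(token_embeds, query_embeds):
    """Return one row per image token holding its cosine similarity to each text query."""
    tokens = [l2_normalize(t) for t in token_embeds]
    queries = [l2_normalize(q) for q in query_embeds]
    return [[sum(a * b for a, b in zip(t, q)) for q in queries] for t in tokens]

# Toy embeddings (hypothetical values): three image tokens, two text queries.
token_embeds = [[2.0, 0.0], [0.0, 3.0], [1.0, 2.0]]
query_embeds = [[1.0, 0.0], [0.0, 1.0]]

logits = open_vocab_logits(token_embeds, query_embeds)
# Index of the best-matching query for each token.
best_query = [max(range(len(row)), key=row.__getitem__) for row in logits]
print(best_query)
```

Because the text embeddings play the role of classifier weights, changing the set of queries changes the set of classes without retraining anything in this sketch.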

Model Date

May 2022

Model Type

The model uses a CLIP backbone with a ViT-L/14 Transformer architecture as an image encoder and uses a masked self-attention Transformer as a text encoder. These encoders are trained to maximize the similarity of (image, text) pairs via a contrastive loss. The CLIP backbone is trained from scratch and fine-tuned together with the box and class prediction heads with an object detection objective.
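As a rough illustration of the contrastive objective mentioned above, the sketch below computes a symmetric cross-entropy loss over a square similarity matrix in which matched (image, text) pairs sit on the diagonal. This is a generic CLIP-style formulation written in plain Python, not the authors' training code. The function names are hypothetical, and the fixed `temperature=0.07` default is an assumption made for the example.

```python
import math

def log_softmax(row):
    """Numerically stable log-softmax of a list of floats."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def contrastive_loss(sim, temperature=0.07):
    """Symmetric contrastive loss; sim[i][j] is the similarity of image i and text j.

    Matched pairs are assumed to lie on the diagonal. The temperature value
    is an assumption for this sketch.
    """
    n = len(sim)
    scaled = [[s / temperature for s in row] for row in sim]
    # Image-to-text direction: each row should put its mass on the diagonal entry.
    img_to_txt = -sum(log_softmax(scaled[i])[i] for i in range(n)) / n
    # Text-to-image direction: the same, applied to the columns.
    cols = [[scaled[i][j] for i in range(n)] for j in range(n)]
    txt_to_img = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return (img_to_txt + txt_to_img) / 2

aligned = [[1.0, 0.0], [0.0, 1.0]]     # matched pairs are the most similar
mismatched = [[0.0, 1.0], [1.0, 0.0]]  # matched pairs are the least similar
print(contrastive_loss(aligned) < contrastive_loss(mismatched))
```

The loss is low when each image is most similar to its own caption and high when the pairing is scrambled, which is the behavior the contrastive pre-training stage relies on.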

Use with Transformers
import requests
from PIL import Image
import torch

from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-large-patch14")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-large-patch14")

url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
texts = [["a photo of a cat", "a photo of a dog"]]
inputs = processor(text=texts, images=image, return_tensors="pt")
# Inference only, so gradient tracking is not needed
with torch.no_grad():
    outputs = model(**inputs)

# Target image sizes (height, width) to rescale box predictions [batch_size, 2]
# PIL's image.size is (width, height), hence the reversal
target_sizes = torch.Tensor([image.size[::-1]])
# Convert raw outputs (bounding boxes and class logits) to per-image boxes, scores, and labels
results = processor.post_process_object_detection(outputs=outputs, threshold=0.1, target_sizes=target_sizes)

i = 0  # Retrieve predictions for the first image for the corresponding text queries
text = texts[i]
boxes, scores, labels = results[i]["boxes"], results[i]["scores"], results[i]["labels"]

# Print detected objects and rescaled box coordinates
for box, score, label in zip(boxes, scores, labels):
    box = [round(coord, 2) for coord in box.tolist()]
    print(f"Detected {text[label]} with confidence {round(score.item(), 3)} at location {box}")
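The rescaling step in the example above can be pictured with a small standalone sketch. It assumes the raw predictions are normalized boxes in (center_x, center_y, width, height) form, which this card does not state, and converts one such box to absolute (x_min, y_min, x_max, y_max) pixel coordinates. The helper names and the 640x480 image size are hypothetical.

```python
def center_to_corners(box):
    """Convert (cx, cy, w, h) to (x_min, y_min, x_max, y_max)."""
    cx, cy, w, h = box
    return [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]

def rescale_box(box, target_size):
    """Scale a normalized center-format box to pixels; target_size is (height, width)."""
    height, width = target_size
    x0, y0, x1, y1 = center_to_corners(box)
    return [x0 * width, y0 * height, x1 * width, y1 * height]

pil_size = (640, 480)         # PIL reports image.size as (width, height)
target_size = pil_size[::-1]  # reversed to (height, width), as in the example above
box = rescale_box([0.5, 0.5, 0.5, 0.25], target_size)
print(box)
```

Passing the size in (width, height) order instead would silently stretch boxes along the wrong axis for any non-square image, which is why the reversal matters.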
Model Use
Intended Use

The model is intended as a research output for research communities. We hope that this model will enable researchers to better understand and explore zero-shot, text-conditioned object detection. We also hope it can be used for interdisciplinary studies of the potential impact of such models, especially in areas that commonly require identifying objects whose label is unavailable during training.

Primary intended uses

The primary intended users of these models are AI researchers.

We primarily imagine the model will be used by researchers to better understand robustness, generalization, and other capabilities, biases, and constraints of computer vision models.

Data

The CLIP backbone of the model was trained on publicly available image-caption data. This was done through a combination of crawling a handful of websites and using commonly used pre-existing image datasets such as YFCC100M. A large portion of the data comes from our crawling of the internet. This means that the data is more representative of people and societies most connected to the internet. The prediction heads of OWL-ViT, along with the CLIP backbone, are fine-tuned on publicly available object detection datasets such as COCO and OpenImages.

BibTeX entry and citation info
@article{minderer2022simple,
  title={Simple Open-Vocabulary Object Detection with Vision Transformers},
  author={Matthias Minderer and Alexey Gritsenko and Austin Stone and Maxim Neumann and Dirk Weissenborn and Alexey Dosovitskiy and Aravindh Mahendran and Anurag Arnab and Mostafa Dehghani and Zhuoran Shen and Xiao Wang and Xiaohua Zhai and Thomas Kipf and Neil Houlsby},
  journal={arXiv preprint arXiv:2205.06230},
  year={2022},
}

License

Apache 2.0: https://choosealicense.com/licenses/apache-2.0

Model URL

https://huggingface.co/google/owlvit-large-patch14
