This project aims to provide a better Chinese CLIP model. The training data consists of publicly accessible image URLs paired with Chinese text descriptions, totaling 400 million image-text pairs. After screening, we used 100 million of these pairs for training.
This project is produced by the QQ-ARC Joint Lab, Tencent PCG. For more detailed information, please refer to the main page of the QA-CLIP project. We have also open-sourced our code on GitHub, QA-CLIP, and welcome your stars!
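For a quick start, below is a minimal usage sketch. It assumes the released checkpoint can be loaded through the Chinese-CLIP classes in Hugging Face transformers and is published under the repo id TencentARC/QA-CLIP-ViT-B-16; the image path is a placeholder. See the project page for the officially supported API.

```python
import torch
from PIL import Image
from transformers import ChineseCLIPModel, ChineseCLIPProcessor

# Assumption: the checkpoint is compatible with the Chinese-CLIP
# architecture in transformers; the repo id matches this model card.
model = ChineseCLIPModel.from_pretrained("TencentARC/QA-CLIP-ViT-B-16")
processor = ChineseCLIPProcessor.from_pretrained("TencentARC/QA-CLIP-ViT-B-16")

# Encode an image into a normalized embedding (hypothetical local path).
image = Image.open("example.jpg")
image_inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    image_features = model.get_image_features(**image_inputs)
image_features = image_features / image_features.norm(dim=-1, keepdim=True)

# Encode a Chinese caption into a normalized embedding.
text_inputs = processor(text=["一只猫"], return_tensors="pt", padding=True)
with torch.no_grad():
    text_features = model.get_text_features(**text_inputs)
text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# Dot product of normalized embeddings = cosine similarity score.
similarity = image_features @ text_features.T
print(similarity)
```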
Results
We conducted zero-shot tests for image-text retrieval on the MUGE Retrieval, Flickr30K-CN, and COCO-CN datasets. For the zero-shot image classification task, we tested on the ImageNet dataset. The test results are shown in the table below:
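For context on how such zero-shot numbers are typically produced with a CLIP-style model, here is a hedged sketch: encode the image together with one text per candidate class (or caption), softmax the image-text similarity logits, and take the highest-scoring text as the prediction. The model id, image path, and Chinese prompts below are illustrative assumptions, not the official evaluation setup.

```python
import torch
from PIL import Image
from transformers import ChineseCLIPModel, ChineseCLIPProcessor

model = ChineseCLIPModel.from_pretrained("TencentARC/QA-CLIP-ViT-B-16")
processor = ChineseCLIPProcessor.from_pretrained("TencentARC/QA-CLIP-ViT-B-16")

# Illustrative Chinese class prompts (not the official evaluation prompts).
candidates = ["一张猫的照片", "一张狗的照片", "一张飞机的照片"]
image = Image.open("example.jpg")  # hypothetical local image path

inputs = processor(text=candidates, images=image,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image: temperature-scaled image-text cosine similarities,
# shape (1, num_texts); softmax gives a distribution over the candidates.
probs = outputs.logits_per_image.softmax(dim=-1)
prediction = candidates[probs.argmax(dim=-1).item()]
print(probs, prediction)
```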
Flickr30K-CN Zero-shot Retrieval (Official Test Set):