```bash
git pull
pip install -e .

# if you see import errors when you upgrade, try running the command below (without the leading #)
# pip install flash-attn --no-build-isolation --no-cache-dir
```
```python
from tinyllava.model.builder import load_pretrained_model
from tinyllava.mm_utils import get_model_name_from_path
from tinyllava.eval.run_tiny_llava import eval_model

model_path = "bczhou/TinyLLaVA-3.1B"
prompt = "What are the things I should be cautious about when I visit here?"
image_file = "https://llava-vl.github.io/static/images/view.jpg"

args = type('Args', (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": prompt,
    "conv_mode": "phi",
    "image_file": image_file,
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512
})()

eval_model(args)
```
**Important:** We use a different `conv_mode` for different models. Replace the `conv_mode` in `args` according to this table:
| model | conv_mode |
|---------------- |----------- |
| TinyLLaVA-3.1B | phi |
| TinyLLaVA-2.0B | phi |
| TinyLLaVA-1.5B | v1 |
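
If you script over several checkpoints, a small lookup table keeps `conv_mode` in sync with the table above. This is an illustrative sketch, not part of the TinyLLaVA API; the `bczhou/` repo prefix is confirmed only for the 3.1B checkpoint and assumed for the others:

```python
# Illustrative helper, not part of the TinyLLaVA codebase. The "bczhou/"
# prefix is confirmed for the 3.1B checkpoint and assumed for the rest.
CONV_MODES = {
    "bczhou/TinyLLaVA-3.1B": "phi",
    "bczhou/TinyLLaVA-2.0B": "phi",
    "bczhou/TinyLLaVA-1.5B": "v1",
}

model_path = "bczhou/TinyLLaVA-3.1B"
conv_mode = CONV_MODES[model_path]  # -> "phi"
```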
## Evaluation

To ensure reproducibility, we evaluate the models with greedy decoding.
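Concretely, greedy decoding means disabling sampling and using a single beam, which is what `temperature=0` and `num_beams=1` in the quick-start snippet amount to. A minimal sketch with a plain Hugging Face `generate` call (`greedy_generate` is a hypothetical helper, not the project's evaluation harness):

```python
import torch
from transformers import PreTrainedModel

def greedy_generate(model: PreTrainedModel, input_ids: torch.Tensor) -> torch.Tensor:
    # Greedy decoding: no sampling, a single beam, so repeated runs on the
    # same inputs produce identical outputs (mirrors temperature=0 /
    # num_beams=1 in the quick-start args above).
    return model.generate(
        input_ids,
        do_sample=False,   # disable sampling
        num_beams=1,       # plain greedy search
        max_new_tokens=512,
    )
```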
In our paper, we used two different datasets, the LLaVA dataset and the ShareGPT4V dataset, and compared their differences. In this section, we provide information on data preparation.
### Pretraining Images

- LLaVA: The pretraining images of LLaVA are from the 558K subset of the LAION-CC-SBU dataset.
- ShareGPT4V: The pretraining images of ShareGPT4V are a mixture of the 558K LAION-CC-SBU subset, the SAM dataset, and the COCO dataset.
### Pretraining Annotations

- LLaVA: The pretraining annotations of LLaVA are here.
- ShareGPT4V: The pretraining annotations of ShareGPT4V are here.
### SFT Images & Annotations

The two SFT datasets are largely the same, except that the 23K detailed-description data in LLaVA-1.5-SFT is replaced with detailed captions randomly sampled from the 100K ShareGPT4V data.
- SAM: This dataset is collected by Meta. Download: images. We only use 000000~000050.tar for now. If you just want to use ShareGPT4V for SFT, you can quickly download 9K images from here.
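
For reference, the shard names in that 000000~000050 range can be enumerated programmatically. The sketch below is illustrative only; `SAM_BASE_URL` is a hypothetical placeholder, and the real download URLs must come from the SAM release page linked above:

```python
# Sketch only: enumerate the tar shard names in the 000000~000050 range.
# SAM_BASE_URL is a hypothetical placeholder; take the real download URLs
# from Meta's SAM release page linked above.
SAM_BASE_URL = "https://example.com/sam"

shard_urls = [f"{SAM_BASE_URL}/{i:06d}.tar" for i in range(51)]
print(f"{len(shard_urls)} shards, from {shard_urls[0]} to {shard_urls[-1]}")
```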
The model supports multi-image and multi-prompt generation. When using the model, make sure to follow the correct prompt template (`USER: <image>xxx\nASSISTANT:`), where the `<image>` token is a placeholder special token for image embeddings.
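
As an illustration, a LLaVA-style checkpoint such as `bczhou/tiny-llava-v1-hf` can be driven through the Transformers `image-to-text` pipeline with this template. The snippet below is a sketch of that standard usage pattern, not an excerpt from this repository:

```python
import requests
from PIL import Image
from transformers import pipeline

pipe = pipeline("image-to-text", model="bczhou/tiny-llava-v1-hf")

url = "https://llava-vl.github.io/static/images/view.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Follow the template: USER: <image>xxx\nASSISTANT:
prompt = "USER: <image>\nWhat does this image show?\nASSISTANT:"
outputs = pipe(image, prompt=prompt, generate_kwargs={"max_new_tokens": 200})
print(outputs[0]["generated_text"])
```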
If you find our paper and code useful in your research, please consider giving a star :star: and citation :pencil:.
```bibtex
@misc{zhou2024tinyllava,
      title={TinyLLaVA: A Framework of Small-scale Large Multimodal Models},
      author={Baichuan Zhou and Ying Hu and Xi Weng and Junlong Jia and Jie Luo and Xien Liu and Ji Wu and Lei Huang},
      year={2024},
      eprint={2402.14289},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```
## ❤️ Community efforts
- Our codebase is built upon the LLaVA project. Great work!
- Our project uses data from the ShareGPT4V project. Great work!