cuuupid / minicpm-llama3-v-2.5

MiniCPM-Llama3-V 2.5, a new SOTA open-source VLM that surpasses GPT-4V-1106 and Phi-128k on a number of benchmarks.

replicate.com
Total runs: 127
24-hour runs: 0
7-day runs: 0
30-day runs: 0
GitHub
Last updated: June 04, 2024


Readme

All credit to OpenBMB; check this model out on GitHub!

MiniCPM-Llama3-V 2.5

MiniCPM-Llama3-V 2.5 is the latest model in the MiniCPM-V series. The model is built on SigLip-400M and Llama3-8B-Instruct with a total of 8B parameters. It exhibits a significant performance improvement over MiniCPM-V 2.0. Notable features of MiniCPM-Llama3-V 2.5 include:

  • 🔥 Leading Performance. MiniCPM-Llama3-V 2.5 has achieved an average score of 65.1 on OpenCompass, a comprehensive evaluation over 11 popular benchmarks. With only 8B parameters, it surpasses widely used proprietary models like GPT-4V-1106, Gemini Pro, Claude 3 and Qwen-VL-Max and greatly outperforms other Llama 3-based MLLMs.

  • 💪 Strong OCR Capabilities. MiniCPM-Llama3-V 2.5 can process images with any aspect ratio and up to 1.8 million pixels (e.g., 1344x1344), achieving a 700+ score on OCRBench and surpassing proprietary models such as GPT-4o, GPT-4V-0409, Qwen-VL-Max and Gemini Pro. Based on recent user feedback, MiniCPM-Llama3-V 2.5 now offers enhanced full-text OCR extraction, table-to-markdown conversion, and other high-utility capabilities, and has further strengthened its instruction-following and complex-reasoning abilities, improving the multimodal interaction experience.

  • 🏆 Trustworthy Behavior. Leveraging the latest RLAIF-V method (the newest technique in the RLHF-V [CVPR'24] series), MiniCPM-Llama3-V 2.5 exhibits more trustworthy behavior. It achieves a 10.3% hallucination rate on Object HalBench, lower than GPT-4V-1106 (13.6%), the best performance within the open-source community. Data released.

  • 🌏 Multilingual Support. Thanks to the strong multilingual capabilities of Llama 3 and the cross-lingual generalization technique from VisCPM, MiniCPM-Llama3-V 2.5 extends its bilingual (Chinese-English) multimodal capabilities to over 30 languages, including German, French, Spanish, Italian, Korean, etc. All Supported Languages.

  • 🚀 Efficient Deployment. MiniCPM-Llama3-V 2.5 systematically employs model quantization, CPU optimizations, NPU optimizations and compilation optimizations, achieving high-efficiency deployment on end-side devices. For mobile phones with Qualcomm chips, we have integrated the NPU acceleration framework QNN into llama.cpp for the first time. After systematic optimization, MiniCPM-Llama3-V 2.5 has realized a 150x acceleration in end-side MLLM image encoding and a 3x speedup in language decoding.

  • 💫 Easy Usage. MiniCPM-Llama3-V 2.5 can be easily used in various ways: (1) llama.cpp and ollama support for efficient CPU inference on local devices, (2) GGUF format quantized models in 16 sizes, (3) efficient LoRA fine-tuning with only 2 V100 GPUs, (4) streaming output, (5) quick local WebUI demo setup with Gradio and Streamlit, and (6) interactive demos on HuggingFace Spaces.
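As a quick sanity check on the resolution claim above: 1344x1344 is 1,806,336 pixels, i.e. roughly 1.8 million. The sketch below is illustrative only; the helper function and its name are assumptions for the arithmetic, not the model's actual image-preprocessing code.

```python
# "Any aspect ratio, up to 1.8 million pixels": 1344 x 1344 is the stated example.
MAX_PIXELS = 1344 * 1344  # 1,806,336 pixels, ~1.8M

def fits_pixel_budget(width: int, height: int, max_pixels: int = MAX_PIXELS) -> bool:
    """Illustrative check: would an image of this size fit the stated pixel budget?"""
    return width * height <= max_pixels

print(MAX_PIXELS)                      # 1806336
print(fits_pixel_budget(1344, 1344))   # square image exactly at the limit
print(fits_pixel_budget(448, 3584))    # extreme aspect ratio, fewer pixels, still fits
print(fits_pixel_budget(2048, 2048))   # over budget
```

The point of the "any aspect ratio" phrasing is that the budget is on total pixels, not on a fixed square resolution, so very wide or tall images (e.g. receipts, screenshots) are handled without forced square resizing.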

Citation

If you find our model/code/paper helpful, please consider citing our papers 📝 and starring us ⭐️!

@article{yu2023rlhf,
  title={RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback},
  author={Yu, Tianyu and Yao, Yuan and Zhang, Haoye and He, Taiwen and Han, Yifeng and Cui, Ganqu and Hu, Jinyi and Liu, Zhiyuan and Zheng, Hai-Tao and Sun, Maosong and others},
  journal={arXiv preprint arXiv:2312.00849},
  year={2023}
}
@article{viscpm,
  title={Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages},
  author={Jinyi Hu and Yuan Yao and Chongyi Wang and Shan Wang and Yinxu Pan and Qianyu Chen and Tianyu Yu and Hanghao Wu and Yue Zhao and Haoye Zhang and Xu Han and Yankai Lin and Jiao Xue and Dahai Li and Zhiyuan Liu and Maosong Sun},
  journal={arXiv preprint arXiv:2308.12038},
  year={2023}
}
@article{xu2024llava-uhd,
  title={{LLaVA-UHD}: an LMM Perceiving Any Aspect Ratio and High-Resolution Images},
  author={Xu, Ruyi and Yao, Yuan and Guo, Zonghao and Cui, Junbo and Ni, Zanlin and Ge, Chunjiang and Chua, Tat-Seng and Liu, Zhiyuan and Huang, Gao},
  journal={arXiv preprint arXiv:2403.11703},
  year={2024}
}
@article{yu2024rlaifv,
  title={RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness}, 
  author={Yu, Tianyu and Zhang, Haoye and Yao, Yuan and Dang, Yunkai and Chen, Da and Lu, Xiaoman and Cui, Ganqu and He, Taiwen and Liu, Zhiyuan and Chua, Tat-Seng and Sun, Maosong},
  journal={arXiv preprint arXiv:2405.17220},
  year={2024}
}

Pricing of minicpm-llama3-v-2.5 on replicate.com

Run time and cost

This model runs on Nvidia T4 GPU hardware. There are not yet enough runs of this model to provide performance information.


More Information About the minicpm-llama3-v-2.5 Model on replicate.com

For the minicpm-llama3-v-2.5 license, visit:

https://github.com/OpenBMB/MiniCPM-V/blob/main/LICENSE

minicpm-llama3-v-2.5 replicate.com

minicpm-llama3-v-2.5 on replicate.com is an AI model that provides the capabilities of MiniCPM-Llama3-V 2.5 (a new SOTA open-source VLM that surpasses GPT-4V-1106 and Phi-128k on a number of benchmarks) and can be used instantly through cuuupid's minicpm-llama3-v-2.5 deployment. replicate.com supports a free trial of the model as well as paid usage, and the model can be called through an API from Node.js, Python, or plain HTTP.
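Calling the model over plain HTTP follows Replicate's standard prediction format: a POST to the predictions endpoint with a model version hash and an `input` object. The sketch below only builds the JSON request body and does not send it; the input field names (`image`, `prompt`) and the version placeholder are assumptions about this deployment's schema, so check the model page for the real values.

```python
import json

# Replicate's standard prediction endpoint; requests need an
# "Authorization: Bearer <REPLICATE_API_TOKEN>" header when actually sent.
API_URL = "https://api.replicate.com/v1/predictions"

def build_prediction_request(version: str, image_url: str, prompt: str) -> str:
    """Build the JSON body for a Replicate prediction request.
    The `image`/`prompt` input names are assumed for this model."""
    body = {
        "version": version,
        "input": {"image": image_url, "prompt": prompt},
    }
    return json.dumps(body)

payload = build_prediction_request(
    "VERSION_HASH_GOES_HERE",  # copy the actual version hash from the model page
    "https://example.com/receipt.png",
    "Transcribe all text in this image.",
)
print(payload)
```

In practice you would send this body with an HTTP client, or skip the manual request entirely and use Replicate's official Python or Node.js client libraries, which wrap the same endpoint.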

minicpm-llama3-v-2.5 replicate.com URL

https://replicate.com/cuuupid/minicpm-llama3-v-2.5

cuuupid minicpm-llama3-v-2.5 online free

replicate.com is an online trial and API platform that integrates minicpm-llama3-v-2.5's capabilities, including API services, and provides a free online trial of minicpm-llama3-v-2.5. You can try minicpm-llama3-v-2.5 online for free via the link below.

cuuupid minicpm-llama3-v-2.5 free online trial URL on replicate.com:

https://replicate.com/cuuupid/minicpm-llama3-v-2.5

minicpm-llama3-v-2.5 install

minicpm-llama3-v-2.5 is an open-source model that any user can find on GitHub and install for free. At the same time, replicate.com provides a ready-to-use deployment of minicpm-llama3-v-2.5, so users can debug and trial the model directly on replicate.com, with free API access as well.

minicpm-llama3-v-2.5 install URL on replicate.com:

https://replicate.com/cuuupid/minicpm-llama3-v-2.5

minicpm-llama3-v-2.5 install URL on GitHub:

https://github.com/OpenBMB/MiniCPM-V


Other API from cuuupid

  • Best-in-class clothing virtual try on in the wild (non-commercial use only)
    Total runs: 581.3K · Run growth: 65.2K · Growth rate: 11.26% · Updated: August 24, 2024

  • Embed text with Qwen2-7b-Instruct
    Total runs: 337.6K · Run growth: 155.8K · Growth rate: 46.48% · Updated: August 06, 2024

  • GLM-4V is a multimodal model released by Tsinghua University that is competitive with GPT-4o and establishes a new SOTA on several benchmarks, including OCR.
    Total runs: 76.9K · Run growth: 2.9K · Growth rate: 3.77% · Updated: July 02, 2024

  • Microsoft's tool to convert Office documents, PDFs, images, audio, and more to LLM-ready markdown.
    Total runs: 3.8K · Run growth: 3.1K · Growth rate: 85.83% · Updated: January 17, 2025

  • Convert scanned or electronic documents to markdown, very very very fast
    Total runs: 2.3K · Run growth: 0 · Growth rate: 0.00% · Updated: December 07, 2023

  • Generate high quality videos from a prompt
    Total runs: 1.7K · Run growth: 100 · Growth rate: 5.88% · Updated: August 27, 2024

  • Flux finetuned for black and white line art.
    Total runs: 1.4K · Run growth: 100 · Growth rate: 7.14% · Updated: August 23, 2024

  • SDXL finetuned on line art
    Total runs: 1.1K · Run growth: 0 · Growth rate: 0.00% · Updated: June 05, 2024

  • Translate audio while keeping the original style, pronunciation and tone of your original audio.
    Total runs: 767 · Run growth: 70 · Growth rate: 9.13% · Updated: December 06, 2023

  • SOTA open-source model for chatting with videos and the newest model in the Qwen family
    Total runs: 448 · Run growth: 21 · Growth rate: 4.69% · Updated: August 31, 2024

  • F5-TTS, a new state-of-the-art in open source voice cloning
    Total runs: 171 · Run growth: 0 · Growth rate: 0.00% · Updated: October 14, 2024

  • Zonos-v0.1 beta, a SOTA text-to-speech Transformer model with extraordinary expressive range, built by Zyphra.
    Total runs: 164 · Run growth: 93 · Growth rate: 56.71% · Updated: February 11, 2025

  • Finetuned E5 embeddings for instruct based on Mistral.
    Total runs: 131 · Run growth: 0 · Growth rate: 0.00% · Updated: February 03, 2024

  • Llama-3-8B finetuned with ReFT to hyperfocus on New Jersey, the Garden State, the best state, the only state!
    Total runs: 105 · Run growth: 0 · Growth rate: 0.00% · Updated: June 03, 2024

  • make meow emojis!
    Total runs: 68 · Run growth: 0 · Growth rate: 0.00% · Updated: January 11, 2024

  • An example using Garden State Llama to ReFT on the Golden Gate bridge.
    Total runs: 30 · Run growth: 0 · Growth rate: 0.00% · Updated: June 03, 2024