lucataco / florence-2-base

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

replicate.com
Total runs: 26.9K
24-hour runs: 600
7-day runs: 2.5K
30-day runs: 14.1K
Github
Model's Last Updated: June 26 2024

Readme

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Model Summary

This Hub repository contains a Hugging Face transformers implementation of the Florence-2 model from Microsoft.

Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks. Florence-2 can interpret simple text prompts to perform tasks like captioning, object detection, and segmentation. It leverages our FLD-5B dataset, containing 5.4 billion annotations across 126 million images, to master multi-task learning. The model’s sequence-to-sequence architecture enables it to excel in both zero-shot and fine-tuned settings, proving to be a competitive vision foundation model.

Resources and Technical Documentation:
+ Florence-2 technical report
+ Jupyter Notebook for inference and visualization of the Florence-2-large model

| Model | Model size | Model Description |
| --- | --- | --- |
| Florence-2-base [HF] | 0.23B | Pretrained model with FLD-5B |
| Florence-2-large [HF] | 0.77B | Pretrained model with FLD-5B |
| Florence-2-base-ft [HF] | 0.23B | Finetuned model on a collection of downstream tasks |
| Florence-2-large-ft [HF] | 0.77B | Finetuned model on a collection of downstream tasks |

Tasks

This model performs different tasks depending on the prompt it is given.
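
The task snippets in this section call a `run_example` helper that this page does not define. A minimal sketch follows, assuming the Hugging Face `transformers` loading path with `trust_remote_code=True`, the Hub id `microsoft/Florence-2-base`, and the `post_process_generation` method supplied by the Florence-2 remote processor code. The `load_florence2` and `build_prompt` names and the explicit `model`, `processor`, and `image` parameters are choices made for this sketch, not part of the original page.

```python
def build_prompt(task_prompt, text_input=None):
    """Join the task token with optional extra text (e.g. a caption)."""
    return task_prompt if text_input is None else task_prompt + text_input


def load_florence2(model_id="microsoft/Florence-2-base"):
    """Load model and processor (assumed Hub id; downloads weights on first use)."""
    # Imported lazily so the pure helpers here work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoProcessor

    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    return model, processor


def run_example(model, processor, image, task_prompt, text_input=None):
    """Run one task on a PIL-style image and return the parsed result dict."""
    prompt = build_prompt(task_prompt, text_input)
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
    )
    # Special tokens are kept because the post-processor parses them.
    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return processor.post_process_generation(
        generated_text, task=task_prompt, image_size=(image.width, image.height)
    )
```

To match the one-argument calls used in the snippets, one option is to bind the loaded objects once, for example `run = functools.partial(run_example, model, processor, image)`, and then call `run("<CAPTION>")`.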

Florence-2 can perform the following tasks:

Caption
prompt = "<CAPTION>"
run_example(prompt)
Detailed Caption
prompt = "<DETAILED_CAPTION>"
run_example(prompt)
More Detailed Caption
prompt = "<MORE_DETAILED_CAPTION>"
run_example(prompt)
Caption to Phrase Grounding

The caption to phrase grounding task requires an additional text input, i.e. the caption.

Caption to phrase grounding results format: {'<CAPTION_TO_PHRASE_GROUNDING>': {'bboxes': [[x1, y1, x2, y2], ...], 'labels': ['', '', ...]}}

task_prompt = "<CAPTION_TO_PHRASE_GROUNDING>"
results = run_example(task_prompt, text_input="A green car parked in front of a yellow building.")
Object Detection

OD results format: {'<OD>': {'bboxes': [[x1, y1, x2, y2], ...], 'labels': ['label1', 'label2', ...]}}

prompt = "<OD>"
run_example(prompt)
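
The detection-style results above share one shape: parallel `bboxes` and `labels` lists under the task key. A small sketch of consuming that shape is shown here; the helper names `pair_labels_with_boxes` and `box_area` are hypothetical, and the sample dict is made up to match the documented format rather than taken from real model output.

```python
def pair_labels_with_boxes(parsed, task="<OD>"):
    """Return [(label, [x1, y1, x2, y2]), ...] from a parsed result dict."""
    payload = parsed[task]
    labels, bboxes = payload["labels"], payload["bboxes"]
    if len(labels) != len(bboxes):
        raise ValueError("labels and bboxes must have the same length")
    return list(zip(labels, bboxes))


def box_area(bbox):
    """Area of an [x1, y1, x2, y2] box; zero if the corners are inverted."""
    x1, y1, x2, y2 = bbox
    return max(0, x2 - x1) * max(0, y2 - y1)


# Made-up sample in the documented format, not real model output.
sample = {"<OD>": {"bboxes": [[10, 20, 110, 70], [0, 0, 5, 5]], "labels": ["car", "door"]}}
for label, bbox in pair_labels_with_boxes(sample):
    print(label, bbox, box_area(bbox))
```

The same helper should work for the other tasks that return `bboxes` and `labels` by passing their task token as `task`.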
Dense Region Caption

Dense region caption results format: {'<DENSE_REGION_CAPTION>': {'bboxes': [[x1, y1, x2, y2], ...], 'labels': ['label1', 'label2', ...]}}

prompt = "<DENSE_REGION_CAPTION>"
run_example(prompt)
Region proposal

Region proposal results format: {'<REGION_PROPOSAL>': {'bboxes': [[x1, y1, x2, y2], ...], 'labels': ['', '', ...]}}

prompt = "<REGION_PROPOSAL>"
run_example(prompt)
OCR
prompt = "<OCR>"
run_example(prompt)
OCR with Region

OCR with region output format: {'<OCR_WITH_REGION>': {'quad_boxes': [[x1, y1, x2, y2, x3, y3, x4, y4], ...], 'labels': ['text1', ...]}}

prompt = "<OCR_WITH_REGION>"
run_example(prompt)
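
Unlike the other tasks, OCR with region returns eight-value quadrilaterals rather than four-value boxes. If downstream code expects axis-aligned boxes, one possible conversion is the enclosing rectangle of the four corners. The sketch below assumes the documented `quad_boxes` and `labels` keys; `quad_to_bbox` and `ocr_regions` are hypothetical helper names.

```python
def quad_to_bbox(quad):
    """Convert [x1, y1, ..., x4, y4] to the enclosing [x_min, y_min, x_max, y_max]."""
    if len(quad) != 8:
        raise ValueError("expected 8 coordinates (4 corner points)")
    xs, ys = quad[0::2], quad[1::2]  # even indices are x, odd indices are y
    return [min(xs), min(ys), max(xs), max(ys)]


def ocr_regions(parsed, task="<OCR_WITH_REGION>"):
    """Return [(text, [x_min, y_min, x_max, y_max]), ...] from a parsed result."""
    payload = parsed[task]
    return [
        (text, quad_to_bbox(quad))
        for text, quad in zip(payload["labels"], payload["quad_boxes"])
    ]
```

The enclosing rectangle loses rotation information, so it is only a reasonable choice when the text is close to horizontal.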

For more detailed examples, please refer to the notebook.

Benchmarks

Florence-2 Zero-shot performance

The following table presents the zero-shot performance of generalist vision foundation models on image captioning and object detection evaluation tasks. These models have not been exposed to the training data of the evaluation tasks during their training phase.

| Method | #params | COCO Cap. test CIDEr | NoCaps val CIDEr | TextCaps val CIDEr | COCO Det. val2017 mAP |
| --- | --- | --- | --- | --- | --- |
| Flamingo | 80B | 84.3 | - | - | - |
| Florence-2-base | 0.23B | 133.0 | 118.7 | 70.1 | 34.7 |
| Florence-2-large | 0.77B | 135.6 | 120.8 | 72.8 | 37.5 |

The following table continues the comparison with performance on other vision-language evaluation tasks.

| Method | Flickr30k test R@1 | Refcoco val Accuracy | Refcoco test-A Accuracy | Refcoco test-B Accuracy | Refcoco+ val Accuracy | Refcoco+ test-A Accuracy | Refcoco+ test-B Accuracy | Refcocog val Accuracy | Refcocog test Accuracy | Refcoco RES val mIoU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Kosmos-2 | 78.7 | 52.3 | 57.4 | 47.3 | 45.5 | 50.7 | 42.2 | 60.6 | 61.7 | - |
| Florence-2-base | 83.6 | 53.9 | 58.4 | 49.7 | 51.5 | 56.4 | 47.9 | 66.3 | 65.1 | 34.6 |
| Florence-2-large | 84.4 | 56.3 | 61.6 | 51.4 | 53.6 | 57.9 | 49.9 | 68.0 | 67.0 | 35.8 |

Florence-2 finetuned performance

We finetune Florence-2 models with a collection of downstream tasks, resulting in two generalist models, Florence-2-base-ft and Florence-2-large-ft, that can conduct a wide range of downstream tasks.

The table below compares the performance of specialist and generalist models on various captioning and Visual Question Answering (VQA) tasks. Specialist models are fine-tuned specifically for each task, whereas generalist models are fine-tuned in a task-agnostic manner across all tasks. The symbol “▲” indicates the usage of external OCR as input.

| Method | # Params | COCO Caption Karpathy test CIDEr | NoCaps val CIDEr | TextCaps val CIDEr | VQAv2 test-dev Acc | TextVQA test-dev Acc | VizWiz VQA test-dev Acc |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Specialist Models | | | | | | | |
| CoCa | 2.1B | 143.6 | 122.4 | - | 82.3 | - | - |
| BLIP-2 | 7.8B | 144.5 | 121.6 | - | 82.2 | - | - |
| GIT2 | 5.1B | 145.0 | 126.9 | 148.6 | 81.7 | 67.3 | 71.0 |
| Flamingo | 80B | 138.1 | - | - | 82.0 | 54.1 | 65.7 |
| PaLI | 17B | 149.1 | 127.0 | 160.0▲ | 84.3 | 58.8 / 73.1▲ | 71.6 / 74.4▲ |
| PaLI-X | 55B | 149.2 | 126.3 | 147.0 / 163.7▲ | 86.0 | 71.4 / 80.8▲ | 70.9 / 74.6▲ |
| Generalist Models | | | | | | | |
| Unified-IO | 2.9B | - | 100.0 | - | 77.9 | - | 57.4 |
| Florence-2-base-ft | 0.23B | 140.0 | 116.7 | 143.9 | 79.7 | 63.6 | 63.6 |
| Florence-2-large-ft | 0.77B | 143.3 | 124.9 | 151.1 | 81.7 | 73.5 | 72.6 |

| Method | # Params | COCO Det. val2017 mAP | Flickr30k test R@1 | RefCOCO val Accuracy | RefCOCO test-A Accuracy | RefCOCO test-B Accuracy | RefCOCO+ val Accuracy | RefCOCO+ test-A Accuracy | RefCOCO+ test-B Accuracy | RefCOCOg val Accuracy | RefCOCOg test Accuracy | RefCOCO RES val mIoU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Specialist Models | | | | | | | | | | | | |
| SeqTR | - | - | - | 83.7 | 86.5 | 81.2 | 71.5 | 76.3 | 64.9 | 74.9 | 74.2 | - |
| PolyFormer | - | - | - | 90.4 | 92.9 | 87.2 | 85.0 | 89.8 | 78.0 | 85.8 | 85.9 | 76.9 |
| UNINEXT | 0.74B | 60.6 | - | 92.6 | 94.3 | 91.5 | 85.2 | 89.6 | 79.8 | 88.7 | 89.4 | - |
| Ferret | 13B | - | - | 89.5 | 92.4 | 84.4 | 82.8 | 88.1 | 75.2 | 85.8 | 86.3 | - |
| Generalist Models | | | | | | | | | | | | |
| UniTAB | - | - | - | 88.6 | 91.1 | 83.8 | 81.0 | 85.4 | 71.6 | 84.6 | 84.7 | - |
| Florence-2-base-ft | 0.23B | 41.4 | 84.0 | 92.6 | 94.8 | 91.5 | 86.8 | 91.7 | 82.2 | 89.8 | 82.2 | 78.0 |
| Florence-2-large-ft | 0.77B | 43.4 | 85.2 | 93.4 | 95.3 | 92.0 | 88.3 | 92.9 | 83.6 | 91.2 | 91.7 | 80.5 |

BibTex and citation info
@article{xiao2023florence,
  title={Florence-2: Advancing a unified representation for a variety of vision tasks},
  author={Xiao, Bin and Wu, Haiping and Xu, Weijian and Dai, Xiyang and Hu, Houdong and Lu, Yumao and Zeng, Michael and Liu, Ce and Yuan, Lu},
  journal={arXiv preprint arXiv:2311.06242},
  year={2023}
}

Pricing of florence-2-base replicate.com

Run time and cost

This model costs approximately $0.0051 to run on Replicate, or 196 runs per $1, but this varies depending on your inputs. It is also open source and you can run it on your own computer with Docker.
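
The 196 runs per $1 figure follows from the quoted per-run cost: 1 / 0.0051 ≈ 196.08, rounded down. A short sketch of that arithmetic, plus a rough budget estimate, is given here. The constant and function names are hypothetical, and the per-run cost is the approximate value quoted on this page, so real bills will differ with inputs.

```python
import math

COST_PER_RUN_USD = 0.0051  # approximate figure quoted on this page; varies with inputs


def runs_per_dollar(cost_per_run=COST_PER_RUN_USD):
    """Whole number of runs that fit in one dollar at the given per-run cost."""
    return math.floor(1 / cost_per_run)


def estimated_cost(num_runs, cost_per_run=COST_PER_RUN_USD):
    """Rough total in USD for num_runs, rounded to four decimal places."""
    return round(num_runs * cost_per_run, 4)


print(runs_per_dollar())
print(estimated_cost(1000))
```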

This model runs on Nvidia A40 GPU hardware. Predictions typically complete within 9 seconds. The predict time for this model varies significantly based on the inputs.

Runs of lucataco florence-2-base on replicate.com

Total runs: 26.9K
24-hour runs: 600
3-day runs: 900
7-day runs: 2.5K
30-day runs: 14.1K

More Information About florence-2-base replicate.com Model

florence-2-base replicate.com

florence-2-base is hosted on replicate.com as lucataco/florence-2-base, which serves the Florence-2 model (Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks) for immediate use. replicate.com offers a free trial of the model as well as paid use. The model can be called through the API from Node.js, Python, or HTTP.

florence-2-base replicate.com Url

https://replicate.com/lucataco/florence-2-base

lucataco florence-2-base online free

replicate.com provides both an online trial and API access for florence-2-base. You can try florence-2-base online for free by following the link below.

lucataco florence-2-base online free url in replicate.com:

https://replicate.com/lucataco/florence-2-base

florence-2-base install

florence-2-base is open source, and its code is available on GitHub for anyone to install. Alternatively, replicate.com hosts an installed copy that you can use directly for debugging and trial, including through the API.

florence-2-base install url in replicate.com:

https://replicate.com/lucataco/florence-2-base

florence-2-base install url in github:

https://github.com/lucataco/cog-florence-2-base
