We present Painter, a generalist model that takes an "image"-centric approach to in-context visual learning: the outputs of core vision tasks are redefined as images, and task prompts are specified as images as well. With this idea, training is extremely simple: it performs standard masked image modeling on images stitched from input and output image pairs. This makes the model capable of performing tasks conditioned on visible image patches. During inference, we can therefore supply a pair of input and output images from the same task as the input condition, indicating which task to perform. Examples of in-context inference are illustrated in the figure above, which shows seven in-domain examples (top seven rows) and three out-of-domain examples (bottom three rows).
Without bells and whistles, our generalist Painter can achieve competitive performance compared to well-established task-specific models, on seven representative vision tasks ranging from high-level visual understanding to low-level image processing.
In addition, Painter significantly outperforms recent generalist models on several challenging tasks.
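The stitching and masking idea described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the released implementation: the patch size, the side-by-side layout, the choice to mask only output-side patches during training, and the function names `stitch`, `in_context_canvas` and `random_output_mask` are all hypothetical.

```python
import numpy as np

PATCH = 16  # hypothetical patch size; the text above does not state one


def stitch(input_img: np.ndarray, output_img: np.ndarray) -> np.ndarray:
    """Place an input image and its output image side by side (H x 2W x C)."""
    if input_img.shape != output_img.shape:
        raise ValueError("input and output images must share a shape")
    return np.concatenate([input_img, output_img], axis=1)


def in_context_canvas(prompt_in, prompt_out, query_in):
    """Build an inference canvas plus the patch mask the model should fill.

    Top row: the visible task prompt (input, output).
    Bottom row: the query input next to a blank region to be predicted.
    The layout is an assumption made for illustration.
    """
    h, w = query_in.shape[:2]
    if h % PATCH or w % PATCH:
        raise ValueError("image height and width must be multiples of PATCH")
    blank = np.zeros_like(query_in)
    canvas = np.concatenate(
        [stitch(prompt_in, prompt_out), stitch(query_in, blank)], axis=0
    )
    # One boolean per patch; True marks patches hidden from the model.
    mask = np.zeros((2 * h // PATCH, 2 * w // PATCH), dtype=bool)
    mask[h // PATCH:, w // PATCH:] = True
    return canvas, mask


def random_output_mask(h, w, ratio, rng):
    """Training-style mask for one stitched pair (assumed scheme):
    hide a random fraction of the patches on the output side only."""
    rows, out_cols = h // PATCH, w // PATCH
    n_out = rows * out_cols
    n_mask = int(round(ratio * n_out))
    hidden = np.zeros(n_out, dtype=bool)
    hidden[rng.choice(n_out, size=n_mask, replace=False)] = True
    mask = np.zeros((rows, 2 * out_cols), dtype=bool)
    mask[:, out_cols:] = hidden.reshape(rows, out_cols)
    return mask
```

In this sketch, the task is never named explicitly: swapping the prompt pair for one from another task changes what the masked region should contain, which is the sense in which the visible patches act as the condition.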
A pre-trained Painter is available at 🤗 HF link. The results on various tasks are summarized below:
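| Task | Dataset | Metric | Value |
| --- | --- | --- | --- |
| depth estimation | NYU v2 | RMSE | 0.288 |
| depth estimation | NYU v2 | A.Rel | 0.080 |
| depth estimation | NYU v2 | d1 | 0.950 |
| semantic segmentation | ADE20k | mIoU | 49.9 |
| panoptic segmentation | COCO 2017 | PQ | 43.4 |
| keypoint detection | COCO 2017 | AP | 72.1 |
| denoising | SIDD | PSNR | 38.66 |
| denoising | SIDD | SSIM | 0.954 |
| deraining | 5 datasets | PSNR | 29.42 |
| deraining | 5 datasets | SSIM | 0.867 |
| enhancement | LoL | PSNR | 22.34 |
| enhancement | LoL | SSIM | 0.872 |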
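For readers unfamiliar with the metric names, the following is a minimal sketch of the standard definitions of the depth metrics (RMSE, absolute relative error, and the δ < 1.25 accuracy usually abbreviated d1) and of PSNR. The evaluation protocol behind the reported numbers is not described here, so the valid-pixel rule (`gt > 0`), the assumption of strictly positive predictions, and the `max_val` default are assumptions, and the function names are hypothetical.

```python
import numpy as np


def depth_metrics(pred: np.ndarray, gt: np.ndarray):
    """Return (RMSE, absolute relative error, delta < 1.25 accuracy).

    Only pixels with positive ground-truth depth are scored (an assumption);
    predictions are assumed to be strictly positive.
    """
    valid = gt > 0
    p, g = pred[valid], gt[valid]
    rmse = float(np.sqrt(np.mean((p - g) ** 2)))
    abs_rel = float(np.mean(np.abs(p - g) / g))
    # A pixel counts as correct when pred and gt differ by less than a 1.25x factor.
    ratio = np.maximum(p / g, g / p)
    d1 = float(np.mean(ratio < 1.25))
    return rmse, abs_rel, d1


def psnr(pred: np.ndarray, gt: np.ndarray, max_val: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = float(np.mean((pred - gt) ** 2))
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Lower is better for RMSE and A.Rel; higher is better for d1, mIoU, PQ, AP, PSNR and SSIM.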
Citation
@article{Painter,
title={Images Speak in Images: A Generalist Painter for In-Context Visual Learning},
author={Wang, Xinlong and Wang, Wen and Cao, Yue and Shen, Chunhua and Huang, Tiejun},
journal={arXiv preprint arXiv:2212.02499},
year={2022}
}
Contact
We are hiring at all levels on the BAAI Vision Team, including full-time researchers, engineers and interns. If you are interested in working with us on foundation models, visual perception and multimodal learning, please contact Xinlong Wang ([email protected]) and Yue Cao ([email protected]).