AltDiffusion-m18 is a multilingual text-to-image generation model built on @StableDiffusion. It was developed jointly by Stability AI and the @BAAI FlagAI team (FlagAI is a sandbox-stage project of the LF AI & Data Foundation). AltDiffusion-m18 currently supports 18 languages: English, Chinese, Japanese, Thai, Korean, Hindi, Ukrainian, Arabic, Turkish, Vietnamese, Polish, Dutch, Portuguese, Italian, Spanish, German, French, and Russian.
As shown in Figure 1, training consists of two stages: concept alignment and quality improvement. We first replaced the original OpenCLIP text encoder in SD with the multilingual CLIP model AltCLIP-m18 and froze its parameters. In the first stage, we trained only the key and value projection matrices in the cross-attention layers of the UNet, at 256x256 image resolution, to align concepts between text and image. In the second stage, we trained all UNet parameters at 512x512 image resolution to improve generation quality.
Fig. 1: Illustration of AltDiffusion
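The two-stage parameter selection described above can be sketched as a filter over UNet parameter names. This is a minimal illustration, not the authors' training code: it assumes diffusers-style parameter naming, in which `attn2` denotes the cross-attention block and `to_k` / `to_v` its key and value projections. The function name `trainable_param_names` and the toy name list are hypothetical.

```python
# Sketch only: parameter names follow the diffusers naming convention (assumed),
# where "attn2" is the cross-attention block inside each transformer layer.
CROSS_ATTN_KV = ("attn2.to_k.", "attn2.to_v.")


def trainable_param_names(param_names, stage):
    """Return the UNet parameter names that receive gradients in a given stage."""
    if stage == 1:
        # Concept alignment: only cross-attention key/value projections are trained.
        return [n for n in param_names if any(tag in n for tag in CROSS_ATTN_KV)]
    if stage == 2:
        # Quality improvement: every UNet parameter is trained.
        return list(param_names)
    raise ValueError(f"unknown stage: {stage}")


# Toy names standing in for the keys of unet.named_parameters().
names = [
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_k.weight",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_q.weight",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k.weight",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_v.weight",
    "down_blocks.0.resnets.0.conv1.weight",
]

stage1 = trainable_param_names(names, stage=1)
stage2 = trainable_param_names(names, stage=2)
print(stage1)
```

With a real PyTorch model, the selected names could be applied by calling `p.requires_grad_(name in selected)` for each `name, p` in `unet.named_parameters()`. The text encoder stays frozen in both stages.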
In the first stage, we trained the model on LAION 5B-en (2.32B) from LAION 5B and on filtered LAION 5B-multi (1.8B) data for the 18 languages. In the second stage, we trained the model on LAION Aesthetics V1-en (52M) from LAION Aesthetics V1 and on filtered LAION Aesthetics V1-multi (46M) data for the 18 languages.
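As a rough illustration of restricting multilingual caption data to the supported languages, the sketch below keeps only records tagged with one of the 18 language codes. The actual filtering criteria are not described here and may have included more than a language check. The field name `lang`, the function `keep_supported`, and the sample records are hypothetical.

```python
# ISO 639-1 codes for the 18 languages listed in the model description.
SUPPORTED_LANGS = frozenset({
    "en", "zh", "ja", "th", "ko", "hi", "uk", "ar", "tr",
    "vi", "pl", "nl", "pt", "it", "es", "de", "fr", "ru",
})


def keep_supported(records, lang_key="lang"):
    """Keep caption records whose language tag is in the supported set.

    `lang_key` is a hypothetical field name; real LAION metadata may differ.
    """
    return [
        r for r in records
        if str(r.get(lang_key, "")).lower() in SUPPORTED_LANGS
    ]


samples = [
    {"caption": "a red bicycle", "lang": "en"},
    {"caption": "ein rotes Fahrrad", "lang": "de"},
    {"caption": "en röd cykel", "lang": "sv"},  # Swedish: not among the 18
]

kept = keep_supported(samples)
print(len(SUPPORTED_LANGS), [r["lang"] for r in kept])
```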
The first stage initialized all parameters except the language model from the SD v2.1 512-base-ema checkpoint and trained on LAION2B-en and LAION2B-multi for 330k steps, with a batch size of 3072 at 256x256 resolution, over approximately 8 days. The second stage started from the 330k-step checkpoint and trained on LAION Aesthetics V1-en and V1-multi for 270k steps, with a batch size of 3840 at 512x512 resolution, taking around 7 days. Training then continued from the 270k-step checkpoint for another 150k steps, with 10% of the text randomly dropped for classifier-free guidance learning, taking approximately 4 days. The teacher model of AltCLIP is OpenCLIP ViT-H-14 (version "laion2b_s32b_b79k"). We also used xFormers and efficient attention to reduce memory use and speed up training. The EMA decay is 0.9999.
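Two of the training details above, dropping 10% of the text for classifier-free guidance and keeping an EMA of the weights with decay 0.9999, can be sketched in plain Python. This is an illustrative sketch under assumptions, not the authors' implementation: replacing a dropped caption with an empty string is a common convention assumed here, and the function names `drop_captions` and `ema_update` are hypothetical.

```python
import random


def drop_captions(captions, p_drop=0.1, rng=None):
    """Replace each caption with "" with probability p_drop.

    The empty caption acts as the unconditional input (an assumption here).
    """
    rng = rng or random.Random()
    return ["" if rng.random() < p_drop else c for c in captions]


def ema_update(ema_weights, weights, decay=0.9999):
    """One EMA step: ema <- decay * ema + (1 - decay) * current."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


rng = random.Random(0)
batch = ["a cat"] * 10000
dropped = drop_captions(batch, p_drop=0.1, rng=rng)
drop_rate = dropped.count("") / len(dropped)
print(f"fraction of empty captions: {drop_rate:.3f}")  # close to 0.1

# Toy weights as plain floats; a real model would apply this per tensor.
ema = ema_update([0.0, 1.0], [1.0, 1.0])
print(ema)
```

With a decay this close to 1, each step moves the averaged weights only 0.01% of the way toward the current weights, so the average changes slowly over many steps.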
@misc{ye2023altdiffusion,
title={AltDiffusion: A Multilingual Text-to-Image Diffusion Model},
author={Fulong Ye and Guang Liu and Xinya Wu and Ledell Wu},
year={2023},
eprint={2308.09991},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
License
The model is licensed under the CreativeML Open RAIL-M license. The authors claim no rights to the outputs you generate; you are free to use them and are accountable for their use, which must not violate the provisions of this license. The license forbids you from sharing any content that violates any law, causes harm to a person, disseminates personal information intended to cause harm, spreads misinformation, or targets vulnerable groups. You can modify and use the model for commercial purposes, but you must include a copy of the same use restrictions. For the full list of restrictions, please read the license.