Our model aligns well across the languages it supports and is currently the strongest open-source version available. It retains most of the capabilities of the original Stable Diffusion and in some cases outperforms the original model.
The AltDiffusion-m9 model is backed by a multilingual CLIP model named AltCLIP-m9, which is also available in FlagAI. You can read this tutorial for more information.
If you find this work helpful, please consider citing:
@article{https://doi.org/10.48550/arxiv.2211.06679,
doi = {10.48550/ARXIV.2211.06679},
url = {https://arxiv.org/abs/2211.06679},
author = {Chen, Zhongzhi and Liu, Guang and Zhang, Bo-Wen and Ye, Fulong and Yang, Qinghong and Wu, Ledell},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
title = {AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities},
publisher = {arXiv},
year = {2022},
copyright = {arXiv.org perpetual, non-exclusive license}
}
Model Weights
The following weights are downloaded automatically from Hugging Face the first time the AltDiffusion-m9 model is run:
| Model name | Size | Description |
| --- | --- | --- |
| StableDiffusionSafetyChecker | 1.13G | Safety checker for generated images |
| AltDiffusion-m9 | 8.0G | Supports English (En), Chinese (Zh), Spanish (Es), French (Fr), Russian (Ru), Japanese (Ja), Korean (Ko), Arabic (Ar) and Italian (It) |
| AltCLIP-m9 | 3.22G | Supports English (En), Chinese (Zh), Spanish (Es), French (Fr), Russian (Ru), Japanese (Ja), Korean (Ko), Arabic (Ar) and Italian (It) |
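Because these weights are fetched on the first run, it can help to confirm beforehand that the target disk has room for them. The sketch below simply adds up the sizes listed above and compares the total with the free space reported by the operating system. It assumes that "G" means decimal gigabytes and that all three downloads are stored separately; the 10% headroom factor and the function names are illustrative choices, not part of FlagAI.

```python
import shutil

# Sizes copied from the weights table; "G" is assumed to mean decimal gigabytes.
WEIGHT_SIZES_GB = {
    "StableDiffusionSafetyChecker": 1.13,
    "AltDiffusion-m9": 8.0,
    "AltCLIP-m9": 3.22,
}


def total_download_gb(sizes=WEIGHT_SIZES_GB):
    """Sum the listed weight sizes in gigabytes."""
    return sum(sizes.values())


def has_enough_space(path=".", headroom=1.1):
    """Return True if `path` has room for all weights plus some headroom.

    The 10% headroom is an arbitrary safety margin chosen for this sketch.
    """
    free_gb = shutil.disk_usage(path).free / 1e9
    return free_gb >= total_download_gb() * headroom


print(f"Total download: {total_download_gb():.2f} GB")
print(f"Enough space in current directory: {has_enough_space('.')}")
```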
More parameters that you can adjust for predict_generate_images are listed below:
| Parameter | Type | Description |
| --- | --- | --- |
| prompt | str | The prompt text |
| out_path | str | The output path where images are saved |
| n_samples | int | Number of images to generate |
| skip_grid | bool | If True, the image-gridding step (stitching all images into one new image) is skipped |
| ddim_step | int | Number of DDIM sampling steps |
| plms | bool | If True, the PLMS sampler is used instead of the DDIM sampler |
| scale | float | Determines how strongly the prompt influences the generated images; larger values give the prompt more influence |
| H | int | Height of the image |
| W | int | Width of the image |
| C | int | Number of channels of the generated images |
| seed | int | Random seed |
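The parameters above can be gathered in one place and checked before they are passed on. The following is a minimal sketch, not FlagAI code: the GenerationArgs class, its default values, and the sanity checks are all hypothetical, and the keyword names are copied from the table, so they should be verified against the installed FlagAI version before use.

```python
from dataclasses import dataclass, asdict
from typing import Any, Dict


@dataclass
class GenerationArgs:
    """Arguments named after the parameter table above.

    All default values here are illustrative assumptions, not FlagAI's defaults.
    """

    prompt: str
    out_path: str = "./outputs"
    n_samples: int = 1
    skip_grid: bool = True
    ddim_step: int = 50
    plms: bool = False
    scale: float = 7.5
    H: int = 512
    W: int = 512
    C: int = 4
    seed: int = 0

    def to_kwargs(self) -> Dict[str, Any]:
        """Run basic sanity checks and return the values as a keyword dict."""
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        for name in ("n_samples", "ddim_step", "H", "W", "C"):
            if getattr(self, name) < 1:
                raise ValueError(f"{name} must be a positive integer")
        if self.scale < 0:
            raise ValueError("scale must be non-negative")
        return asdict(self)


args = GenerationArgs(prompt="一只带着帽子的小狗", n_samples=2, seed=1234)
kwargs = args.to_kwargs()
print(kwargs)

# With a FlagAI predictor already constructed (not shown here), the call
# would presumably look like this; verify the keyword names first:
# predictor.predict_generate_images(**kwargs)
```

Keeping the arguments in a dataclass makes it easy to log or serialize the exact settings used for each generated image, which helps when reproducing a result from its seed.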
Note: model inference requires a GPU with at least 10 GB of memory.
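For readers wondering what the scale parameter listed above does: in Stable Diffusion-style samplers it is typically the classifier-free guidance weight. Assuming AltDiffusion-m9 follows the same scheme (this README does not spell it out), each denoising step combines an unconditional noise prediction with a prompt-conditioned one. The sketch below illustrates that combination on plain Python lists; the function name is hypothetical and this is not the FlagAI implementation.

```python
from typing import List, Sequence


def apply_guidance(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    scale: float,
) -> List[float]:
    """Classifier-free guidance on two noise predictions.

    Computes eps_uncond + scale * (eps_cond - eps_uncond) element-wise.
    scale = 0 ignores the prompt, scale = 1 uses the conditional prediction
    as is, and larger values push further in the direction of the prompt.
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("both predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


uncond = [0.0, 1.0]
cond = [1.0, 3.0]
for s in (0.0, 1.0, 2.0):
    print(s, apply_guidance(uncond, cond, s))
```

This is why larger scale values make the output follow the prompt more closely, usually at the cost of some diversity.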
More Results
Multilingual examples
The same prompt in different languages generates different faces!
Chinese and English alignment ability
prompt:dark elf princess, highly detailed, d & d, fantasy, highly detailed, digital painting, trending on artstation, concept art, sharp focus, illustration, art by artgerm and greg rutkowski and fuji choko and viktoria gavrilenko and hoang lap
Generated results from English prompts
prompt:黑暗精灵公主,非常详细,幻想,非常详细,数字绘画,概念艺术,敏锐的焦点,插图
Generated results from Chinese prompts
Performance on Chinese prompts
prompt: 带墨镜的男孩肖像,充满细节,8K高清 (Portrait of a boy wearing sunglasses, full of detail, 8K HD)
prompt: 带墨镜的中国男孩肖像,充满细节,8K高清 (Portrait of a Chinese boy wearing sunglasses, full of detail, 8K HD)
Long-image generation
prompt: 一只带着帽子的小狗 (A puppy wearing a hat)
Original Stable Diffusion:
Ours:
Note: The long-image generation technology here is provided by Right Brain Technology (RightBrain AI).
Number of Model Parameters
| Module name | Number of parameters |
| --- | --- |
| AutoEncoder | 83.7M |
| Unet | 865M |
| AltCLIP-m9 TextEncoder | 859M |
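As a rough worked example, the parameter counts above can be summed and converted into an approximate weight size. The sketch assumes that "M" means one million parameters and that weights are stored at 4 bytes per parameter (fp32) or 2 bytes (fp16); neither storage format is stated in this README. It covers the weights only, so the memory actually needed for inference will be higher once activations are included.

```python
# Parameter counts in millions, copied from the table above.
PARAMS_M = {
    "AutoEncoder": 83.7,
    "Unet": 865.0,
    "AltCLIP-m9 TextEncoder": 859.0,
}


def weights_size_gb(params_millions: float, bytes_per_param: int) -> float:
    """Approximate size of the weights in decimal gigabytes."""
    return params_millions * 1e6 * bytes_per_param / 1e9


total_m = sum(PARAMS_M.values())  # 83.7 + 865 + 859 = 1807.7 million parameters
print(f"Total parameters: {total_m:.1f}M")
print(f"fp32 weights (assumed 4 bytes each): {weights_size_gb(total_m, 4):.2f} GB")
print(f"fp16 weights (assumed 2 bytes each): {weights_size_gb(total_m, 2):.2f} GB")
```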
Citation
Please cite our paper if you find it helpful :)
@misc{ye2023altdiffusion,
title={AltDiffusion: A Multilingual Text-to-Image Diffusion Model},
author={Fulong Ye and Guang Liu and Xinya Wu and Ledell Wu},
year={2023},
eprint={2308.09991},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
License
The model is licensed under the CreativeML Open RAIL-M license. The authors claim no rights to the outputs you generate; you are free to use them and are accountable for their use, which must not go against the provisions set in this license. The license forbids you from sharing any content that violates any laws, causes harm to a person, disseminates personal information intended to cause harm, spreads misinformation, or targets vulnerable groups. You may modify and use the model for commercial purposes, but a copy of the same use restrictions must be included. For the full list of restrictions, please read the license.