stabilityai/sdxl-turbo

@@ -1,135 +1,148 @@
----
-pipeline_tag: text-to-image
-inference: false
-license: other
-license_name: sai-nc-community
-license_link: https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md
----
-# SDXL-Turbo Model Card
-<!-- Provide a quick summary of what the model is/does. -->
 ![row01](output_tile.jpg)
-SDXL-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation.
-A real-time demo is available here: http://clipdrop.co/stable-diffusion-turbo
-Please note: For commercial use, please refer to https://stability.ai/license.
-## Model Details
-### Model Description
-SDXL-Turbo is a distilled version of [SDXL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0), trained for real-time synthesis.
-SDXL-Turbo is based on a novel training method called Adversarial Diffusion Distillation (ADD) (see the [technical report](https://stability.ai/research/adversarial-diffusion-distillation)), which allows sampling large-scale foundational
-image diffusion models in 1 to 4 steps at high image quality.
-This approach uses score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal and combines this with an
-adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps.
-- **Developed by:** Stability AI
-- **Funded by:** Stability AI
-- **Model type:** Generative text-to-image model
-- **Finetuned from model:** [SDXL 1.0 Base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
-### Model Sources
-For research purposes, we recommend our `generative-models` Github repository (https://github.com/Stability-AI/generative-models),
-which implements the most popular diffusion frameworks (both training and inference).
-- **Repository:** https://github.com/Stability-AI/generative-models
-- **Paper:** https://stability.ai/research/adversarial-diffusion-distillation
-- **Demo:** http://clipdrop.co/stable-diffusion-turbo
-## Evaluation
-![comparison1](image_quality_one_step.png)
-![comparison2](prompt_alignment_one_step.png)
-The charts above evaluate user preference for SDXL-Turbo over other single- and multi-step models.
-SDXL-Turbo evaluated at a single step is preferred by human voters in terms of image quality and prompt following over LCM-XL evaluated at four (or fewer) steps.
-In addition, we see that using four steps for SDXL-Turbo further improves performance.
-For details on the user study, we refer to the [research paper](https://stability.ai/research/adversarial-diffusion-distillation).
-## Uses
-### Direct Use
-The model is intended for both non-commercial and commercial usage. You can use this model for non-commercial or research purposes under this [license](https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md). Possible research areas and tasks include
-- Research on generative models.
-- Research on real-time applications of generative models.
-- Research on the impact of real-time generative models.
-- Safe deployment of models which have the potential to generate harmful content.
-- Probing and understanding the limitations and biases of generative models.
-- Generation of artworks and use in design and other artistic processes.
-- Applications in educational or creative tools.
-For commercial use, please refer to https://stability.ai/membership.
-Excluded uses are described below.
-### Diffusers
 ```
-pip install diffusers transformers accelerate --upgrade
 ```
-- **Text-to-image**:
-SDXL-Turbo does not make use of `guidance_scale` or `negative_prompt`, we disable it with `guidance_scale=0.0`.
-Preferably, the model generates images of size 512x512 but higher image sizes work as well.
-A **single step** is enough to generate high quality images.
 ```py
-from diffusers import AutoPipelineForText2Image
-import torch
-pipe = AutoPipelineForText2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
-pipe.to("cuda")
-prompt = "A cinematic shot of a baby racoon wearing an intricate italian priest robe."
-image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
 ```
-- **Image-to-image**:
-When using SDXL-Turbo for image-to-image generation, make sure that `num_inference_steps` * `strength` is larger or equal
-to 1. The image-to-image pipeline will run for `int(num_inference_steps * strength)` steps, *e.g.* 0.5 * 2.0 = 1 step in our example
-below.
 ```py
-from diffusers import AutoPipelineForImage2Image
-from diffusers.utils import load_image
-import torch
-pipe = AutoPipelineForImage2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
-pipe.to("cuda")
-init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png").resize((512, 512))
-prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
-image = pipe(prompt, image=init_image, num_inference_steps=2, strength=0.5, guidance_scale=0.0).images[0]
 ```
-### Out-of-Scope Use
-The model was not trained to be factual or true representations of people or events,
-and therefore using the model to generate such content is out-of-scope for the abilities of this model.
-The model should not be used in any way that violates Stability AI's [Acceptable Use Policy](https://stability.ai/use-policy).
-## Limitations and Bias
-### Limitations
-- The generated images are of a fixed resolution (512x512 pix), and the model does not achieve perfect photorealism.
-- The model cannot render legible text.
-- Faces and people in general may not be generated properly.
-- The autoencoding part of the model is lossy.
-### Recommendations
-The model is intended for both non-commercial and commercial usage.
-## How to Get Started with the Model
-Check out https://github.com/Stability-AI/generative-models

+---
+pipeline_tag: image-to-video
+inference: false
+license: other
+license_name: sai-nc-community
+license_link: https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md
+datasets:
+- neuralwork/arxiver
+language:
+- af
+metrics:
+- bleu
+- code_eval
+base_model:
+- genmo/mochi-1-preview
+new_version: genmo/mochi-1-preview
+library_name: fasttext
+tags:
+- legal
+---
+# SDXL-Turbo Model Card
+<!-- Provide a quick summary of what the model is/does. -->
 ![row01](output_tile.jpg)
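+SDXL-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation.
+A real-time demo is available here: http://clipdrop.co/stable-diffusion-turbo
+Please note: For commercial use, please refer to https://stability.ai/license.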
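+## Model Details
+### Model Description
+SDXL-Turbo is a distilled version of [SDXL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0), trained for real-time synthesis.
+SDXL-Turbo is based on a novel training method called Adversarial Diffusion Distillation (ADD) (see the [technical report](https://stability.ai/research/adversarial-diffusion-distillation)), which allows sampling large-scale foundational
+image diffusion models in 1 to 4 steps at high image quality.
+This approach uses score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal and combines this with an
+adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps.
+- **Developed by:** Stability AI
+- **Funded by:** Stability AI
+- **Model type:** Generative text-to-image model
+- **Finetuned from model:** [SDXL 1.0 Base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)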
+### Model Sources
+For research purposes, we recommend our `generative-models` GitHub repository (https://github.com/Stability-AI/generative-models),
+which implements the most popular diffusion frameworks (both training and inference).
+- **Repository:** https://github.com/Stability-AI/generative-models
+- **Paper:** https://stability.ai/research/adversarial-diffusion-distillation
+- **Demo:** http://clipdrop.co/stable-diffusion-turbo
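+## Evaluation
+![comparison1](image_quality_one_step.png)
+![comparison2](prompt_alignment_one_step.png)
+The charts above evaluate user preference for SDXL-Turbo over other single- and multi-step models.
+SDXL-Turbo evaluated at a single step is preferred by human voters in terms of image quality and prompt following over LCM-XL evaluated at four (or fewer) steps.
+In addition, we see that using four steps for SDXL-Turbo further improves performance.
+For details on the user study, we refer to the [research paper](https://stability.ai/research/adversarial-diffusion-distillation).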
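+## Uses
+### Direct Use
+The model is intended for both non-commercial and commercial usage. You can use this model for non-commercial or research purposes under this [license](https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md). Possible research areas and tasks include:
+- Research on generative models.
+- Research on real-time applications of generative models.
+- Research on the impact of real-time generative models.
+- Safe deployment of models which have the potential to generate harmful content.
+- Probing and understanding the limitations and biases of generative models.
+- Generation of artworks and use in design and other artistic processes.
+- Applications in educational or creative tools.
+For commercial use, please refer to https://stability.ai/membership.
+Excluded uses are described below.
+### Diffusers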
 ```
+pip install diffusers transformers accelerate --upgrade
 ```
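+- **Text-to-image**:
+SDXL-Turbo does not make use of `guidance_scale` or `negative_prompt`; we disable them with `guidance_scale=0.0`.
+Preferably, the model generates images of size 512x512, but larger image sizes work as well.
+A **single step** is enough to generate high-quality images.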
 ```py
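+from diffusers import AutoPipelineForText2Image
+import torch
+# Load the fp16 weights and move the pipeline to the GPU.
+pipe = AutoPipelineForText2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
+pipe.to("cuda")
+prompt = "A cinematic shot of a baby raccoon wearing an intricate Italian priest robe."
+# A single step with guidance disabled (guidance_scale=0.0).
+image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]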
 ```
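+- **Image-to-image**:
+When using SDXL-Turbo for image-to-image generation, make sure that `num_inference_steps` * `strength` is greater than or equal
+to 1. The image-to-image pipeline will run for `int(num_inference_steps * strength)` steps, *e.g.* 0.5 * 2.0 = 1 step in our example
+below.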
 ```py
+from diffusers import AutoPipelineForImage2Image
+from diffusers.utils import load_image
+import torch
+# Load the fp16 weights and move the pipeline to the GPU.
+pipe = AutoPipelineForImage2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
+pipe.to("cuda")
+init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png").resize((512, 512))
+prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
+# int(2 * 0.5) = 1 denoising step, with guidance disabled.
+image = pipe(prompt, image=init_image, num_inference_steps=2, strength=0.5, guidance_scale=0.0).images[0]
 ```
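+### Out-of-Scope Use
+The model was not trained to produce factual or true representations of people or events,
+and therefore using the model to generate such content is out of scope for the abilities of this model.
+The model should not be used in any way that violates Stability AI's [Acceptable Use Policy](https://stability.ai/use-policy).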
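+## Limitations and Bias
+### Limitations
+- The generated images are of a fixed resolution (512x512 pixels), and the model does not achieve perfect photorealism.
+- The model cannot render legible text.
+- Faces and people in general may not be generated properly.
+- The autoencoding part of the model is lossy.
+### Recommendations
+The model is intended for both non-commercial and commercial usage.
+## How to Get Started with the Model
+Check out https://github.com/Stability-AI/generative-models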
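
The image-to-image note in this change states that the pipeline runs for `int(num_inference_steps * strength)` steps and that the product should be at least 1. A minimal sketch of that check in plain Python is given here. The helper names `effective_steps` and `check_img2img_settings` are hypothetical and not part of diffusers, and the 0-to-1 range check on `strength` is an assumption about how the parameter is normally used.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Steps the image-to-image pipeline would run, per the rule quoted in the card.

    Hypothetical helper: it mirrors int(num_inference_steps * strength) only.
    """
    # Assumption: strength is a fraction between 0 and 1.
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")
    return int(num_inference_steps * strength)


def check_img2img_settings(num_inference_steps: int, strength: float) -> int:
    """Return the effective step count, or raise if no step would run."""
    steps = effective_steps(num_inference_steps, strength)
    if steps < 1:
        raise ValueError(
            "num_inference_steps * strength must be >= 1; "
            f"got {num_inference_steps} * {strength}"
        )
    return steps


if __name__ == "__main__":
    # Settings from the card's image-to-image example.
    print(check_img2img_settings(num_inference_steps=2, strength=0.5))
```

Running such a check before calling the pipeline would catch settings like `num_inference_steps=1, strength=0.5`, for which the rule above yields zero steps.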