---
language:
- en
library_name: diffusers
inference: true
license: other
license_name: stabilityai-ai-community
license_link: LICENSE.md
tags:
- text-to-image
- stable-diffusion
- diffusers
base_model:
- stabilityai/stable-diffusion-3.5-large
- stabilityai/stable-diffusion-3.5-large-turbo
base_model_relation: merge
---

# Stable Diffusion 3.5 Merged

This repository contains a merged version of **Stable Diffusion 3.5** that averages the transformer weights of the [**Large**](https://huggingface.co/stabilityai/stable-diffusion-3.5-large) and [**Turbo**](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo) variants, aiming for detail close to Large at a step count close to Turbo.

| Large (40 steps) | Turbo (4 steps) | Merged (6 steps) 🎉 |
| :--: | :--: | :--: |
| ![](./assets/large.png) | ![](./assets/turbo.png) | ![](./assets/sd-3.5-merged.png) |

## Inference

Run the following code to generate images with the merged model:

```python
from diffusers import StableDiffusion3Pipeline
import torch

# Load the merged checkpoint in bfloat16 and move it to the GPU.
pipeline = StableDiffusion3Pipeline.from_pretrained(
    "ariG23498/sd-3.5-merged", torch_dtype=torch.bfloat16
).to("cuda")

prompt = "a tiny astronaut hatching from an egg on the moon"
image = pipeline(
    prompt=prompt,
    guidance_scale=1.0,  # 1.0 turns classifier-free guidance off
    num_inference_steps=6,  # Run faster ⚡️
    generator=torch.manual_seed(0),  # fixed seed for reproducible output
).images[0]
image.save("sd-3.5-merged.png")
```

> **Note**: The Turbo variant runs faster with fewer steps, while the Large variant needs more steps (around 50) but produces finer detail.
With the merged model, you will need to tune `num_inference_steps` and `guidance_scale` to balance speed and quality.
The grid below shows generations across a range of guidance scales and step counts.

![](./assets/grid.png)

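
One way to run such a sweep is sketched below. This is a minimal, illustrative helper rather than the script that produced the grid: `sweep_settings`, the `generate` callable, and the example step and scale values are all assumptions. `generate` stands in for any function that accepts `prompt`, `num_inference_steps`, and `guidance_scale` as keyword arguments.

```python
from itertools import product


def sweep_settings(generate, prompt, steps_options, scale_options):
    """Call `generate` once per (steps, scale) pair and collect the results.

    `generate` is a hypothetical callable taking `prompt`,
    `num_inference_steps`, and `guidance_scale` keyword arguments.
    """
    results = {}
    for steps, scale in product(steps_options, scale_options):
        results[(steps, scale)] = generate(
            prompt=prompt,
            num_inference_steps=steps,
            guidance_scale=scale,
        )
    return results


# Stand-in generator so the sketch runs without a GPU or a model download.
def fake_generate(prompt, num_inference_steps, guidance_scale):
    return f"{prompt} | steps={num_inference_steps} | scale={guidance_scale}"


# Example values only; they are not the settings used for the grid.
outputs = sweep_settings(fake_generate, "a tiny astronaut", [4, 6, 8], [1.0, 3.5])
for (steps, scale), result in outputs.items():
    print(steps, scale, result)
```

With the real pipeline, `generate` could be a thin wrapper such as `lambda **kw: pipeline(generator=torch.manual_seed(0), **kw).images[0]`, so that every cell of the sweep starts from the same seed.
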
## Merging Models

This repository merges the **Stable Diffusion 3.5 Large** and **Stable Diffusion 3.5 Large Turbo** transformers into a single model by averaging their weights. The Large version uses classifier-free guidance (CFG) and requires more steps, while the Turbo version is distilled for faster generation without CFG.

The merged model aims to keep the detail of the Large version at close to the speed of the Turbo version.

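
The merge rule used by the script in this README can be illustrated on toy values. The sketch below is dependency-free and uses plain Python lists in place of tensors; `average_state_dicts` and the key names are hypothetical and chosen only for illustration. Keys containing `"guidance"` are copied from the first dictionary, and every other key is averaged element-wise.

```python
def average_state_dicts(large, turbo, skip_substring="guidance"):
    """Average two toy state dicts element-wise.

    Values are flat lists of floats standing in for tensors. Keys that
    contain `skip_substring` are copied from `large` unchanged.
    """
    merged = {}
    for key, large_values in large.items():
        if skip_substring in key:
            merged[key] = list(large_values)
            continue
        turbo_values = turbo[key]
        if len(large_values) != len(turbo_values):
            raise ValueError(f"Shape mismatch for {key!r}")
        merged[key] = [(a + b) / 2 for a, b in zip(large_values, turbo_values)]
    return merged


# Hypothetical parameter names and values, for illustration only.
large = {"block.weight": [1.0, 2.0], "guidance.bias": [0.5]}
turbo = {"block.weight": [3.0, 6.0]}

print(average_state_dicts(large, turbo))
# {'block.weight': [2.0, 4.0], 'guidance.bias': [0.5]}
```
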
### Code to Merge Models

To access the Stable Diffusion 3.5 models, you first need to fill out the forms in the corresponding repositories, and then run `huggingface-cli login` so that your machine is authenticated and allowed to download the gated checkpoints.

```python
import glob

import safetensors.torch
import torch
from accelerate import init_empty_weights
from diffusers import SD3Transformer2DModel
from diffusers.models.model_loading_utils import load_model_dict_into_meta
from huggingface_hub import snapshot_download, upload_folder

large_model_id = "stabilityai/stable-diffusion-3.5-large"
turbo_model_id = "stabilityai/stable-diffusion-3.5-large-turbo"

# Build the transformer on the meta device so no weight memory is allocated yet.
with init_empty_weights():
    config = SD3Transformer2DModel.load_config(large_model_id, subfolder="transformer")
    model = SD3Transformer2DModel.from_config(config)

# Download only the transformer weights of each checkpoint.
large_ckpt = snapshot_download(repo_id=large_model_id, allow_patterns="transformer/*")
turbo_ckpt = snapshot_download(repo_id=turbo_model_id, allow_patterns="transformer/*")

large_shards = sorted(glob.glob(f"{large_ckpt}/transformer/*.safetensors"))
turbo_shards = sorted(glob.glob(f"{turbo_ckpt}/transformer/*.safetensors"))

merged_state_dict = {}
guidance_state_dict = {}

# Merge shard by shard; this assumes both checkpoints are sharded identically.
for i in range(len(large_shards)):
    state_dict_large_temp = safetensors.torch.load_file(large_shards[i])
    state_dict_turbo_temp = safetensors.torch.load_file(turbo_shards[i])

    keys = list(state_dict_large_temp.keys())
    for k in keys:
        if "guidance" not in k:
            # Average the Large and Turbo weights.
            merged_state_dict[k] = (state_dict_large_temp.pop(k) + state_dict_turbo_temp.pop(k)) / 2
        else:
            # Guidance-related weights, if any, are taken from Large only.
            guidance_state_dict[k] = state_dict_large_temp.pop(k)

    # Every key must have been consumed from both shards.
    if len(state_dict_large_temp) > 0:
        raise ValueError(f"There should not be any residue but got: {list(state_dict_large_temp.keys())}.")
    if len(state_dict_turbo_temp) > 0:
        raise ValueError(f"There should not be any residue but got: {list(state_dict_turbo_temp.keys())}.")

merged_state_dict.update(guidance_state_dict)
load_model_dict_into_meta(model, merged_state_dict)

# Save in bfloat16, then push the folder to the Hub.
model.to(torch.bfloat16).save_pretrained("transformer")

upload_folder(
    repo_id="ariG23498/sd-3.5-merged",
    folder_path="transformer",
    path_in_repo="transformer",
)
```

This script downloads the checkpoints, merges them, and saves the merged transformer locally in the `transformer` folder. The final `upload_folder` call then pushes that folder to the Hugging Face Hub.

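
The loop in the merge script indexes both shard lists in parallel, so it relies on the two checkpoints having the same number of shards. A small guard like the one below could make that assumption explicit before any weights are loaded. `pair_shards` is a hypothetical helper, not part of the original script, and the file names in the example are made up.

```python
def pair_shards(large_shards, turbo_shards):
    """Pair shard paths by sorted order, failing early on a count mismatch."""
    large_sorted = sorted(large_shards)
    turbo_sorted = sorted(turbo_shards)
    if len(large_sorted) != len(turbo_sorted):
        raise ValueError(
            f"Shard count mismatch: {len(large_sorted)} vs {len(turbo_sorted)}."
        )
    return list(zip(large_sorted, turbo_sorted))


# Made-up file names, for illustration only.
pairs = pair_shards(
    ["large/model-00002-of-00002.safetensors", "large/model-00001-of-00002.safetensors"],
    ["turbo/model-00001-of-00002.safetensors", "turbo/model-00002-of-00002.safetensors"],
)
for large_path, turbo_path in pairs:
    print(large_path, "<->", turbo_path)
```

Matching shard counts do not guarantee matching keys within each shard, which is why the residue checks in the merge script are still needed.
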
## References

[FLUX.1 merged](https://huggingface.co/sayakpaul/FLUX.1-merged) from Sayak Paul!