	---
	language:
	- en
	library_name: diffusers
	inference: true
	license: other
	license_name: stabilityai-ai-community
	license_link: LICENSE.md
	tags:
	- text-to-image
	- stable-diffusion
	- diffusers
	base_model:
	- stabilityai/stable-diffusion-3.5-large
	- stabilityai/stable-diffusion-3.5-large-turbo
	base_model_relation: merge
	---

	# Stable Diffusion 3.5 Merged

	This repository contains a merged version of Stable Diffusion 3.5 that combines the [Large](https://huggingface.co/stabilityai/stable-diffusion-3.5-large) and [Turbo](https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo) variants, aiming for the detail of Large at a step count close to Turbo.

	| Large (40 steps) | Turbo (4 steps) | Merged (6 steps) 🎉 |
	| :--: | :--: | :--: |
	| ![](./assets/large.png) | ![](./assets/turbo.png) | ![](./assets/sd-3.5-merged.png) |


	## Inference

	Run the following code to generate images using the merged model:

	```python
	from diffusers import StableDiffusion3Pipeline
	import torch

	pipeline = StableDiffusion3Pipeline.from_pretrained(
	    "ariG23498/sd-3.5-merged", torch_dtype=torch.bfloat16
	).to("cuda")

	prompt = "a tiny astronaut hatching from an egg on the moon"
	image = pipeline(
	    prompt=prompt,
	    guidance_scale=1.0,
	    num_inference_steps=6,  # Run faster ⚡️
	    generator=torch.manual_seed(0),
	).images[0]
	image.save("sd-3.5-merged.png")
	```

	> Note: The Turbo variant runs faster with fewer steps, while the Large variant requires more steps (around 50) but provides better detail.
	With the merged model, you need to tune `num_inference_steps` and `guidance_scale` to find a good balance of speed and quality.
	The grid below shows the generations produced for different guidance scales and step counts.

	![](./assets/grid.png)
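A grid like the one above can be produced by sweeping both parameters with a fixed seed. The following is a minimal sketch, not the script used for the image: the helper names `sweep_settings` and `run_sweep` and the example values are hypothetical, and `pipeline` is assumed to be the object created in the inference snippet.

```python
from itertools import product


def sweep_settings(steps_options, guidance_options):
    """Return one kwargs dict per (num_inference_steps, guidance_scale) pair."""
    return [
        {"num_inference_steps": steps, "guidance_scale": scale}
        for steps, scale in product(steps_options, guidance_options)
    ]


def run_sweep(pipeline, prompt, settings, seed=0):
    """Generate one image per setting, reusing the same seed for comparability."""
    import torch  # imported lazily so sweep_settings stays dependency-free

    results = []
    for kwargs in settings:
        image = pipeline(
            prompt=prompt,
            generator=torch.manual_seed(seed),
            **kwargs,
        ).images[0]
        results.append((kwargs, image))
    return results


# Illustrative values only; the values behind grid.png are not stated here.
settings = sweep_settings([4, 6, 8], [1.0, 2.0])
print(len(settings))
```

Calling `run_sweep(pipeline, prompt, settings)` then returns `(settings, image)` pairs that can be saved or tiled into a grid.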

	## Merging Models

	This repository merges the transformers of Stable Diffusion 3.5 Large and Stable Diffusion 3.5 Large Turbo into a single model by averaging their weights. The Large version uses classifier-free guidance (CFG) and requires more steps, while the Turbo version is distilled for faster generation without CFG.

	The merged model aims to retain the detail of the Large version and the speed of the Turbo version.
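The merge script in this repository combines the two transformers by taking the element-wise mean of matching parameters. The toy sketch below illustrates that idea under simplifying assumptions: plain floats stand in for tensors so that it runs without any dependencies, and the function name `average_state_dicts` and the parameter names are hypothetical. The same function would also work on `torch` tensors, since it only relies on `+` and `/`.

```python
def average_state_dicts(large, turbo):
    """Average two state dicts key by key; values only need `+` and `/`."""
    if large.keys() != turbo.keys():
        mismatched = sorted(large.keys() ^ turbo.keys())
        raise ValueError(f"Both checkpoints must share the same keys, got: {mismatched}")
    return {key: (large[key] + turbo[key]) / 2 for key in large}


# Toy stand-ins for tensors; real state dicts map parameter names to tensors.
large = {"proj.weight": 1.0, "proj.bias": 0.5}
turbo = {"proj.weight": 3.0, "proj.bias": 1.5}

merged = average_state_dicts(large, turbo)
print(merged)
```

Averaging is only meaningful here because both models share the same architecture, so every parameter in one has a counterpart of the same shape in the other.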

	### Code to Merge Models

	To access the Stable Diffusion 3.5 models, you need to fill out the access forms in the corresponding repositories and then run `huggingface-cli login`, so that your system is authenticated
	and can download the models you have been granted access to.

	```python
	import glob

	import safetensors.torch
	import torch
	from accelerate import init_empty_weights
	from diffusers import SD3Transformer2DModel
	from diffusers.models.model_loading_utils import load_model_dict_into_meta
	from huggingface_hub import snapshot_download, upload_folder

	large_model_id = "stabilityai/stable-diffusion-3.5-large"
	turbo_model_id = "stabilityai/stable-diffusion-3.5-large-turbo"

	# Build the transformer on the meta device so no weight memory is allocated yet.
	with init_empty_weights():
	    config = SD3Transformer2DModel.load_config(large_model_id, subfolder="transformer")
	    model = SD3Transformer2DModel.from_config(config)

	# Download only the transformer weights of each checkpoint.
	large_ckpt = snapshot_download(repo_id=large_model_id, allow_patterns="transformer/*")
	turbo_ckpt = snapshot_download(repo_id=turbo_model_id, allow_patterns="transformer/*")

	large_shards = sorted(glob.glob(f"{large_ckpt}/transformer/*.safetensors"))
	turbo_shards = sorted(glob.glob(f"{turbo_ckpt}/transformer/*.safetensors"))

	merged_state_dict = {}
	guidance_state_dict = {}

	# Assumes both checkpoints are sharded identically, so shard i holds the same keys in each.
	for i in range(len(large_shards)):
	    state_dict_large_temp = safetensors.torch.load_file(large_shards[i])
	    state_dict_turbo_temp = safetensors.torch.load_file(turbo_shards[i])

	    keys = list(state_dict_large_temp.keys())
	    for k in keys:
	        if "guidance" not in k:
	            # Average the matching parameters of the two models.
	            merged_state_dict[k] = (state_dict_large_temp.pop(k) + state_dict_turbo_temp.pop(k)) / 2
	        else:
	            # Guidance-related parameters are taken from the Large model as-is.
	            guidance_state_dict[k] = state_dict_large_temp.pop(k)

	    # Every key must have been consumed; leftovers mean the shards do not match.
	    if len(state_dict_large_temp) > 0:
	        raise ValueError(f"There should not be any residue but got: {list(state_dict_large_temp.keys())}.")
	    if len(state_dict_turbo_temp) > 0:
	        raise ValueError(f"There should not be any residue but got: {list(state_dict_turbo_temp.keys())}.")

	merged_state_dict.update(guidance_state_dict)
	# Materialize the meta-device model with the merged weights.
	load_model_dict_into_meta(model, merged_state_dict)

	model.to(torch.bfloat16).save_pretrained("transformer")

	# Push the merged transformer to the Hub.
	upload_folder(
	    repo_id="ariG23498/sd-3.5-merged",
	    folder_path="transformer",
	    path_in_repo="transformer",
	)
	```

	This script downloads the transformer checkpoints of both models, averages their weights, and saves the merged transformer locally in `bfloat16`. The final `upload_folder` call then uploads the merged transformer to the Hugging Face Hub.
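Before uploading, it can be useful to try the locally saved transformer in a pipeline. The sketch below is one possible way to do that, not necessarily how this repository was assembled: the function name `load_merged_pipeline` is hypothetical, and it assumes the remaining components (text encoders, VAE, scheduler) are taken from the Large checkpoint.

```python
def load_merged_pipeline(
    transformer_dir="transformer",
    base_model_id="stabilityai/stable-diffusion-3.5-large",
):
    """Build a pipeline that uses the locally saved merged transformer."""
    # Imported lazily so that defining the helper does not require a GPU setup.
    import torch
    from diffusers import SD3Transformer2DModel, StableDiffusion3Pipeline

    # Load the merged weights written by save_pretrained("transformer").
    transformer = SD3Transformer2DModel.from_pretrained(
        transformer_dir, torch_dtype=torch.bfloat16
    )
    # Assumption: all other components come from the Large checkpoint.
    return StableDiffusion3Pipeline.from_pretrained(
        base_model_id, transformer=transformer, torch_dtype=torch.bfloat16
    )
```

The returned pipeline can be moved to the GPU with `.to("cuda")` and called with the same arguments as in the inference example.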

	## References

	[FLUX.1 merged](https://huggingface.co/sayakpaul/FLUX.1-merged) from Sayak Paul!