
	---
	license: other
	license_name: stabilityai-ai-community
	license_link: >-
	https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md
	language:
	- en
	library_name: diffusers
	pipeline_tag: text-to-image
	tags:
	- Text-to-Image
	- IP-Adapter
	- StableDiffusion3Pipeline
	- image-generation
	- Stable Diffusion
	base_model:
	- stabilityai/stable-diffusion-3.5-large
	---

	# SD3.5-Large-IP-Adapter

	This repository contains an IP-Adapter for the SD3.5-Large model, released by researchers from the [InstantX Team](https://huggingface.co/InstantX). The adapter treats the reference image much like a text condition, so the image prompt may be unresponsive or may interfere with the text prompt. We hope you enjoy this model. Have fun and share your creative works with us [on Twitter](https://x.com/instantx_ai).

	# Model Card
	This is a regular IP-Adapter in which the new layers are added to all 38 blocks. We use [google/siglip-so400m-patch14-384](https://huggingface.co/google/siglip-so400m-patch14-384) to encode the image because of its superior performance, and adopt a TimeResampler to project the image features. The number of image tokens is set to 64.
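
As a rough illustration of the token budget described above, the sketch below counts the patch tokens that a 384-pixel, patch-14 vision encoder produces and compares that with the 64 image tokens. It assumes the encoder drops the partial patch at the image border (floor division) and adds no class token. The function names are hypothetical and are not part of this repository.

```python
def patch_token_count(image_size: int, patch_size: int) -> int:
    """Patch tokens for a square image, assuming border remainders are dropped."""
    per_side = image_size // patch_size
    return per_side * per_side


def compression_ratio(num_patch_tokens: int, num_image_tokens: int) -> float:
    """How many encoder tokens are summarized by each projected image token."""
    return num_patch_tokens / num_image_tokens


if __name__ == "__main__":
    tokens = patch_token_count(384, 14)             # 27 patches per side
    print(tokens)                                   # 729
    print(round(compression_ratio(tokens, 64), 2))  # 11.39
```

Under these assumptions, the projection condenses roughly eleven encoder tokens into each of the 64 image tokens.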

	# Showcases

	<div class="container">
	<img src="./teasers/0.png" width="1024"/>
	<img src="./teasers/1.png" width="1024"/>
	</div>

	# Inference
	The code has not been integrated into diffusers yet, so please use the local files in this repository for now.
	```python
	import torch
	from PIL import Image

	from models.transformer_sd3 import SD3Transformer2DModel
	from pipeline_stable_diffusion_3_ipa import StableDiffusion3Pipeline

	model_path = 'stabilityai/stable-diffusion-3.5-large'
	ip_adapter_path = './ip-adapter.bin'
	image_encoder_path = "google/siglip-so400m-patch14-384"

	# Load the transformer with the repository's local SD3Transformer2DModel class
	transformer = SD3Transformer2DModel.from_pretrained(
	    model_path, subfolder="transformer", torch_dtype=torch.bfloat16
	)

	pipe = StableDiffusion3Pipeline.from_pretrained(
	    model_path, transformer=transformer, torch_dtype=torch.bfloat16
	).to("cuda")

	# Attach the IP-Adapter weights and the SigLIP image encoder (64 image tokens)
	pipe.init_ipadapter(
	    ip_adapter_path=ip_adapter_path,
	    image_encoder_path=image_encoder_path,
	    nb_token=64,
	)

	ref_img = Image.open('./assets/1.jpg').convert('RGB')

	# Note: SD3.5 Large is sensitive to high-resolution generation such as 1536x1536
	image = pipe(
	    width=1024,
	    height=1024,
	    prompt='a cat',
	    negative_prompt="lowres, low quality, worst quality",
	    num_inference_steps=24,
	    guidance_scale=5.0,
	    generator=torch.Generator("cuda").manual_seed(42),
	    clip_image=ref_img,
	    ipadapter_scale=0.5,
	).images[0]
	image.save('./result.jpg')
	```
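
Because the image prompt can compete with the text prompt, it may help to compare several `ipadapter_scale` values under a fixed seed. The helper below is a minimal sketch of such a sweep. `sweep_ipadapter_scale`, the `generate` callable and the file-name pattern are hypothetical and are not part of this repository. `generate` is assumed to wrap the `pipe(...)` call above.

```python
from typing import Any, Callable, Dict, Iterable


def sweep_ipadapter_scale(
    generate: Callable[[float], Any],
    scales: Iterable[float],
    prefix: str = "result",
) -> Dict[str, Any]:
    """Call `generate(scale)` once per scale and key each output by a file name."""
    outputs: Dict[str, Any] = {}
    for scale in scales:
        outputs[f"{prefix}_scale_{scale:.2f}.jpg"] = generate(scale)
    return outputs


# Hypothetical usage with the pipeline above (requires a GPU, not run here):
# results = sweep_ipadapter_scale(
#     lambda s: pipe(
#         prompt="a cat",
#         clip_image=ref_img,
#         ipadapter_scale=s,
#         generator=torch.Generator("cuda").manual_seed(42),
#     ).images[0],
#     scales=[0.3, 0.5, 0.7],
# )
# for name, img in results.items():
#     img.save(name)
```

Passing the generation step in as a callable keeps the helper independent of the pipeline, so it can be tried with a stub before spending GPU time.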

	# Community ComfyUI Support
	Please refer to [zefu-lu/ComfyUI-InstantX-SD3_5-Large-IPAdapter](https://github.com/zefu-lu/ComfyUI-InstantX-SD3_5-Large-IPAdapter).


	# License
	The model is released under the [stabilityai-ai-community](https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md) license. All copyrights reserved.

	# Acknowledgements
	This project is sponsored by [HuggingFace](https://huggingface.co/) and [fal.ai](https://fal.ai/). Thanks to [zefu-lu](https://github.com/zefu-lu) for providing the ComfyUI node.

	# Citation
	If you find this project useful in your research, please cite us with the following entry:
	```
	@misc{sd35-large-ipa,
	author = {InstantX Team},
	title = {InstantX SD3.5-Large IP-Adapter Page},
	year = {2024},
	}
	```