---
license: other
license_name: flux-1-dev-non-commercial-license
license_link: https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md
tags:
- Text-to-Image
- ControlNet
- Diffusers
- Stable Diffusion
base_model: black-forest-labs/FLUX.1-dev
---

# Found some bugs, currently fixing them. Please do not download until the fixes are applied.

# FLUX.1-dev Controlnet



<img src="./images/image_union.png" width = "1000" />


## Diffusers version

Until the next Diffusers PyPI release, please install Diffusers from source and apply [this PR](https://github.com/huggingface/diffusers/pull/9175) to use this model.
TODO: update these instructions once the new version is released.
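
To check whether an installed Diffusers build already includes the Flux ControlNet pipeline, a quick probe like the one below may help. It is only a sketch: the helper name `has_flux_controlnet` is hypothetical, and the module path is simply the one imported in the demo in this README, which may move in later releases.

```python
import importlib.util


def has_flux_controlnet() -> bool:
    """Return True if the module used by the demo can be located.

    Assumption: the pipeline lives at the module path imported in the demo.
    """
    module_path = "diffusers.pipelines.flux.pipeline_flux_controlnet"
    try:
        # find_spec imports the parent packages, so a missing or outdated
        # Diffusers install surfaces here as an ImportError.
        spec = importlib.util.find_spec(module_path)
    except (ImportError, ValueError):
        return False
    return spec is not None


if __name__ == "__main__":
    print("Flux ControlNet pipeline available:", has_flux_controlnet())
```

If this reports `False`, the installed Diffusers build most likely predates the PR linked above.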

## Checkpoint

Training a union ControlNet requires a significant amount of computational power.
The current release is an alpha checkpoint that has not been fully trained.
A beta version is currently in training.
We have run ablation studies that validate the training code.
The alpha version is released as open source solely to support the rapid growth of the open-source community and the Flux ecosystem;
bad cases are common at this stage (please accept my apologies).
Note that even a fully trained Union model may not perform as well as a specialized model, such as one dedicated to pose control.
However, as training progresses, the performance of the Union model will continue to approach that of specialized models.


## Control Mode

| Control Mode | Description | Current Model Validity |
|:------------:|:-----------:|:-----------:|
|0|canny|🟢high|
|1|tile|🟢high|
|2|depth|🟡medium|
|3|blur|🟢high|
|4|pose|🔴low|
|5|gray|🔴low|
|6|lq|🟢high|




# Demo
```python
import torch
from diffusers.utils import load_image
from diffusers.pipelines.flux.pipeline_flux_controlnet import FluxControlNetPipeline
from diffusers.models.controlnet_flux import FluxControlNetModel

# load
base_model = 'black-forest-labs/FLUX.1-dev'
controlnet_model = 'InstantX/FLUX.1-dev-Controlnet-Union-alpha'
controlnet = FluxControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)
pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=controlnet, torch_dtype=torch.bfloat16)
pipe.to("cuda")

# image cfg
width, height = 1024, 1024
controlnet_conditioning_scale = 0.6
seed = 2024

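# Keep only ONE of the control-mode blocks below. As written, each block
# overwrites the previous one, so the last block (low quality, mode 6) is used.
# canny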
control_image = load_image("https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images/canny.jpg")
prompt = "A girl in city, 25 years old, cool, futuristic."
control_mode = 0

# tile
control_image = load_image("https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images/tile.jpg")
prompt = "A girl, 25 years old."
control_mode = 1

# depth
control_image = load_image("https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images/depth.jpg")
prompt = "A girl in city, 25 years old, cool, futuristic."
control_mode = 2

# blur
control_image = load_image("https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images/blur.jpg")
prompt = "A girl, 25 years old."
control_mode = 3

# pose
control_image = load_image("https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images/pose.jpg")
prompt = "A girl in city, 25 years old, cool, futuristic."
control_mode = 4

# gray
control_image = load_image("https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images/gray.jpg")
prompt = "A girl, 25 years old."
control_mode = 5

# low quality
control_image = load_image("https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images/lq.jpg")
prompt = "A girl in city"
control_mode = 6

# generate the image with the selected control mode
image = pipe(
    prompt, 
    control_image=control_image,
    control_mode=control_mode,
    width=width,
    height=height,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    num_inference_steps=28, 
    guidance_scale=3.5,
    generator=torch.manual_seed(seed),
).images[0]
image.save("image.jpg")
```
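
Because the demo assigns `control_image`, `prompt`, and `control_mode` once per mode, it can be convenient to hold the seven cases in a list and run them one at a time. The sketch below reuses the image URLs, prompts, and pipeline arguments from the demo; the helper names `build_cases` and `run_case` and the output file naming are assumptions made for illustration.

```python
BASE_URL = "https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images"

CITY_PROMPT = "A girl in city, 25 years old, cool, futuristic."
PLAIN_PROMPT = "A girl, 25 years old."

# (name, control_mode, image file, prompt), taken from the demo above.
_DEMO_ROWS = [
    ("canny", 0, "canny.jpg", CITY_PROMPT),
    ("tile", 1, "tile.jpg", PLAIN_PROMPT),
    ("depth", 2, "depth.jpg", CITY_PROMPT),
    ("blur", 3, "blur.jpg", PLAIN_PROMPT),
    ("pose", 4, "pose.jpg", CITY_PROMPT),
    ("gray", 5, "gray.jpg", PLAIN_PROMPT),
    ("lq", 6, "lq.jpg", "A girl in city"),
]


def build_cases():
    """Return one dict per control mode with its image URL and prompt."""
    return [
        {"name": name, "control_mode": mode, "url": f"{BASE_URL}/{fname}", "prompt": prompt}
        for name, mode, fname, prompt in _DEMO_ROWS
    ]


def run_case(pipe, case, width=1024, height=1024, scale=0.6, seed=2024):
    """Run one case through an already-loaded pipeline and save the result.

    `pipe` is expected to be the FluxControlNetPipeline built in the demo.
    The output file name is a hypothetical convention.
    """
    import torch  # imported lazily so build_cases() works without a GPU stack
    from diffusers.utils import load_image

    image = pipe(
        case["prompt"],
        control_image=load_image(case["url"]),
        control_mode=case["control_mode"],
        width=width,
        height=height,
        controlnet_conditioning_scale=scale,
        num_inference_steps=28,
        guidance_scale=3.5,
        generator=torch.manual_seed(seed),
    ).images[0]
    out_path = f"image_{case['name']}.jpg"
    image.save(out_path)
    return out_path
```

With `pipe` loaded as in the demo, `for case in build_cases(): run_case(pipe, case)` would write one image per control mode.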



# Acknowledgements

Thank you, [zzzzzero](https://github.com/zzzzzero), for pointing out the bug in the model.