Update app.py
app.py
CHANGED
@@ -18,7 +18,8 @@ article = r"""
 **Citation**
 <br>
 If our work is helpful for your research or applications, please cite us via:
-```
+```
+bibtex
 @article{toker2024diffusion,
 title={Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines},
 author={Toker, Michael and Orgad, Hadas and Ventura, Mor and Arad, Dana and Belinkov, Yonatan},
@@ -26,12 +27,27 @@ If our work is helpful for your research or applications, please cite us via:
 year={2024}
 }
 ```
+**Abstract**
+<br>
+Text-to-image diffusion models (T2I) use a latent representation of a text prompt to guide the image generation process.
+However, the process by which the encoder produces the text representation is unknown.
+We propose the Diffusion Lens, a method for analyzing the text encoder of T2I models by generating images from its intermediate representations.
+Using the Diffusion Lens, we perform an extensive analysis of two recent T2I models.
+Exploring compound prompts, we find that complex scenes describing multiple objects are composed progressively and more slowly compared to simple scenes;
+Exploring knowledge retrieval, we find that representation of uncommon concepts requires further computation compared to common concepts,
+and that knowledge retrieval is gradual across layers.
+Overall, our findings provide valuable insights into the text encoder component in T2I pipelines.
+<br>
+```
 **Contact**
 <br>
-If you have any questions, please feel free to open an issue or directly reach us out at <b>tok@cs.
+If you have any questions, please feel free to open an issue or directly reach us out at <b>tok@cs.technion.ac.il
+</b>.
 """
 
 
+
+
 model_num_of_layers = {
 'Stable Diffusion 1.4': 12,
 'Stable Diffusion 2.1': 22,
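
The abstract added in this commit describes generating images from intermediate text-encoder layers, and the `model_num_of_layers` mapping at the end of the hunk records how many layers each model's encoder has. As a minimal sketch of how such a mapping might be used to choose which layers to visualize, the snippet below picks evenly spaced layer indices and always includes the final one. The helper `layers_to_visualize` and its `step` parameter are hypothetical and do not come from `app.py`; the sketch also assumes that index 0 refers to the embedding output and that the dict holds only the two entries visible in the diff.

```python
# Layer counts copied from the two entries visible in the diff.
# The real dict in app.py may contain more models.
model_num_of_layers = {
    'Stable Diffusion 1.4': 12,
    'Stable Diffusion 2.1': 22,
}


def layers_to_visualize(model_name, step=4):
    """Hypothetical helper: evenly spaced layer indices, always ending at the last layer.

    Assumes index 0 is the embedding output and index num_layers is the
    final encoder layer.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    num_layers = model_num_of_layers[model_name]
    layers = list(range(0, num_layers, step))
    if layers[-1] != num_layers:
        layers.append(num_layers)
    return layers


if __name__ == "__main__":
    for name in model_num_of_layers:
        print(name, layers_to_visualize(name))
```

Including the last layer unconditionally means the final, full-depth image is always available as a baseline for the earlier layers.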