Update README.md
README.md
@@ -1,3 +1,38 @@
---
license: mit
---
This model has been trained to generate bike images. The dataset used is BIKED++, available on [GitHub](https://github.com/Lyleregenwetter/BIKED_multimodal/tree/main?utm_source=catalyzex.com).

It is a conditional diffusion model trained both to generate bike images and to infill partially masked bike images from the dataset above.

The baseline architecture was set up with the diffusers `UNet2DModel` class for images:

```python
from diffusers import UNet2DModel

model = UNet2DModel(
    sample_size=128,  # target image resolution (128x128 images)
    in_channels=6,  # 3 for the RGB masked image (for infill; feed all white for unconditional) and 3 for RGB noise
    out_channels=3,  # number of output channels (RGB)
    layers_per_block=2,
    block_out_channels=(128, 256, 512, 768),
    down_block_types=(
        "DownBlock2D",
        "AttnDownBlock2D",
        "AttnDownBlock2D",
        "AttnDownBlock2D",
    ),
    up_block_types=(
        "AttnUpBlock2D",
        "AttnUpBlock2D",
        "AttnUpBlock2D",
        "UpBlock2D",
    ),
)
```
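
The `in_channels=6` setting implies that a conditioning image and the noisy sample are stacked along the channel axis before they enter the network. The following is a minimal NumPy sketch of that layout, not the repository's code: the channel order (conditioning first, noise second), white being encoded as 1.0, painting the masked region white, and the helper name `build_model_input` are all assumptions made for illustration.

```python
import numpy as np

def build_model_input(noisy, image=None, mask=None):
    """Stack a conditioning image and a noisy sample into a 6-channel array.

    noisy: (3, H, W) noisy sample at the current denoising step.
    image: (3, H, W) reference image for infill, or None for unconditional use.
    mask:  (H, W) boolean array, True where pixels are hidden and must be infilled.
    """
    if image is None:
        # Unconditional generation: the conditioning image is all white.
        cond = np.ones_like(noisy)
    else:
        # Infill: hide the masked region by painting it white (assumed convention).
        cond = np.where(mask[None, :, :], 1.0, image)
    # Assumed channel order: conditioning channels first, noise channels second.
    return np.concatenate([cond, noisy], axis=0)

rng = np.random.default_rng(0)
noisy = rng.standard_normal((3, 128, 128))
x_uncond = build_model_input(noisy)
print(x_uncond.shape)  # (6, 128, 128)
```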

The code for training and inference is included as well. The dataset was preprocessed by padding each bike image with white space above and below so that it becomes square; a postprocessing function is provided that crops this white space.
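
As a rough illustration of what such a cropping step can look like, the sketch below drops near-white rows from the top and bottom of an image. It is not the function shipped with the model: the name `crop_vertical_whitespace`, the `(H, W, 3)` layout with values in [0, 1], and the whiteness threshold are assumptions.

```python
import numpy as np

def crop_vertical_whitespace(img, white_threshold=0.98):
    """Remove near-white rows from the top and bottom of an image.

    img: (H, W, 3) float array with values in [0, 1] (assumed layout).
    """
    # A row counts as content if any pixel in it is darker than the threshold.
    has_content = (img < white_threshold).any(axis=(1, 2))
    rows = np.flatnonzero(has_content)
    if rows.size == 0:
        return img  # fully white image: nothing sensible to crop
    return img[rows[0] : rows[-1] + 1]

# Toy square image: white background with a dark band in rows 40..79.
img = np.ones((128, 128, 3))
img[40:80, 10:120] = 0.2
print(crop_vertical_whitespace(img).shape)  # (40, 128, 3)
```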

With only 10 denoising steps, you can get unconditional samples that mimic the dataset well.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64d516ba80d47a6b76fc1015/VDqGqLee0XvqtiYYkzpia.png)
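
Sampling in 10 steps means visiting only a small subset of the timesteps seen during training. As a rough illustration, assuming 1000 training timesteps (a common default that is not stated here) and evenly spaced selection, the visited timesteps could be computed as below. The function name is hypothetical, and the scheduler actually used with this model may space them differently.

```python
def inference_timesteps(num_train_timesteps=1000, num_inference_steps=10):
    """Return evenly spaced timesteps, ordered from noisiest to cleanest."""
    stride = num_train_timesteps // num_inference_steps
    return [i * stride for i in range(num_inference_steps)][::-1]

print(inference_timesteps())
# [900, 800, 700, 600, 500, 400, 300, 200, 100, 0]
```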

A pipeline with guidance is provided as well; you can pass it your own custom guidance function.
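
The interface of the guidance pipeline is not described here, so the following only sketches one common pattern, gradient-based guidance: after each denoising update, the sample is nudged along the negative gradient of a user-supplied loss. The names `guided_step`, `guidance_grad_fn`, and `guidance_scale` are hypothetical, and the toy loss (pull every pixel toward mid-gray) stands in for whatever custom function you would supply.

```python
import numpy as np

def pull_toward_gray(x, target=0.5):
    """Gradient of the toy loss 0.5 * sum((x - target)**2) with respect to x."""
    return x - target

def guided_step(x_denoised, guidance_grad_fn, guidance_scale=1.0):
    """Nudge the result of a denoising update down the guidance gradient."""
    return x_denoised - guidance_scale * guidance_grad_fn(x_denoised)

x = np.zeros((3, 8, 8))
for _ in range(20):
    # In a real sampler, a denoising update would happen here before guidance.
    x = guided_step(x, pull_toward_gray, guidance_scale=0.5)
print(round(float(x.mean()), 3))  # 0.5
```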