LyCORIS-experiments / README.md
alea31415's picture
metadata
license: creativeml-openrail-m

Trigger words

Anisphia, Euphyllia, Tilty, OyamaMahiro, OyamaMihari
by onono imoko, by momoko, by mochizuki kei, by kantoku, by ke-ta
aniscreen, fanart

For 0324_all_aniscreen_tags, I accidentally tagged all the character images with aniscreen.
For 0325_aniscreen_fanart_styles, things are done correctly (anime screenshots tagged as aniscreen, fanart tagged as fanart).

Settings

Default settings are

  • loha net dim 8, conv dim 4, alpha 1
  • lr 2e-4, constant scheduler throughout
  • Adam8bit
  • resolution 512
  • clip skip 1
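As a rough illustration of how one run differs from this baseline, the defaults can be written as a plain mapping and a variant derived by overriding a single entry. This is only a sketch: the key names and the `variant` helper are made up for the example and are not the option names of any particular trainer or of the configuration files in this repository.

```python
# Default settings from the list above. The key names are illustrative only
# and do not correspond to any specific trainer's option names.
DEFAULTS = {
    "algo": "loha",
    "net_dim": 8,
    "conv_dim": 4,
    "alpha": 1,
    "lr": 2e-4,
    "scheduler": "constant",
    "optimizer": "Adam8bit",
    "resolution": 512,
    "clip_skip": 1,
}


def variant(**overrides):
    """Return a copy of DEFAULTS with the given settings changed."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


# A hypothetical run that only changes clip skip.
clip_skip_2 = variant(clip_skip=2)
print(clip_skip_2)
```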

File names indicate how each run's settings differ from this default setup. The configuration json files can otherwise be found in the config subdirectory of each folder. However, some experiments concern the effect of tags: for these I regenerate the txt files, so the difference cannot be seen from the configuration file. For now this concerns only 05tag, for which tags are only used with probability 0.5.
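One possible reading of "tags used with probability 0.5" is that each tag in a caption is kept independently with probability 0.5 when the txt file is regenerated. The sketch below follows that reading. Whether the dropout was applied per tag, and whether trigger words were exempt, is an assumption, as are the function name `drop_tags`, the `protected` parameter, and the example tags other than the trigger words.

```python
import random


def drop_tags(caption, keep_prob=0.5, protected=(), rng=None):
    """Keep each comma-separated tag with probability keep_prob.

    Tags listed in `protected` (for example trigger words) are always kept.
    This is an assumed interpretation, not the author's actual script.
    """
    rng = rng or random.Random()
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    kept = [t for t in tags if t in protected or rng.random() < keep_prob]
    return ", ".join(kept)


# Hypothetical caption: only "Anisphia" and "aniscreen" come from this README.
example = "Anisphia, aniscreen, 1girl, smile, outdoors"
print(drop_tags(example, protected={"Anisphia", "aniscreen"}, rng=random.Random(0)))
```

Passing a seeded `random.Random` makes the regenerated captions reproducible, which helps when comparing runs.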

Some observations

For a thorough comparison, please refer to the generated_samples folder.

Captioning

The dataset is, in general, the most important factor of all. The common wisdom that we should prune any tag we want to be attached to the trigger word is exactly the way to go. Having no tags at all is terrible, especially for style training. Keeping all the tags removes the traits from subjects if these tags are not used during sampling (not completely true, but more or less the case).

00066-20230326090858
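The pruning idea can be sketched as a small caption filter: tags describing traits that the trigger word should absorb are removed, and everything else is kept. The function name `prune_caption` and the trait tags below are hypothetical and chosen only for illustration; they are not the actual tag lists used in these experiments.

```python
def prune_caption(caption, absorbed):
    """Drop tags that the trigger word should absorb; keep everything else."""
    tags = [t.strip() for t in caption.split(",") if t.strip()]
    return ", ".join(t for t in tags if t not in absorbed)


# Hypothetical trait tags for one character, not the author's actual list.
absorbed = {"blonde hair", "green eyes"}
caption = "Anisphia, aniscreen, blonde hair, green eyes, smile, outdoors"

pruned = prune_caption(caption, absorbed)
print(pruned)
```

With this kind of pruning, the removed traits have no tag of their own to attach to, so they end up associated with the trigger word instead.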

Others

  1. I barely see any difference between training at clip skip 1 and 2.
  2. Setting the text encoder learning rate to half that of the unet makes training two times slower, and I cannot see how it helps.
  3. The differences between lora, locon, and loha are very subtle.
  4. Training at a higher resolution helps generate more complex backgrounds etc., but it is very time-consuming and most of the time it isn't worth it (simpler to just switch base model) unless this is exactly the goal of the lora you're training.

Datasets

Here is the composition of the datasets:

17_characters~fanart~OyamaMihari: 53
19_characters~fanart~OyamaMahiro+OyamaMihari: 47
1_artists~kantoku: 2190
24_characters~fanart~Anisphia: 37
28_characters~screenshots~Anisphia+Tilty: 24
2_artists~ke-ta: 738
2_artists~momoko: 762
2_characters~screenshots~Euphyllia: 235
3_characters~fanart~OyamaMahiro: 299
3_characters~screenshots~Anisphia: 217
3_characters~screenshots~OyamaMahiro: 210
3_characters~screenshots~OyamaMahiro+OyamaMihari: 199
3_characters~screenshots~OyamaMihari: 177
4_characters~screenshots~Anisphia+Euphyllia: 165
57_characters~fanart~Euphyllia: 16
5_artists~mochizuki_kei: 426
5_artists~onono_imoko: 373
7_characters~screenshots~Tilty: 95
9_characters~fanart~Anisphia+Euphyllia: 97
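To see how the folders balance against each other, the listing can be turned into effective samples per epoch. The sketch below assumes that the leading number in each folder name is a per-folder repeat count (as in the common `repeats_name` folder convention) and that the number after the colon is the image count. Neither is stated explicitly above, so treat the result as an estimate under those assumptions.

```python
LISTING = """\
17_characters~fanart~OyamaMihari: 53
19_characters~fanart~OyamaMahiro+OyamaMihari: 47
1_artists~kantoku: 2190
24_characters~fanart~Anisphia: 37
28_characters~screenshots~Anisphia+Tilty: 24
2_artists~ke-ta: 738
2_artists~momoko: 762
2_characters~screenshots~Euphyllia: 235
3_characters~fanart~OyamaMahiro: 299
3_characters~screenshots~Anisphia: 217
3_characters~screenshots~OyamaMahiro: 210
3_characters~screenshots~OyamaMahiro+OyamaMihari: 199
3_characters~screenshots~OyamaMihari: 177
4_characters~screenshots~Anisphia+Euphyllia: 165
57_characters~fanart~Euphyllia: 16
5_artists~mochizuki_kei: 426
5_artists~onono_imoko: 373
7_characters~screenshots~Tilty: 95
9_characters~fanart~Anisphia+Euphyllia: 97
"""


def weighted_counts(listing):
    """Map each folder (without its prefix) to repeats * image count."""
    result = {}
    for line in listing.strip().splitlines():
        folder, count = line.rsplit(":", 1)
        # Assumption: the number before the first underscore is the repeat count.
        repeats, name = folder.split("_", 1)
        result[name] = int(repeats) * int(count)
    return result


weighted = weighted_counts(LISTING)
for name, samples in sorted(weighted.items(), key=lambda kv: kv[1]):
    print(f"{samples:6d}  {name}")
```

Under these assumptions, small folders with large prefixes and large folders with small prefixes land at comparable weighted sizes, which is consistent with the prefixes being used to balance the concepts.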