Kohaku XL Epsilon rev2
Join us: https://discord.gg/tPBsKDyRR5
Rev2 Features
- Resumed from Kohaku XL Epsilon rev1
- 1.54M images (1,536,902), 5 epochs
- Trained on selected artists' artworks and images about selected series/games
- Trained on PVC figure photos, so it can generate the PVC figure style without any additional models
Usage (PLEASE READ THIS SECTION)
Prompt Format
<1girl/1boy/1other/...>, <character>, <series>, <artists>, <general tags>, <quality tags>, <year tags>, <meta tags>, <rating tags>
Special Tags
- Quality tags: masterpiece, best quality, great quality, good quality, normal quality, low quality, worst quality
- Rating tags: safe, sensitive, nsfw, explicit
- Year tags (date tags): newest, recent, mid, early, old
Rating tags
General: safe
Sensitive: sensitive
Questionable: nsfw
Explicit: nsfw, explicit
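
The prompt format and rating mapping above can be sketched as a small helper. This is only an illustrative sketch: `build_prompt` and `RATING_TAGS` are hypothetical names, not part of any released tooling, and the default quality and year tags are simply picked from the lists above.

```python
# Hypothetical helper that assembles a prompt in the documented tag order:
# subject, character, series, artists, general, quality, year, meta, rating.
RATING_TAGS = {
    "general": ["safe"],
    "sensitive": ["sensitive"],
    "questionable": ["nsfw"],
    "explicit": ["nsfw", "explicit"],
}


def build_prompt(subject, character=None, series=None, artists=(), general=(),
                 quality=("masterpiece",), year=("newest",), meta=(),
                 rating="general"):
    """Join the tag groups with ", " in the order given in Prompt Format."""
    parts = [subject]
    if character:
        parts.append(character)
    if series:
        parts.append(series)
    parts.extend(artists)
    parts.extend(general)
    parts.extend(quality)
    parts.extend(year)
    parts.extend(meta)
    parts.extend(RATING_TAGS[rating])  # rating tags always go last
    return ", ".join(parts)


prompt = build_prompt("1girl", general=["solo", "smile"], rating="general")
print(prompt)
```

Passing `rating="explicit"` appends both `nsfw` and `explicit`, matching the mapping above.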
Resolution
This model was trained with aspect ratio bucketing (ARB) at a base resolution of 1024x1024, with a minimum bucket resolution of 256 and a maximum of 4096. This means you can use the standard SDXL resolutions. However, a resolution slightly higher than 1024x1024 is recommended, and applying a hires-fix is also suggested for better results.
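
As a rough illustration of choosing a generation size, the sketch below picks a width and height whose area is close to a base resolution squared for a given aspect ratio. The function name `pick_resolution` and the 64-pixel step are assumptions made for this example; the trainer's actual bucketing logic may differ.

```python
import math


def pick_resolution(aspect_ratio, base=1024, step=64, min_side=256, max_side=4096):
    """Return (width, height) with area near base*base.

    Sides are snapped to multiples of `step` (an assumed value) and clamped
    to the min/max bucket resolutions listed in this document.
    """
    target_area = base * base
    width = math.sqrt(target_area * aspect_ratio)
    height = width / aspect_ratio

    def snap(value):
        snapped = int(round(value / step)) * step
        return max(min_side, min(max_side, snapped))

    return snap(width), snap(height)


print(pick_resolution(1.0))             # square at the base resolution
print(pick_resolution(16 / 9))          # landscape
print(pick_resolution(1.0, base=1152))  # slightly above 1024x1024
```

Raising `base` a little is one way to follow the recommendation to go slightly above 1024x1024.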
Training
- Hardware: Quad RTX 3090s
- Num Train Images: 1,536,902
- Total Epochs: 5
- Total Steps: 15015
- Training Time: 410 hours (wall time)
- Batch Size: 4 (per GPU)
- Grad Accumulation Steps: 32
- Equivalent Batch Size: 512 (4 GPUs x 4 x 32)
- Optimizer: Lion8bit
- Learning Rate: 1e-5 for UNet / 2e-6 for TE
- LR Scheduler: Cosine (with warmup)
- Warmup Steps: 1000
- Weight Decay: 0.1
- Betas: 0.9, 0.95
- Min SNR Gamma: 5
- Noise Offset: 0.0357
- Resolution: 1024x1024
- Min Bucket Resolution: 256
- Max Bucket Resolution: 4096
- Mixed Precision: FP16
- Caption Tag Dropout: 0.2
- Caption Dropout: 0.05
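
The batch size and step figures above can be cross-checked with a little arithmetic, and the learning rate schedule can be sketched the same way. This is a sketch under two assumptions: the listed batch size of 4 is per GPU, and the warmup is linear (the list only says "Cosine (with warmup)"). The function `lr_at` is a hypothetical name, not the trainer's code.

```python
import math

NUM_GPUS = 4            # "Quad RTX 3090s"
BATCH_PER_GPU = 4       # assumption: the listed batch size is per GPU
GRAD_ACCUM = 32
NUM_IMAGES = 1_536_902
EPOCHS = 5
TOTAL_STEPS = 15_015    # as reported above
WARMUP_STEPS = 1_000
PEAK_LR_UNET = 1e-5

# Effective batch size per optimizer step.
effective_batch = NUM_GPUS * BATCH_PER_GPU * GRAD_ACCUM

# Naive step estimate; the reported total is slightly higher, which may come
# from per-bucket batching leaving some batches only partly filled.
estimated_steps = math.ceil(NUM_IMAGES * EPOCHS / effective_batch)


def lr_at(step, peak_lr=PEAK_LR_UNET, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup (assumed) followed by cosine decay to zero."""
    if step < warmup:
        return peak_lr * step / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, estimated_steps)
print(lr_at(500), lr_at(1_000), lr_at(8_000), lr_at(TOTAL_STEPS))
```

For the text encoder, the same schedule shape would apply with `peak_lr=2e-6`.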
License
Fair-AI-public-1.0-sd