|
--- |
|
license: other |
|
license_name: sacla |
|
license_link: >- |
|
https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md |
|
base_model: |
|
- stabilityai/stable-diffusion-3.5-large |
|
base_model_relation: quantized |
|
--- |
|
## Overview |
|
These models are made to work with [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) release [master-ac54e00](https://github.com/leejet/stable-diffusion.cpp/releases/tag/master-ac54e00) onwards. Support for other inference backends is not guarenteed. |
|
|
|
Quantized using this PR https://github.com/leejet/stable-diffusion.cpp/pull/447 |
|
|
|
Normal K-quants are not working properly with SD3.5-Large models because around 90% of the weights are in tensors whose shape doesn't match the 256 superblock size of K-quants and therefore can't be quantized this way. |
|
Mixing quantization types allows us to take adventage of the better fidelity of k-quants to some extent while keeping the model file size relatively small. |
|
|
|
Only the second layers of both MLPs in each MMDiT block of SD3.5 Large models have the correct shape to be compatible with k-quants. That still makes up for about 10% of all the parameters. |
|
|
|
## Files: |
|
|
|
### Mixed Types: |
|
|
|
|
|
- [sd3.5_large-q2_k_4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-GGUF-mixed-sdcpp/blob/main/sd3.5_large-q2_k_4_0.gguf): Smallest quantization yet. Use this if you can't afford anything bigger |
|
- [sd3.5_large-q3_k_4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-GGUF-mixed-sdcpp/blob/main/sd3.5_large-q3_k_4_0.gguf) |
|
- [sd3.5_large-q4_k_4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-GGUF-mixed-sdcpp/blob/main/sd3.5_large-q4_k_4_0.gguf): Exacty same size as q4_0, but with slightly less degradation. Recommended |
|
- [sd3.5_large_turbo-q4_k_4_1.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/sd3.5_large_turbo-q4_k_4_1.gguf): Smaller than q4_1, and with comparable degradation. Recommended |
|
- [sd3.5_large_turbo-q4_k_5_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/sd3.5_large_turbo-q4_k_5_0.gguf): Smaller than q5_0, and with comparable degradation. Very close to the original f16 already. Recommended |
|
|
|
### Legacy types: |
|
|
|
- [sd3.5_large_turbo-q4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q4_0.gguf): Same size as q4_k_4_0, Not recommended (use q4_k_4_0 instead) |
|
- [sd3.5_large_turbo-q4_1.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q4_1.gguf): Not recommended (q4_k_4_1 is better and smaller) |
|
- [sd3.5_large_turbo-q5_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q5_0.gguf): Barely better and bigger than q4_k_5_0 |
|
- [sd3.5_large_turbo-q5_1.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q5_1.gguf): Better and bigger than q5_0 |
|
- [sd3.5_large_turbo-q8_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q8_0.gguf): Basically indistinguishable from the original f16, but much smaller. Recommended for best quality |
|
|
|
## Outputs: |
|
|
|
Sorted by model size (Note that q4_0 and q4_k_4_0 are the exact same size) |
|
|
|
| Quantization | Robot girl | Text | Cute kitten | |
|
| ------------------ | -------------------------------- | ---------------------------------- | ---------------------------------- | |
|
| q2_k_4_0 | ![q2_k_4_0](Images/q2_k_4_0.png) | ![q2_k_4_0](Images/1_q2_k_4_0.png) | ![q2_k_4_0](Images/2_q2_k_4_0.png) | |
|
| q3_k_4_0 | ![q3_k_4_0](Images/q3_k_4_0.png) | ![q3_k_4_0](Images/1_q3_k_4_0.png) | ![q3_k_4_0](Images/2_q3_k_4_0.png) | |
|
| q4_0 | ![q4_0](Images/q4_0.png) | ![q4_0](Images/1_q4_0.png) | ![q4_0](Images/2_q4_0.png) | |
|
| q4_k_4_0 | ![q4_k_4_0](Images/q4_k_4_0.png) | ![q4_k_4_0](Images/1_q4_k_4_0.png) | ![q4_k_4_0](Images/2_q4_k_4_0.png) | |
|
| q4_k_4_1 | ![q4_k_4_1](Images/q4_k_4_1.png) | ![q4_k_4_1](Images/1_q4_k_4_1.png) | ![q4_k_4_1](Images/2_q4_k_4_1.png) | |
|
| q4_1 | ![q4_1](Images/q4_1.png) | ![q4_1](Images/1_q4_1.png) | ![q4_1](Images/2_q4_1.png) | |
|
| q4_k_5_0 | ![q4_k_5_0](Images/q4_k_5_0.png) | ![q4_k_5_0](Images/1_q4_k_5_0.png) | ![q4_k_5_0](Images/2_q4_k_5_0.png) | |
|
| q5_0 | ![q5_0](Images/q5_0.png) | ![q5_0](Images/1_q5_0.png) | ![q5_0](Images/2_q5_0.png) | |
|
|
|
only 28 steps, cfg scale 4.5 |
|
|
|
Generated with a modified version of sdcpp with [this PR](https://github.com/leejet/stable-diffusion.cpp/pull/397) applied to enable clip timestep embeddings support. |
|
|
|
Text encoders used: q4_k quant of t5xxl, full precision clip_g, and q8 quant of [ViT-L-14-TEXT-detail-improved-hiT-GmP-TE-only-HF](https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14) in place of clip_l. |
|
|
|
Full prompts and settings in png metadata. |
|
|