	---
	license: other
	license_name: sacla
	license_link: >-
	https://huggingface.co/stabilityai/stable-diffusion-3.5-large/blob/main/LICENSE.md
	base_model:
	- stabilityai/stable-diffusion-3.5-large
	base_model_relation: quantized
	---
	## Overview
	These models are made to work with [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) release [master-ac54e00](https://github.com/leejet/stable-diffusion.cpp/releases/tag/master-ac54e00) onwards. Support for other inference backends is not guaranteed.

	Quantized using [this PR](https://github.com/leejet/stable-diffusion.cpp/pull/447).

	Regular K-quants do not work well with SD3.5-Large models: around 90% of the weights are in tensors whose shape doesn't match the 256-weight superblock size of K-quants, so those tensors can't be quantized this way. Mixing quantization types lets us take advantage of the better fidelity of K-quants to some extent while keeping the model file size relatively small.
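
To make the superblock constraint concrete, here is a minimal sketch of how a per-tensor type choice could look. It is not the code from the PR above: the function name, the fallback order, and the example tensor names are hypothetical, and the row length 2432 is an assumption about typical SD3.5-Large layer widths.

```python
# Illustrative sketch only; NOT the actual stable-diffusion.cpp selection logic.
K_QUANT_SUPERBLOCK = 256  # weights per K-quant superblock
LEGACY_BLOCK = 32         # weights per block for legacy types (q4_0, q4_1, q5_0, ...)


def pick_quant_type(row_length: int, k_type: str = "q4_k", fallback: str = "q4_0") -> str:
    """Return a quantization type that fits a tensor with the given row length."""
    if row_length % K_QUANT_SUPERBLOCK == 0:
        return k_type      # rows split cleanly into 256-weight superblocks
    if row_length % LEGACY_BLOCK == 0:
        return fallback    # rows only split into 32-weight blocks
    return "f16"           # neither block size fits; leave the tensor unquantized


# Hypothetical tensor names and assumed row lengths, for illustration.
examples = [("example.attn.weight", 2432), ("example.mlp.weight", 9728)]
for name, row_length in examples:
    print(f"{name}: row length {row_length} -> {pick_quant_type(row_length)}")
```

With these assumed shapes, a row length of 2432 is a multiple of 32 but not of 256, so it falls back to the legacy type, while 9728 is a multiple of 256 and can use the K-quant.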

	## Files:

	### Mixed Types:


	- [sd3.5_large-q2_k_4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-GGUF-mixed-sdcpp/blob/main/sd3.5_large-q2_k_4_0.gguf): Smallest quantization yet. Use this if you can't afford anything bigger
	- [sd3.5_large-q3_k_4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-GGUF-mixed-sdcpp/blob/main/sd3.5_large-q3_k_4_0.gguf)
	- [sd3.5_large-q4_k_4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-GGUF-mixed-sdcpp/blob/main/sd3.5_large-q4_k_4_0.gguf): Exactly the same size as q4_0, but with slightly less degradation. Recommended
	- [sd3.5_large_turbo-q4_k_4_1.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/sd3.5_large_turbo-q4_k_4_1.gguf): Smaller than q4_1, and with comparable degradation. Recommended
	- [sd3.5_large_turbo-q4_k_5_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/sd3.5_large_turbo-q4_k_5_0.gguf): Smaller than q5_0, and with comparable degradation. Very close to the original f16 already. Recommended

	### Legacy types:

	- [sd3.5_large_turbo-q4_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q4_0.gguf): Same size as q4_k_4_0. Not recommended (use q4_k_4_0 instead)
	- [sd3.5_large_turbo-q4_1.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q4_1.gguf): Not recommended (q4_k_4_1 is better and smaller)
	- [sd3.5_large_turbo-q5_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q5_0.gguf): Barely better than q4_k_5_0, and bigger
	- [sd3.5_large_turbo-q5_1.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q5_1.gguf): Better and bigger than q5_0
	- [sd3.5_large_turbo-q8_0.gguf](https://huggingface.co/stduhpf/SD3.5-Large-Turbo-GGUF-mixed-sdcpp/blob/main/legacy/sd3.5_large_turbo-q8_0.gguf): Basically indistinguishable from the original f16, but much smaller. Recommended for best quality
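
The size relationships in the lists above follow from the bits stored per weight by each ggml block layout. The sketch below computes them from the block sizes. The 10% K-quant share is an assumption taken from the "around 90%" figure in the overview, and the estimate ignores tensors that stay unquantized, so treat the mixed numbers as rough.

```python
# (weights per block, bytes per block) for the ggml block layouts named above.
BLOCK_LAYOUT = {
    "q4_0": (32, 18),
    "q4_1": (32, 20),
    "q5_0": (32, 22),
    "q5_1": (32, 24),
    "q8_0": (32, 34),
    "q2_k": (256, 84),
    "q3_k": (256, 110),
    "q4_k": (256, 144),
}


def bits_per_weight(qtype: str) -> float:
    """Average storage cost of one weight, including per-block scales."""
    weights, nbytes = BLOCK_LAYOUT[qtype]
    return nbytes * 8 / weights


def mixed_bits_per_weight(k_type: str, fallback: str, k_fraction: float = 0.10) -> float:
    """Rough average for a mix; k_fraction is the ASSUMED share of K-quantized weights."""
    return k_fraction * bits_per_weight(k_type) + (1.0 - k_fraction) * bits_per_weight(fallback)


for k_type, fallback in [("q2_k", "q4_0"), ("q3_k", "q4_0"), ("q4_k", "q4_0"),
                         ("q4_k", "q4_1"), ("q4_k", "q5_0")]:
    mixed = mixed_bits_per_weight(k_type, fallback)
    print(f"{k_type} + {fallback}: ~{mixed:.3f} bpw (pure {fallback}: {bits_per_weight(fallback):.3f} bpw)")
```

Both q4_k and q4_0 cost 4.5 bits per weight, which is why q4_k_4_0 matches q4_0 in size no matter how the tensors are split, while mixing q4_k into q4_1 or q5_0 can only shrink the file.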

	## Outputs:

	Sorted by model size (note that q4_0 and q4_k_4_0 are exactly the same size).

	| Quantization | Robot girl | Text | Cute kitten |
	| ------------------ | -------------------------------- | ---------------------------------- | ---------------------------------- |
	| q2_k_4_0 | ![q2_k_4_0](Images/q2_k_4_0.png) | ![q2_k_4_0](Images/1_q2_k_4_0.png) | ![q2_k_4_0](Images/2_q2_k_4_0.png) |
	| q3_k_4_0 | ![q3_k_4_0](Images/q3_k_4_0.png) | ![q3_k_4_0](Images/1_q3_k_4_0.png) | ![q3_k_4_0](Images/2_q3_k_4_0.png) |
	| q4_0 | ![q4_0](Images/q4_0.png) | ![q4_0](Images/1_q4_0.png) | ![q4_0](Images/2_q4_0.png) |
	| q4_k_4_0 | ![q4_k_4_0](Images/q4_k_4_0.png) | ![q4_k_4_0](Images/1_q4_k_4_0.png) | ![q4_k_4_0](Images/2_q4_k_4_0.png) |
	| q4_k_4_1 | ![q4_k_4_1](Images/q4_k_4_1.png) | ![q4_k_4_1](Images/1_q4_k_4_1.png) | ![q4_k_4_1](Images/2_q4_k_4_1.png) |
	| q4_1 | ![q4_1](Images/q4_1.png) | ![q4_1](Images/1_q4_1.png) | ![q4_1](Images/2_q4_1.png) |
	| q5_0 | ![q5_0](Images/q5_0.png) | ![q5_0](Images/1_q5_0.png) | ![q5_0](Images/2_q5_0.png) |

	All images were generated with only 28 steps and a CFG scale of 4.5.

	Generated with a modified version of sdcpp with [this PR](https://github.com/leejet/stable-diffusion.cpp/pull/397) applied to enable support for CLIP timestep embeddings.

	Text encoders used: q4_k quant of t5xxl, full precision clip_g, and q8 quant of [ViT-L-14-TEXT-detail-improved-hiT-GmP-TE-only-HF](https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14) in place of clip_l.
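
As a rough guide to reproducing this setup, the sketch below assembles an sdcpp command line with the step count and CFG scale given above. Every file path is a hypothetical placeholder, the prompt is not one of the prompts used here, and the flag names are assumptions based on the stable-diffusion.cpp CLI at the time of writing (some builds may expect `--diffusion-model` instead of `-m`), so check `sd --help` for your build.

```python
import shlex


def build_sd_command(binary: str, model: str, clip_l: str, clip_g: str, t5xxl: str,
                     prompt: str, steps: int = 28, cfg_scale: float = 4.5,
                     output: str = "output.png") -> list[str]:
    """Assemble an argument list for the sdcpp CLI (flag names are assumptions)."""
    return [
        binary,
        "-m", model,          # quantized SD3.5-Large GGUF file
        "--clip_l", clip_l,   # CLIP-L text encoder (or a drop-in replacement)
        "--clip_g", clip_g,   # CLIP-G text encoder
        "--t5xxl", t5xxl,     # T5-XXL text encoder
        "-p", prompt,
        "--steps", str(steps),
        "--cfg-scale", str(cfg_scale),
        "-o", output,
    ]


# Hypothetical placeholder paths; substitute the files you actually downloaded.
cmd = build_sd_command(
    binary="./sd",
    model="sd3.5_large-q4_k_4_0.gguf",
    clip_l="clip_l.safetensors",
    clip_g="clip_g.safetensors",
    t5xxl="t5xxl_q4_k.gguf",
    prompt="a cute kitten",
)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with prompts that contain spaces or quotes.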

	Full prompts and settings are stored in the PNG metadata.