salamandra-2b / quantization_results.md
### Full Perplexity Comparison Table for Release Documentation
| **Quantization Type** | **PPL** | **ln(PPL(Q)/PPL(bf16))** | **File Size** |
|-----------------------|-----------|--------------------------|---------------|
| **bf16** | 14.0431 | 0.0 | 4.2G |
| **IQ2_XS** | 28.9052 | 0.72189 | 1.5G |
| **IQ3_M** | 15.1995 | 0.079131 | 1.7G |
| **IQ3_S** | 15.8627 | 0.121839 | 1.7G |
| **IQ3_XS** | 16.7197 | 0.174456 | 1.7G |
| **IQ3_XXS** | 17.6216 | 0.226994 | 1.7G |
| **IQ4_NL** | 14.5534 | 0.035693 | 1.9G |
| **IQ4_XS** | 14.5638 | 0.036408 | 1.8G |
| **Q3_K_L** | 15.0444 | 0.068875 | 1.8G |
| **Q3_K_M** | 15.2582 | 0.082986 | 1.8G |
| **Q3_K_S** | 15.839 | 0.120344 | 1.7G |
| **Q4_K_M** | 14.399 | 0.025028 | 2.0G |
| **Q4_K_S** | 14.4338 | 0.027442 | 1.9G |
| **Q5_K_M** | 14.1299 | 0.006162 | 2.2G |
| **Q5_K_S** | 14.1497 | 0.007562 | 2.1G |
| **Q6_K** | 14.0675 | 0.001736 | 2.4G |
| **Q8_0** | 14.0495 | 0.000456 | 2.7G |
---
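This table lists every quantization type tested, with its **perplexity (PPL)**, the **natural log of the ratio between its perplexity and that of the unquantized bf16 baseline** (first row), and its **file size**. A log ratio near 0 means the quantized model's perplexity is almost unchanged from the baseline; larger values mean more quality loss.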
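As a quick sanity check on the third column, the sketch below recomputes the log ratio from the PPL values. It assumes the bf16 row is the reference and copies a few PPL figures from the table; the names `BASELINE_PPL` and `log_ppl_ratio` are illustrative and not part of any tool used to produce these results.

```python
import math

# Assumption: the bf16 row of the table is the reference perplexity.
BASELINE_PPL = 14.0431

# A few PPL values copied from the table above.
ppl_by_quant = {
    "IQ2_XS": 28.9052,
    "Q4_K_M": 14.399,
    "Q8_0": 14.0495,
}


def log_ppl_ratio(ppl_q: float, ppl_base: float = BASELINE_PPL) -> float:
    """Return ln(PPL(Q) / PPL(baseline)); 0.0 means no change in perplexity."""
    return math.log(ppl_q / ppl_base)


for name, ppl in ppl_by_quant.items():
    print(f"{name}: {log_ppl_ratio(ppl):.6f}")
```

The printed values should agree with the table's third column up to rounding, which supports reading that column as relative to the bf16 row.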