salamandra-2b-instruct / quanization_results.md
### Full Perplexity Comparison Table for Release Documentation
| Quantization Type | PPL(Q) | ln(PPL(Q)/PPL(bf16)) | File Size (G) |
|-------------------|---------|---------------------|---------------|
| IQ2_S | 25.3893 | 0.501266 | 1.6 |
| IQ2_M | 21.6684 | 0.342794 | 1.6 |
| Q3_K_M | 16.8567 | 0.091687 | 1.8 |
| IQ3_M | 16.774 | 0.086769 | 1.7 |
| Q3_K_L | 16.5067 | 0.070705 | 1.8 |
| IQ4_NL | 15.9602 | 0.037037 | 1.9 |
| IQ4_XS | 15.9591 | 0.036968 | 1.8 |
| Q4_K_S | 15.9346 | 0.035431 | 1.9 |
| Q4_K_M | 15.8651 | 0.031060 | 2.0 |
| Q5_K_S | 15.4901 | 0.007140 | 2.1 |
| Q5_K_M | 15.4746 | 0.006139 | 2.2 |
| Q6_K | 15.3961 | 0.001053 | 2.4 |
| Q8_0 | 15.3831 | 0.000208 | 2.7 |
| bf16 | 15.3799 | 0.000000 | 4.2 |
---
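This table lists every quantization type tested, with its **perplexity (PPL)**, the **natural log of the ratio of that perplexity to the unquantized baseline** (the bf16 row, where the value is 0), and its **file size**. Lower values in the log-ratio column mean less quality loss relative to the baseline.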
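
The log-ratio column can be recomputed from the perplexity column alone. The following is a minimal sketch, assuming the bf16 row is the baseline. The helper names `log_ppl_ratio` and `percent_increase` are illustrative and not part of any tool used to produce these results. Only three rows are copied in to keep it short.

```python
import math

# Perplexity of the unquantized bf16 model, taken from the table above.
BASELINE_PPL = 15.3799

# A few rows copied from the table: quantization type -> PPL(Q).
ppl = {
    "IQ2_S": 25.3893,
    "Q4_K_M": 15.8651,
    "Q8_0": 15.3831,
}


def log_ppl_ratio(ppl_q, ppl_base=BASELINE_PPL):
    """Natural log of the quantized-to-baseline perplexity ratio."""
    return math.log(ppl_q / ppl_base)


def percent_increase(ppl_q, ppl_base=BASELINE_PPL):
    """Relative perplexity increase over the baseline, in percent."""
    return (ppl_q / ppl_base - 1.0) * 100.0


for name, value in ppl.items():
    print(f"{name}: ln ratio = {log_ppl_ratio(value):.6f}, "
          f"increase = {percent_increase(value):.2f}%")
```

The recomputed log ratios should agree with the table to within rounding. The percentage form is sometimes easier to read when comparing a quantization's quality loss against its file size.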