### Full Perplexity Comparison Table for Release Documentation

| **Quantization Type** | **PPL**   | **ln(PPL(Q)/PPL(fp16))** | **File Size** |
|-----------------------|-----------|--------------------------|---------------|
| **bf16**              | 14.0431   | 0.0                      | 4.2G          |
| **IQ2_XS**            | 28.9052   | 0.72189                  | 1.5G          |
| **IQ3_M**             | 15.1995   | 0.079131                 | 1.7G          |
| **IQ3_S**             | 15.8627   | 0.121839                 | 1.7G          |
| **IQ3_XS**            | 16.7197   | 0.174456                 | 1.7G          |
| **IQ3_XXS**           | 17.6216   | 0.226994                 | 1.7G          |
| **IQ4_NL**            | 14.5534   | 0.035693                 | 1.9G          |
| **IQ4_XS**            | 14.5638   | 0.036408                 | 1.8G          |
| **Q3_K_L**            | 15.0444   | 0.068875                 | 1.8G          |
| **Q3_K_M**            | 15.2582   | 0.082986                 | 1.8G          |
| **Q3_K_S**            | 15.839    | 0.120344                 | 1.7G          |
| **Q4_K_M**            | 14.399    | 0.025028                 | 2.0G          |
| **Q4_K_S**            | 14.4338   | 0.027442                 | 1.9G          |
| **Q5_K_M**            | 14.1299   | 0.006162                 | 2.2G          |
| **Q5_K_S**            | 14.1497   | 0.007562                 | 2.1G          |
| **Q6_K**              | 14.0675   | 0.001736                 | 2.4G          |
| **Q8_0**              | 14.0495   | 0.000456                 | 2.7G          |

---

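This table lists every quantization type tested, with its **perplexity (PPL)**, **ln(PPL(Q)/PPL(fp16))**, and **file size**. Although the column header says fp16, the log-ratio values are computed against the **bf16** row (PPL 14.0431), so a value closer to 0 means less perplexity degradation relative to that baseline.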
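
As a rough check on the log-ratio column, the sketch below recomputes it from the PPL values in the table. It assumes the bf16 row is the baseline; the function name `log_ppl_ratio` and the dictionary layout are illustrative choices, not part of any tool used to produce the table.

```python
import math

# Baseline perplexity, taken from the bf16 row of the table (assumed reference).
BASELINE_PPL = 14.0431

# A few PPL values copied from the table; extend with the remaining rows as needed.
ppl_by_quant = {
    "IQ2_XS": 28.9052,
    "IQ4_NL": 14.5534,
    "Q4_K_M": 14.399,
    "Q8_0": 14.0495,
}


def log_ppl_ratio(ppl_q: float, ppl_base: float = BASELINE_PPL) -> float:
    """Return ln(PPL(Q) / PPL(base)), the quantity shown in the third column."""
    return math.log(ppl_q / ppl_base)


for name, ppl in ppl_by_quant.items():
    print(f"{name:8s} {log_ppl_ratio(ppl):.6f}")
```

The recomputed values should agree with the table to within rounding of the published PPL figures.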