mradermacher committed
Commit
ab951ef
1 Parent(s): f0c542d

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -39,7 +39,7 @@ The only fix seems to be to delete the repo, which unfortunately also deletes th
 
 The quant types I currently do regularly are:
 
-- static: (F16) Q8_0 Q4_K_S Q2_K Q6_K Q3_K_M Q3_K_S Q3_K_L Q4_K_M Q5_K_S Q5_K_M IQ4_XS
 - imatrix: Q2_K Q4_K_S IQ3_XXS Q3_K_M Q4_K_M IQ2_M Q6_K IQ4_XS Q3_K_S Q3_K_L Q5_K_S Q5_K_M Q4_0 IQ3_XS IQ3_S IQ3_M IQ2_XXS IQ2_XS IQ2_S IQ1_M IQ1_S (Q4_0_4_4 Q4_0_4_8 Q4_0_8_8)
 
 And they are generally (but not always) generated in the order above, for which there are deep reasons.
@@ -48,18 +48,18 @@ For models less than 11B size, I experimentally generate f16 versions at the mom
 
 For models less than 15B in size, the "arm only" Q4_0_4_4 static and Q4_0_4_4/Q4_0_4_8/Q4_0_8_8 imatrix quants will be generated.
 
-Older models that pre-date the introduction of new quant types will generally have them retrofitted, hopefully
-this year. At least when multiple quant types are missing, as it is hard to justify a big model download
-for just one quant. If you want a quant from the above list and don't want to wait, feel free to request it and I will
-prioritize it to the best of my abilities.
-
 The (static) IQ3 quants are no longer generated, as they consistently seem to result in *much* lower quality
 quants than even static Q2_K, so it would be a disservice to offer them.
 
 I specifically do not do Q2_K_S, because I generally think it is not worth it, and IQ4_NL, because it requires
 a lot of computing and is generally completely superseded by IQ4_XS.
 
-You can always try to change my mind.
 
 ## What does the "-i1" mean in "-i1-GGUF"?
 
 
 
 The quant types I currently do regularly are:
 
+- static: (f16) Q8_0 Q4_K_S Q2_K Q6_K Q3_K_M Q3_K_S Q3_K_L Q4_K_M Q5_K_S Q5_K_M IQ4_XS (Q4_0_4)
 - imatrix: Q2_K Q4_K_S IQ3_XXS Q3_K_M Q4_K_M IQ2_M Q6_K IQ4_XS Q3_K_S Q3_K_L Q5_K_S Q5_K_M Q4_0 IQ3_XS IQ3_S IQ3_M IQ2_XXS IQ2_XS IQ2_S IQ1_M IQ1_S (Q4_0_4_4 Q4_0_4_8 Q4_0_8_8)
 
 And they are generally (but not always) generated in the order above, for which there are deep reasons.
 
 For models less than 15B in size, the "arm only" Q4_0_4_4 static and Q4_0_4_4/Q4_0_4_8/Q4_0_8_8 imatrix quants will be generated.
 
 The (static) IQ3 quants are no longer generated, as they consistently seem to result in *much* lower quality
 quants than even static Q2_K, so it would be a disservice to offer them.
 
 I specifically do not do Q2_K_S, because I generally think it is not worth it, and IQ4_NL, because it requires
 a lot of computing and is generally completely superseded by IQ4_XS.
 
+Q8_0 imatrix quants do not exist - some quanters claim otherwise, but Q8_0 ggufs do not contain any tensor
+type that uses the imatrix data, although technically it might be possible to do so.
+
+Older models that pre-date the introduction of new quant types will generally have them retrofitted on request.
+
+You can always try to change my mind about all this, but be prepared to bring convincing data.
 
 ## What does the "-i1" mean in "-i1-GGUF"?
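
As an aside on the quant lists above: the main practical difference between the types is bits per weight (bpw), and hence download size. A minimal sketch, assuming approximate bpw figures; the Q8_0 and Q4_0 values follow from the llama.cpp block layouts, while the K-quant values are rough estimates, not authoritative numbers:

```python
# Rough download-size estimate for a model at a given quant type.
# bpw figures are approximate assumptions: Q8_0 packs 32 weights as a
# 2-byte scale + 32 int8 bytes (34 * 8 / 32 = 8.5 bpw); Q4_0 packs 32
# weights as a 2-byte scale + 16 bytes of nibbles (18 * 8 / 32 = 4.5 bpw).
# The K-quant figures are rough estimates only.
BPW = {
    "Q8_0": 8.5,
    "Q4_0": 4.5,
    "Q4_K_M": 4.8,  # approximate; K-quants mix several block formats
    "Q2_K": 2.6,    # approximate
}

def estimate_gib(n_params: float, quant: str) -> float:
    """Estimated file size in GiB for n_params weights at the given quant."""
    return n_params * BPW[quant] / 8 / 2**30

print(round(estimate_gib(7e9, "Q4_0"), 2))  # → 3.67
```

So a 7B-parameter model at Q4_0 works out to roughly 3.7 GiB, before metadata and non-quantized tensors, which is why one extra quant type can mean a multi-gigabyte upload.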