Update README.md
README.md
@@ -22,6 +22,8 @@ Original model: https://huggingface.co/google/gemma-2-9b-it

All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)

+Experimental quants are made with `--output-tensor-type f16 --token-embedding-type f16` per [ZeroWw](https://huggingface.co/ZeroWw)'s suggestion, please provide any feedback on quality differences you spot.
+
## What's new

- July 21 2024: Contains latest tokenizer fixes, which addressed a few oddities from the original fix, should be closest to correct performance yet. Also has metadata for SWA and logit softcapping.
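For context, the imatrix and f16-tensor options mentioned in the diff correspond to the llama.cpp quantization workflow. Below is a minimal sketch of that flow, assuming the `llama-imatrix` and `llama-quantize` tools from a recent llama.cpp build; the file names, paths, and the Q4_K_M target are illustrative placeholders, not this repo's exact recipe.

```sh
# Sketch only: binary names and paths depend on your llama.cpp build.

# 1. Generate an importance matrix from the calibration dataset
#    (the text file from the linked gist, saved locally as calibration.txt).
./llama-imatrix -m gemma-2-9b-it-f16.gguf -f calibration.txt -o imatrix.dat

# 2. Quantize using that imatrix; the experimental variants additionally
#    keep the output and token-embedding tensors at f16.
./llama-quantize --imatrix imatrix.dat \
  --output-tensor-type f16 --token-embedding-type f16 \
  gemma-2-9b-it-f16.gguf gemma-2-9b-it-Q4_K_M.gguf Q4_K_M
```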