Llamacpp quants
- .gitattributes +2 -0
- README.md +4 -1
- gemma-2-9b-it-IQ2_M.gguf +2 -2
- gemma-2-9b-it-IQ2_S.gguf +2 -2
- gemma-2-9b-it-IQ2_XS.gguf +2 -2
- gemma-2-9b-it-IQ3_M.gguf +2 -2
- gemma-2-9b-it-IQ3_XS.gguf +2 -2
- gemma-2-9b-it-IQ3_XXS.gguf +2 -2
- gemma-2-9b-it-IQ4_XS.gguf +2 -2
- gemma-2-9b-it-Q2_K.gguf +2 -2
- gemma-2-9b-it-Q2_K_L.gguf +3 -0
- gemma-2-9b-it-Q3_K_L.gguf +2 -2
- gemma-2-9b-it-Q3_K_M.gguf +2 -2
- gemma-2-9b-it-Q3_K_S.gguf +2 -2
- gemma-2-9b-it-Q3_K_XL.gguf +3 -0
- gemma-2-9b-it-Q4_K_L.gguf +2 -2
- gemma-2-9b-it-Q4_K_M.gguf +2 -2
- gemma-2-9b-it-Q4_K_S.gguf +2 -2
- gemma-2-9b-it-Q5_K_L.gguf +2 -2
- gemma-2-9b-it-Q5_K_M.gguf +2 -2
- gemma-2-9b-it-Q5_K_S.gguf +2 -2
- gemma-2-9b-it-Q6_K.gguf +2 -2
- gemma-2-9b-it-Q6_K_L.gguf +2 -2
- gemma-2-9b-it-Q8_0.gguf +2 -2
- gemma-2-9b-it-Q8_0_L.gguf +2 -2
- gemma-2-9b-it-f32.gguf +2 -2
- gemma-2-9b-it.imatrix +1 -1
.gitattributes CHANGED
@@ -56,3 +56,5 @@ gemma-2-9b-it-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
 gemma-2-9b-it-Q8_0_L.gguf filter=lfs diff=lfs merge=lfs -text
 gemma-2-9b-it-f32.gguf filter=lfs diff=lfs merge=lfs -text
 gemma-2-9b-it.imatrix filter=lfs diff=lfs merge=lfs -text
+gemma-2-9b-it-Q2_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+gemma-2-9b-it-Q3_K_XL.gguf filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -27,6 +27,8 @@ All quants made using imatrix option with dataset from [here](https://gist.githu
 <bos><start_of_turn>user
 {prompt}<end_of_turn>
 <start_of_turn>model
+<end_of_turn>
+<start_of_turn>model
 
 ```
 
@@ -40,13 +42,14 @@ Note that this model does not support a System prompt.
 | [gemma-2-9b-it-Q8_0.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q8_0.gguf) | Q8_0 | 9.82GB | Extremely high quality, generally unneeded but max available quant. |
 | [gemma-2-9b-it-Q6_K_L.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q6_K_L.gguf) | Q6_K_L | 8.67GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. Very high quality, near perfect, *recommended*. |
 | [gemma-2-9b-it-Q6_K.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q6_K.gguf) | Q6_K | 7.58GB | Very high quality, near perfect, *recommended*. |
-| [gemma-2-9b-it-Q5_K_L.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q5_K_L.gguf) | Q5_K_L | 7.
+| [gemma-2-9b-it-Q5_K_L.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q5_K_L.gguf) | Q5_K_L | 7.72GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. High quality, *recommended*. |
 | [gemma-2-9b-it-Q5_K_M.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q5_K_M.gguf) | Q5_K_M | 6.64GB | High quality, *recommended*. |
 | [gemma-2-9b-it-Q5_K_S.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q5_K_S.gguf) | Q5_K_S | 6.48GB | High quality, *recommended*. |
 | [gemma-2-9b-it-Q4_K_L.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q4_K_L.gguf) | Q4_K_L | 6.84GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. Good quality, uses about 4.83 bits per weight, *recommended*. |
 | [gemma-2-9b-it-Q4_K_M.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q4_K_M.gguf) | Q4_K_M | 5.76GB | Good quality, uses about 4.83 bits per weight, *recommended*. |
 | [gemma-2-9b-it-Q4_K_S.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q4_K_S.gguf) | Q4_K_S | 5.47GB | Slightly lower quality with more space savings, *recommended*. |
 | [gemma-2-9b-it-IQ4_XS.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-IQ4_XS.gguf) | IQ4_XS | 5.18GB | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
+| [gemma-2-9b-it-Q3_K_XL.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q3_K_XL.gguf) | Q3_K_XL | 6.21GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. Lower quality but usable, good for low RAM availability. |
 | [gemma-2-9b-it-Q3_K_L.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q3_K_L.gguf) | Q3_K_L | 5.13GB | Lower quality but usable, good for low RAM availability. |
 | [gemma-2-9b-it-Q3_K_M.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q3_K_M.gguf) | Q3_K_M | 4.76GB | Even lower quality. |
 | [gemma-2-9b-it-IQ3_M.gguf](https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-IQ3_M.gguf) | IQ3_M | 4.49GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
gemma-2-9b-it-IQ2_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:d15fdae729a79d5be3224dfc91ca1c3e36ca5c1b2b45f58d21a5b6ffd0b4f218
+size 3434669824
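Each `.gguf` entry in this commit is a Git LFS pointer file with the three-line `version` / `oid` / `size` layout shown above. A minimal sketch of parsing one into its fields (the `parse_lfs_pointer` helper is hypothetical; field names follow the LFS pointer layout):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer into version, sha256 oid, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")  # "key value" per line
        fields[key] = value
    return {
        "version": fields["version"],
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),  # bytes of the actual object
    }

# The IQ2_M pointer from this commit:
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d15fdae729a79d5be3224dfc91ca1c3e36ca5c1b2b45f58d21a5b6ffd0b4f218
size 3434669824
"""
info = parse_lfs_pointer(pointer)
print(info["size"])
```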
gemma-2-9b-it-IQ2_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:6a8d6e35d9486aeb911874e0191d236989b219b2aff28624cd79f2fa1a0adada
+size 3211486976
gemma-2-9b-it-IQ2_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:519b7c3c3dec688972028bbaa3d1ceeb57219d2401b40606817806b192234b88
+size 3067381504
gemma-2-9b-it-IQ3_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:52887543e6b86c6c3b3e0809f93903d2c7c480ed75870b12fee2fd0f47c95747
+size 4494616320
gemma-2-9b-it-IQ3_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:ae768bece7b9a5fe4c6050ed56c5d18259ebbac3e469d72d339d7d6eccd570f5
+size 4144989952
gemma-2-9b-it-IQ3_XXS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:01757277da17951371c892938f2f6a9962d95f478dff735a586d4e9fdb3f98f4
+size 3796739840
gemma-2-9b-it-IQ4_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:d245892642033bd773aa58de1c56e42665ece1f7f7c0ec44dcfe3a96b7d9651e
+size 5183031040
gemma-2-9b-it-Q2_K.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:18a16f7a5aeec0b980b4de59b5e1360230ae1c8adfd134d1767c9e7e11d98e6e
+size 3805398784
gemma-2-9b-it-Q2_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:808f8acb636b47d4a4f3e98c1fff54a71eb6fce8a089f81c3fe2fe393fb78617
+size 4887766784
gemma-2-9b-it-Q3_K_L.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:fc77fafad18312c3f3d0d316e2edefd28dcd539cba10cc1fbe7f0dc3d53dae6d
+size 5132453632
gemma-2-9b-it-Q3_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:9ff6aebb809a52bf5560d23c4fda7e96ee9a3a75cf7ab0e20ab3089017020645
+size 4761782016
gemma-2-9b-it-Q3_K_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:1866bcc45b83bebacef1cf9daf09bc94036a2705afbf8eaf9369f9bc6006209e
+size 4337665792
gemma-2-9b-it-Q3_K_XL.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:e3c84c98a66aae4b79d92903f1cab526a158315c26a2851cdb6a6174720300a7
+size 6214821632
gemma-2-9b-it-Q4_K_L.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:2fee521f3b0f96aa358c0da9d5a8ac18d1ee81426059599ab7b15e9f06c3dc49
+size 6843426560
gemma-2-9b-it-Q4_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:5375972196fae34c1a767bbeba93938d86abb39f2f91ea5453efa36ead6569f1
+size 5761058560
gemma-2-9b-it-Q4_K_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:225432bb66c374da3a94ed5d5c1ff0e6b80d742a8a09281a58e2fff8e9efa72e
+size 5478926080
gemma-2-9b-it-Q5_K_L.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:807d53fe895f460ccdb7817557965852a73886df2026c101457de9c7e3a038ad
+size 7729735424
gemma-2-9b-it-Q5_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:78f480cb36e05fedbae67e097840cd71999dde890d57287f4205a331a0d5cefe
+size 6647367424
gemma-2-9b-it-Q5_K_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:975f87d0482f74adedeadddeabb68d87b6202da0e7e237687215c6c69b43b91b
+size 6483592960
gemma-2-9b-it-Q6_K.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:edc2b9f3f811cb78101d618a2db360ca374584fbdb8540afae869a6fffaa6516
+size 7589070592
gemma-2-9b-it-Q6_K_L.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:af575649b23922300468189a26e381951b298bdb18046f7aeb4fcc63bc30a5d6
+size 8671438592
gemma-2-9b-it-Q8_0.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:9f9a9de67bec3d6e8277c1964c278aa419c9ed7533cefe6595a8ee4e9c568d01
+size 9827149568
gemma-2-9b-it-Q8_0_L.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:61707731042388bfcf35eadbddfc0a812df783d41bf541de231ba5cda4775347
+size 10687309568
gemma-2-9b-it-f32.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:fbf05a8f90685a86b8b92c8de0b5777f3346a976f496123f36da8585c3177362
+size 36972881408
gemma-2-9b-it.imatrix CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8a2ec42f9516ace90f9ecb98781eef3db3b63040319ed9192ea3cf8782ebc454
 size 6116901
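After downloading one of the quants above, the `oid sha256:` value from its LFS pointer can be checked against the local file. A minimal sketch using only the standard library; the file path and `sha256_of_file` helper are placeholders, and the commented expected value is the imatrix oid from this commit:

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a (potentially multi-GB) file through sha256 in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# Example usage against the pointer in this commit:
# expected = "8a2ec42f9516ace90f9ecb98781eef3db3b63040319ed9192ea3cf8782ebc454"
# assert sha256_of_file("gemma-2-9b-it.imatrix") == expected
```

Chunked reading keeps memory flat regardless of file size, which matters for the ~10 GB and larger quants listed here.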