Llamacpp quants
- .gitattributes +11 -0
- Meta-Llama-3-70B-Instruct-IQ1_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ2_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ2_XS.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ3_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ4_XS.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q2_K.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q2_K_L.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q3_K_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q3_K_S.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q3_K_XL.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q4_K_L.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q4_K_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00002-of-00002.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q5_K_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00001-of-00002.gguf +2 -2
- Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00002-of-00002.gguf +2 -2
- Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00002-of-00002.gguf +3 -0
- Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00001-of-00004.gguf +3 -0
- Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00002-of-00004.gguf +3 -0
- Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00003-of-00004.gguf +3 -0
- Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00004-of-00004.gguf +3 -0
- Meta-Llama-3-70B-Instruct.imatrix +2 -2
- README.md +8 -16
.gitattributes CHANGED
@@ -64,3 +64,14 @@ Meta-Llama-3-70B-Instruct-fp16.gguf/Meta-Llama-3-70B-Instruct-fp16-00003-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
 Meta-Llama-3-70B-Instruct-fp16.gguf/Meta-Llama-3-70B-Instruct-fp16-00004-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
 Meta-Llama-3-70B-Instruct-fp16.gguf/Meta-Llama-3-70B-Instruct-fp16-00005-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
 Meta-Llama-3-70B-Instruct.imatrix filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q2_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q3_K_XL.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q4_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00001-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00002-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00003-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00004-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
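(For context: filter=lfs lines like the ones above are what `git lfs track` appends to `.gitattributes`, so that the new quant files are committed as LFS pointers rather than full blobs. A minimal sketch, assuming you were registering one of the new quants yourself:)

```
# Sketch only: track the new quant with Git LFS; this appends the
# matching filter=lfs line to .gitattributes.
git lfs track "Meta-Llama-3-70B-Instruct-Q2_K_L.gguf"
git add .gitattributes "Meta-Llama-3-70B-Instruct-Q2_K_L.gguf"
git commit -m "Llamacpp quants"
```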
Meta-Llama-3-70B-Instruct-IQ1_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:f071ec18bdf9bb94a3a64de2dc87043de6e643011a730de7b6b203df4223eb77
 size 16751195936
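Each .gguf entry in this commit is a Git LFS pointer file: three lines giving the pointer spec version, the SHA-256 of the actual blob, and its size in bytes. A minimal sketch for checking a completed download against the pointer above (GNU coreutils assumed; the oid and size come from the IQ1_M pointer):

```
# Verify a downloaded quant against its LFS pointer.
sha256sum Meta-Llama-3-70B-Instruct-IQ1_M.gguf
# expect: f071ec18bdf9bb94a3a64de2dc87043de6e643011a730de7b6b203df4223eb77
stat -c %s Meta-Llama-3-70B-Instruct-IQ1_M.gguf   # on macOS: stat -f %z
# expect: 16751195936
```

The remaining pointer diffs below follow the same three-line pattern.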
Meta-Llama-3-70B-Instruct-IQ2_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:725c685b990335fb1d8a1e52e00bb6d2c04a4e64d042ca1a224180e53e5e0d6b
 size 24119293728
Meta-Llama-3-70B-Instruct-IQ2_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8867a771b629b7abd85f68e9fcb51a20f3bf9eb6eade7dfcc9f8d4086ddd20ca
 size 21142107936
Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:dcf8e7fc9a03726dc709dae7dd53a634f2443da8fe363dc4f9a96951d240e6e4
 size 19097384736
Meta-Llama-3-70B-Instruct-IQ3_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:680b9f9b621158dfccdbcc953d52d0604366a609416f7fc72138b3631f24fe5e
 size 31937034016
Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:291c3238200415174755f0b56e816b96a336f1564dba63823314c0bd60198894
 size 27469494048
Meta-Llama-3-70B-Instruct-IQ4_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b7c0821b45eacaed71d08d061712387880ad538273c32da5021463481921a758
 size 37902661408
Meta-Llama-3-70B-Instruct-Q2_K.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:987c41cf85b8ef3a44a4e5fedbaaa0c99f423664a9aacbd62b51c802e0362b6a
 size 26375108384
Meta-Llama-3-70B-Instruct-Q2_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d8eecf8d7cfb658e7aa4d86dafca1465fb06d56c640c69c332399b0ed485cc4
+size 29371168544
Meta-Llama-3-70B-Instruct-Q3_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:fd96573523115fc36f7b69bfbf76e43041d62c944087cba288a1edd5650c598d
 size 34267494176
Meta-Llama-3-70B-Instruct-Q3_K_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:98ccf7da30df84414d38ebeee8c9076f2a7d0a13214e2545e5729208116e0da3
 size 30912050976
Meta-Llama-3-70B-Instruct-Q3_K_XL.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7188d1ea9e9420947633d7548509740ea0acdd2e0f72242cd0c774cdd2d69361
+size 40029943584
Meta-Llama-3-70B-Instruct-Q4_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f0dd2ec40e95182d7e2ffda67bdf1f783c54058bfa4ff03aa264ec5ae89b4b0d
+size 45270202144
Meta-Llama-3-70B-Instruct-Q4_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7b602f72406b1cd28466f84506abe8ff0e67dd867937ff1c393a78de29b8d07e
 size 42520393504
Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:49ab6a7e72037f4d74047bb701b45da44abbc33303e48144397dada03412861e
+size 39993594592
Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00002-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8634f4b38b4330103f02f7a492b24b69937e621e250654b119a0bf2a17194986
+size 12574696736
Meta-Llama-3-70B-Instruct-Q5_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7cb1c17c5b33a8071acf9a715eb4281d9b7ba34e1b8af5a660300d86cbfc8aee
 size 49949816608
Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00001-of-00002.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:1dfcec26081dfc6547498b49266ff2e39880337ff8b85dd5380cbb385137380d
+size 39862698784
Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00002-of-00002.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:8ac2b8d53e03d2926237214eb2cc15f1cbf92c2281d51bed9c756055e3f11a8c
+size 18025444544
Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c91ed0b016b6ce5601eeed5fd8108dd6383d4265dae6f7925a72b242a4630b8f
+size 39808935904
Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00002-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b9af1ecc2494fe11bf440621d8bc267d43bcdc8156f0799ff8cbc3231f59a958
+size 35166113792
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00001-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:07c993fc3c73977f0551af11a5b870237afd59bb66e02edaa51ec0b2f39722e3
+size 39758724160
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00002-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:135ecf44c93ae5d87967b5f526429a8faa11d8a9b9e07b61feb172ab89e3bbdc
+size 39830630656
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00003-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:29f38cec13ef711d326927bf292c68533440acac0d3460f2546a1f2eea69c5e9
+size 39981592928
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00004-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:54bb81ef0db1bb5de7b7e3647ceaf29b7279c43f37beb48c15a8bb25abba8319
+size 21546965280
Meta-Llama-3-70B-Instruct.imatrix CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:b861753c5acd139deda1728cae732f93de5505bb64e37496c2e806865f061c51
+size 24922299
README.md CHANGED
@@ -8,9 +8,7 @@ tags:
 - pytorch
 - llama
 - llama-3
-license:
-license_name: llama3
-license_link: LICENSE
+license: llama3
 extra_gated_prompt: >-
   ### META LLAMA 3 COMMUNITY LICENSE AGREEMENT
 
@@ -209,11 +207,11 @@ quantized_by: bartowski
 
 ## Llamacpp imatrix Quantizations of Meta-Llama-3-70B-Instruct
 
-Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/
+Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3259">b3259</a> for quantization.
 
 Original model: https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct
 
-All quants made using imatrix option with dataset
+All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)
 
 ## Prompt format
 
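(The imatrix referenced above is llama.cpp's importance matrix: it is computed once from a calibration text and then passed to the quantizer to weight which tensors tolerate fewer bits. A minimal sketch of that workflow using the tool names shipped around release b3259; the calibration file path and the Q4_K_M target are illustrative assumptions, not the exact commands behind this commit:)

```
# Sketch only: build an importance matrix from a calibration text,
# then quantize the f16 GGUF with it.
./llama-imatrix -m Meta-Llama-3-70B-Instruct-f16.gguf \
  -f calibration_data.txt -o Meta-Llama-3-70B-Instruct.imatrix
./llama-quantize --imatrix Meta-Llama-3-70B-Instruct.imatrix \
  Meta-Llama-3-70B-Instruct-f16.gguf Meta-Llama-3-70B-Instruct-Q4_K_M.gguf Q4_K_M
```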
@@ -233,26 +231,20 @@ All quants made using imatrix option with dataset provided by Kalomaze [here](ht
 | -------- | ---------- | --------- | ----------- |
 | [Meta-Llama-3-70B-Instruct-Q8_0.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/tree/main/Meta-Llama-3-70B-Instruct-Q8_0.gguf) | Q8_0 | 74.97GB | Extremely high quality, generally unneeded but max available quant. |
 | [Meta-Llama-3-70B-Instruct-Q6_K.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/tree/main/Meta-Llama-3-70B-Instruct-Q6_K.gguf) | Q6_K | 57.88GB | Very high quality, near perfect, *recommended*. |
+| [Meta-Llama-3-70B-Instruct-Q5_K_L.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/tree/main/Meta-Llama-3-70B-Instruct-Q5_K_L.gguf) | Q5_K_L | 52.56GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. High quality, *recommended*. |
 | [Meta-Llama-3-70B-Instruct-Q5_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q5_K_M.gguf) | Q5_K_M | 49.94GB | High quality, *recommended*. |
-| [Meta-Llama-3-70B-Instruct-
+| [Meta-Llama-3-70B-Instruct-Q4_K_L.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q4_K_L.gguf) | Q4_K_L | 45.27GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. Good quality, uses about 4.83 bits per weight, *recommended*. |
 | [Meta-Llama-3-70B-Instruct-Q4_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q4_K_M.gguf) | Q4_K_M | 42.52GB | Good quality, uses about 4.83 bits per weight, *recommended*. |
-| [Meta-Llama-3-70B-Instruct-Q4_K_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q4_K_S.gguf) | Q4_K_S | 40.34GB | Slightly lower quality with more space savings, *recommended*. |
-| [Meta-Llama-3-70B-Instruct-IQ4_NL.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ4_NL.gguf) | IQ4_NL | 40.05GB | Decent quality, slightly smaller than Q4_K_S with similar performance *recommended*. |
 | [Meta-Llama-3-70B-Instruct-IQ4_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ4_XS.gguf) | IQ4_XS | 37.90GB | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
-| [Meta-Llama-3-70B-Instruct-Q3_K_L.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q3_K_L.gguf) | Q3_K_L | 37.14GB | Lower quality but usable, good for low RAM availability. |
 | [Meta-Llama-3-70B-Instruct-Q3_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q3_K_M.gguf) | Q3_K_M | 34.26GB | Even lower quality. |
 | [Meta-Llama-3-70B-Instruct-IQ3_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_M.gguf) | IQ3_M | 31.93GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
-| [Meta-Llama-3-70B-Instruct-IQ3_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_S.gguf) | IQ3_S | 30.91GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
 | [Meta-Llama-3-70B-Instruct-Q3_K_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q3_K_S.gguf) | Q3_K_S | 30.91GB | Low quality, not recommended. |
-| [Meta-Llama-3-70B-Instruct-IQ3_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_XS.gguf) | IQ3_XS | 29.30GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
 | [Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf) | IQ3_XXS | 27.46GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
 | [Meta-Llama-3-70B-Instruct-Q2_K.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q2_K.gguf) | Q2_K | 26.37GB | Very low quality but surprisingly usable. |
 | [Meta-Llama-3-70B-Instruct-IQ2_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_M.gguf) | IQ2_M | 24.11GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
-| [Meta-Llama-3-70B-Instruct-
-| [Meta-Llama-3-70B-Instruct-IQ2_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_XS.gguf) | IQ2_XS | 21.14GB | Very low quality, uses SOTA techniques to be usable. |
+| [Meta-Llama-3-70B-Instruct-IQ2_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_XS.gguf) | IQ2_XS | 21.14GB | Lower quality, uses SOTA techniques to be usable. |
 | [Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf) | IQ2_XXS | 19.09GB | Lower quality, uses SOTA techniques to be usable. |
 | [Meta-Llama-3-70B-Instruct-IQ1_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ1_M.gguf) | IQ1_M | 16.75GB | Extremely low quality, *not* recommended. |
-| [Meta-Llama-3-70B-Instruct-IQ1_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ1_S.gguf) | IQ1_S | 15.34GB | Extremely low quality, *not* recommended. |
 
 ## Downloading using huggingface-cli
 
@@ -265,13 +257,13 @@ pip install -U "huggingface_hub[cli]"
 Then, you can target the specific file you want:
 
 ```
-huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q4_K_M.gguf" --local-dir ./
+huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q4_K_M.gguf" --local-dir ./
 ```
 
 If the model is bigger than 50GB, it will have been split into multiple files. In order to download them all to a local folder, run:
 
 ```
-huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q8_0.gguf/*" --local-dir Meta-Llama-3-70B-Instruct-Q8_0
+huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q8_0.gguf/*" --local-dir Meta-Llama-3-70B-Instruct-Q8_0
 ```
 
 You can either specify a new local-dir (Meta-Llama-3-70B-Instruct-Q8_0) or download them all in place (./)
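(One detail the download note leaves implicit: the shards do not need to be recombined by hand, since llama.cpp can load a split GGUF when pointed at the first shard. A minimal sketch; the prompt and token count are placeholder assumptions:)

```
# Sketch only: run the split Q8_0 quant by pointing llama-cli at the
# first shard; the remaining shard in the same folder is picked up
# automatically.
./llama-cli -m Meta-Llama-3-70B-Instruct-Q8_0/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf \
  -p "Why is the sky blue?" -n 128
```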