Llamacpp quants
- .gitattributes +11 -0
- Meta-Llama-3-70B-Instruct-IQ1_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ2_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ2_XS.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ3_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf +1 -1
- Meta-Llama-3-70B-Instruct-IQ4_XS.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q2_K.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q2_K_L.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q3_K_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q3_K_S.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q3_K_XL.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q4_K_L.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q4_K_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00002-of-00002.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q5_K_M.gguf +1 -1
- Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00001-of-00002.gguf +2 -2
- Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00002-of-00002.gguf +2 -2
- Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf +3 -0
- Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00002-of-00002.gguf +3 -0
- Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00001-of-00004.gguf +3 -0
- Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00002-of-00004.gguf +3 -0
- Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00003-of-00004.gguf +3 -0
- Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00004-of-00004.gguf +3 -0
- Meta-Llama-3-70B-Instruct.imatrix +2 -2
- README.md +8 -16
.gitattributes CHANGED
@@ -64,3 +64,14 @@ Meta-Llama-3-70B-Instruct-fp16.gguf/Meta-Llama-3-70B-Instruct-fp16-00003-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
 Meta-Llama-3-70B-Instruct-fp16.gguf/Meta-Llama-3-70B-Instruct-fp16-00004-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
 Meta-Llama-3-70B-Instruct-fp16.gguf/Meta-Llama-3-70B-Instruct-fp16-00005-of-00005.gguf filter=lfs diff=lfs merge=lfs -text
 Meta-Llama-3-70B-Instruct.imatrix filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q2_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q3_K_XL.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q4_K_L.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00002-of-00002.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00001-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00002-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00003-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
+Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00004-of-00004.gguf filter=lfs diff=lfs merge=lfs -text
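(For context: filter=lfs lines like the ones above are what `git lfs track` appends to `.gitattributes`, so that the new quant files are committed as LFS pointers rather than full blobs. A minimal sketch, assuming you were registering one of the new quants yourself:)

```
# Sketch only: track the new quant with Git LFS; this appends the
# matching filter=lfs line to .gitattributes.
git lfs track "Meta-Llama-3-70B-Instruct-Q2_K_L.gguf"
git add .gitattributes "Meta-Llama-3-70B-Instruct-Q2_K_L.gguf"
git commit -m "Llamacpp quants"
```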
Meta-Llama-3-70B-Instruct-IQ1_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:f071ec18bdf9bb94a3a64de2dc87043de6e643011a730de7b6b203df4223eb77
 size 16751195936
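Each .gguf entry in this commit is a Git LFS pointer file: three lines giving the pointer spec version, the SHA-256 of the actual blob, and its size in bytes. A minimal sketch for checking a completed download against the pointer above (GNU coreutils assumed; the oid and size come from the IQ1_M pointer):

```
# Verify a downloaded quant against its LFS pointer.
sha256sum Meta-Llama-3-70B-Instruct-IQ1_M.gguf
# expect: f071ec18bdf9bb94a3a64de2dc87043de6e643011a730de7b6b203df4223eb77
stat -c %s Meta-Llama-3-70B-Instruct-IQ1_M.gguf   # on macOS: stat -f %z
# expect: 16751195936
```

The remaining pointer diffs below follow the same three-line pattern.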
Meta-Llama-3-70B-Instruct-IQ2_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:725c685b990335fb1d8a1e52e00bb6d2c04a4e64d042ca1a224180e53e5e0d6b
 size 24119293728
Meta-Llama-3-70B-Instruct-IQ2_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8867a771b629b7abd85f68e9fcb51a20f3bf9eb6eade7dfcc9f8d4086ddd20ca
 size 21142107936
Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:dcf8e7fc9a03726dc709dae7dd53a634f2443da8fe363dc4f9a96951d240e6e4
 size 19097384736
Meta-Llama-3-70B-Instruct-IQ3_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:680b9f9b621158dfccdbcc953d52d0604366a609416f7fc72138b3631f24fe5e
 size 31937034016
Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:291c3238200415174755f0b56e816b96a336f1564dba63823314c0bd60198894
 size 27469494048
Meta-Llama-3-70B-Instruct-IQ4_XS.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b7c0821b45eacaed71d08d061712387880ad538273c32da5021463481921a758
 size 37902661408
Meta-Llama-3-70B-Instruct-Q2_K.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:987c41cf85b8ef3a44a4e5fedbaaa0c99f423664a9aacbd62b51c802e0362b6a
 size 26375108384
Meta-Llama-3-70B-Instruct-Q2_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:2d8eecf8d7cfb658e7aa4d86dafca1465fb06d56c640c69c332399b0ed485cc4
+size 29371168544
Meta-Llama-3-70B-Instruct-Q3_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:fd96573523115fc36f7b69bfbf76e43041d62c944087cba288a1edd5650c598d
 size 34267494176
Meta-Llama-3-70B-Instruct-Q3_K_S.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:98ccf7da30df84414d38ebeee8c9076f2a7d0a13214e2545e5729208116e0da3
 size 30912050976
Meta-Llama-3-70B-Instruct-Q3_K_XL.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:7188d1ea9e9420947633d7548509740ea0acdd2e0f72242cd0c774cdd2d69361
+size 40029943584
Meta-Llama-3-70B-Instruct-Q4_K_L.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f0dd2ec40e95182d7e2ffda67bdf1f783c54058bfa4ff03aa264ec5ae89b4b0d
+size 45270202144
Meta-Llama-3-70B-Instruct-Q4_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7b602f72406b1cd28466f84506abe8ff0e67dd867937ff1c393a78de29b8d07e
 size 42520393504
Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00001-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:49ab6a7e72037f4d74047bb701b45da44abbc33303e48144397dada03412861e
+size 39993594592
Meta-Llama-3-70B-Instruct-Q5_K_L.gguf/Meta-Llama-3-70B-Instruct-Q5_K_L-00002-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8634f4b38b4330103f02f7a492b24b69937e621e250654b119a0bf2a17194986
+size 12574696736
Meta-Llama-3-70B-Instruct-Q5_K_M.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:7cb1c17c5b33a8071acf9a715eb4281d9b7ba34e1b8af5a660300d86cbfc8aee
 size 49949816608
Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00001-of-00002.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:1dfcec26081dfc6547498b49266ff2e39880337ff8b85dd5380cbb385137380d
+size 39862698784
Meta-Llama-3-70B-Instruct-Q6_K.gguf/Meta-Llama-3-70B-Instruct-Q6_K-00002-of-00002.gguf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:8ac2b8d53e03d2926237214eb2cc15f1cbf92c2281d51bed9c756055e3f11a8c
+size 18025444544
Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:c91ed0b016b6ce5601eeed5fd8108dd6383d4265dae6f7925a72b242a4630b8f
+size 39808935904
Meta-Llama-3-70B-Instruct-Q8_0.gguf/Meta-Llama-3-70B-Instruct-Q8_0-00002-of-00002.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b9af1ecc2494fe11bf440621d8bc267d43bcdc8156f0799ff8cbc3231f59a958
+size 35166113792
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00001-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:07c993fc3c73977f0551af11a5b870237afd59bb66e02edaa51ec0b2f39722e3
+size 39758724160
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00002-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:135ecf44c93ae5d87967b5f526429a8faa11d8a9b9e07b61feb172ab89e3bbdc
+size 39830630656
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00003-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:29f38cec13ef711d326927bf292c68533440acac0d3460f2546a1f2eea69c5e9
+size 39981592928
Meta-Llama-3-70B-Instruct-f16.gguf/Meta-Llama-3-70B-Instruct-f16-00004-of-00004.gguf ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:54bb81ef0db1bb5de7b7e3647ceaf29b7279c43f37beb48c15a8bb25abba8319
+size 21546965280
Meta-Llama-3-70B-Instruct.imatrix CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:b861753c5acd139deda1728cae732f93de5505bb64e37496c2e806865f061c51
+size 24922299
README.md CHANGED
@@ -8,9 +8,7 @@ tags:
 - pytorch
 - llama
 - llama-3
-license:
-license_name: llama3
-license_link: LICENSE
+license: llama3
 extra_gated_prompt: >-
   ### META LLAMA 3 COMMUNITY LICENSE AGREEMENT
 
@@ -209,11 +207,11 @@ quantized_by: bartowski
 
 ## Llamacpp imatrix Quantizations of Meta-Llama-3-70B-Instruct
 
-Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/
+Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3259">b3259</a> for quantization.
 
 Original model: https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct
 
-All quants made using imatrix option with dataset
+All quants made using imatrix option with dataset from [here](https://gist.github.com/bartowski1182/eb213dccb3571f863da82e99418f81e8)
 
 ## Prompt format
 
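(The imatrix referenced above is llama.cpp's importance matrix: it is computed once from a calibration text and then passed to the quantizer to weight which tensors tolerate fewer bits. A minimal sketch of that workflow using the tool names shipped around release b3259; the calibration file path and the Q4_K_M target are illustrative assumptions, not the exact commands behind this commit:)

```
# Sketch only: build an importance matrix from a calibration text,
# then quantize the f16 GGUF with it.
./llama-imatrix -m Meta-Llama-3-70B-Instruct-f16.gguf \
  -f calibration_data.txt -o Meta-Llama-3-70B-Instruct.imatrix
./llama-quantize --imatrix Meta-Llama-3-70B-Instruct.imatrix \
  Meta-Llama-3-70B-Instruct-f16.gguf Meta-Llama-3-70B-Instruct-Q4_K_M.gguf Q4_K_M
```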
@@ -233,26 +231,20 @@ All quants made using imatrix option with dataset provided by Kalomaze [here](ht
 | -------- | ---------- | --------- | ----------- |
 | [Meta-Llama-3-70B-Instruct-Q8_0.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/tree/main/Meta-Llama-3-70B-Instruct-Q8_0.gguf) | Q8_0 | 74.97GB | Extremely high quality, generally unneeded but max available quant. |
 | [Meta-Llama-3-70B-Instruct-Q6_K.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/tree/main/Meta-Llama-3-70B-Instruct-Q6_K.gguf) | Q6_K | 57.88GB | Very high quality, near perfect, *recommended*. |
+| [Meta-Llama-3-70B-Instruct-Q5_K_L.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/tree/main/Meta-Llama-3-70B-Instruct-Q5_K_L.gguf) | Q5_K_L | 52.56GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. High quality, *recommended*. |
 | [Meta-Llama-3-70B-Instruct-Q5_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q5_K_M.gguf) | Q5_K_M | 49.94GB | High quality, *recommended*. |
-| [Meta-Llama-3-70B-Instruct-
+| [Meta-Llama-3-70B-Instruct-Q4_K_L.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q4_K_L.gguf) | Q4_K_L | 45.27GB | *Experimental*, uses f16 for embed and output weights. Please provide any feedback of differences. Good quality, uses about 4.83 bits per weight, *recommended*. |
 | [Meta-Llama-3-70B-Instruct-Q4_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q4_K_M.gguf) | Q4_K_M | 42.52GB | Good quality, uses about 4.83 bits per weight, *recommended*. |
-| [Meta-Llama-3-70B-Instruct-Q4_K_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q4_K_S.gguf) | Q4_K_S | 40.34GB | Slightly lower quality with more space savings, *recommended*. |
-| [Meta-Llama-3-70B-Instruct-IQ4_NL.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ4_NL.gguf) | IQ4_NL | 40.05GB | Decent quality, slightly smaller than Q4_K_S with similar performance *recommended*. |
 | [Meta-Llama-3-70B-Instruct-IQ4_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ4_XS.gguf) | IQ4_XS | 37.90GB | Decent quality, smaller than Q4_K_S with similar performance, *recommended*. |
-| [Meta-Llama-3-70B-Instruct-Q3_K_L.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q3_K_L.gguf) | Q3_K_L | 37.14GB | Lower quality but usable, good for low RAM availability. |
 | [Meta-Llama-3-70B-Instruct-Q3_K_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q3_K_M.gguf) | Q3_K_M | 34.26GB | Even lower quality. |
 | [Meta-Llama-3-70B-Instruct-IQ3_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_M.gguf) | IQ3_M | 31.93GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
-| [Meta-Llama-3-70B-Instruct-IQ3_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_S.gguf) | IQ3_S | 30.91GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
 | [Meta-Llama-3-70B-Instruct-Q3_K_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q3_K_S.gguf) | Q3_K_S | 30.91GB | Low quality, not recommended. |
-| [Meta-Llama-3-70B-Instruct-IQ3_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_XS.gguf) | IQ3_XS | 29.30GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
 | [Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf) | IQ3_XXS | 27.46GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
 | [Meta-Llama-3-70B-Instruct-Q2_K.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-Q2_K.gguf) | Q2_K | 26.37GB | Very low quality but surprisingly usable. |
 | [Meta-Llama-3-70B-Instruct-IQ2_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_M.gguf) | IQ2_M | 24.11GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
-| [Meta-Llama-3-70B-Instruct-
-| [Meta-Llama-3-70B-Instruct-IQ2_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_XS.gguf) | IQ2_XS | 21.14GB | Very low quality, uses SOTA techniques to be usable. |
+| [Meta-Llama-3-70B-Instruct-IQ2_XS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_XS.gguf) | IQ2_XS | 21.14GB | Lower quality, uses SOTA techniques to be usable. |
 | [Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf) | IQ2_XXS | 19.09GB | Lower quality, uses SOTA techniques to be usable. |
 | [Meta-Llama-3-70B-Instruct-IQ1_M.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ1_M.gguf) | IQ1_M | 16.75GB | Extremely low quality, *not* recommended. |
-| [Meta-Llama-3-70B-Instruct-IQ1_S.gguf](https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-GGUF/blob/main/Meta-Llama-3-70B-Instruct-IQ1_S.gguf) | IQ1_S | 15.34GB | Extremely low quality, *not* recommended. |
 
 ## Downloading using huggingface-cli
 
@@ -265,13 +257,13 @@ pip install -U "huggingface_hub[cli]"
 Then, you can target the specific file you want:
 
 ```
-huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q4_K_M.gguf" --local-dir ./
+huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q4_K_M.gguf" --local-dir ./
 ```
 
 If the model is bigger than 50GB, it will have been split into multiple files. In order to download them all to a local folder, run:
 
 ```
-huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q8_0.gguf/*" --local-dir Meta-Llama-3-70B-Instruct-Q8_0
+huggingface-cli download bartowski/Meta-Llama-3-70B-Instruct-GGUF --include "Meta-Llama-3-70B-Instruct-Q8_0.gguf/*" --local-dir Meta-Llama-3-70B-Instruct-Q8_0
 ```
 
 You can either specify a new local-dir (Meta-Llama-3-70B-Instruct-Q8_0) or download them all in place (./)
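(One detail the download note leaves implicit: the shards do not need to be recombined by hand, since llama.cpp can load a split GGUF when pointed at the first shard. A minimal sketch; the prompt and token count are placeholder assumptions:)

```
# Sketch only: run the split Q8_0 quant by pointing llama-cli at the
# first shard; the remaining shard in the same folder is picked up
# automatically.
./llama-cli -m Meta-Llama-3-70B-Instruct-Q8_0/Meta-Llama-3-70B-Instruct-Q8_0-00001-of-00002.gguf \
  -p "Why is the sky blue?" -n 128
```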