Update README.md
Thanks for the additional quants, [DAN™](https://huggingface.co/dranger003), [Knut Jägersberg](https://huggingface.co/KnutJaegersberg), and [Michael Radermacher](https://huggingface.co/mradermacher)!
README.md
CHANGED
@@ -18,7 +18,11 @@ tags:
 ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6303ca537373aacccd85d8a7/vmCAhJCpF0dITtCVxlYET.jpeg)
 
 - HF: [wolfram/miquliz-120b-v2.0](https://huggingface.co/wolfram/miquliz-120b-v2.0)
-- GGUF: [
+- GGUF: [Q2_K | IQ3_XXS | Q4_K_M | Q5_K_M](https://huggingface.co/wolfram/miquliz-120b-v2.0-GGUF)
+  - [dranger003's IQ2_XS | IQ2_XXS | IQ3_XXS | Q8_0](https://huggingface.co/dranger003/miquliz-120b-v2.0-iMat.GGUF)
+  - [KnutJaegersberg's IQ2_XS](https://huggingface.co/KnutJaegersberg/2-bit-LLMs)
+  - [mradermacher's i1-IQ1_S – i1-Q5_K_M](https://huggingface.co/mradermacher/miquliz-120b-v2.0-i1-GGUF)
+  - [mradermacher's Q2_K – Q8_0](https://huggingface.co/mradermacher/miquliz-120b-v2.0-GGUF)
 - EXL2: 2.4bpw | [2.65bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-2.65bpw-h6-exl2) | [3.0bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-3.0bpw-h6-exl2) | [3.5bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-3.5bpw-h6-exl2) | [4.0bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-4.0bpw-h6-exl2) | [5.0bpw](https://huggingface.co/wolfram/miquliz-120b-v2.0-5.0bpw-h6-exl2)
 - **Max Context w/ 48 GB VRAM:** (24 GB VRAM is not enough, even for 2.4bpw, use [GGUF](https://huggingface.co/wolfram/miquliz-120b-v2.0-GGUF) instead!)
   - **2.4bpw:** 32K (32768 tokens) w/ 8-bit cache, 21K (21504 tokens) w/o 8-bit cache
@@ -31,7 +35,7 @@ Inspired by [goliath-120b](https://huggingface.co/alpindale/goliath-120b).
 
 Thanks for the support, [CopilotKit](https://github.com/CopilotKit/CopilotKit) – the open-source platform for building in-app AI Copilots into any product, with any LLM model. Check out their GitHub.
 
-Thanks for the additional quants, [DAN™](https://huggingface.co/dranger003)!
+Thanks for the additional quants, [DAN™](https://huggingface.co/dranger003), [Knut Jägersberg](https://huggingface.co/KnutJaegersberg), and [Michael Radermacher](https://huggingface.co/mradermacher)!
 
 Also available: [miqu-1-120b](https://huggingface.co/wolfram/miqu-1-120b) – Miquliz's older, purer sister; only Miqu, inflated to 120B.
 
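For anyone who wants to try one of the GGUF quants added above, here is a minimal sketch using llama-cpp-python. The local file name and the full GPU offload are assumptions, not part of the model card; adjust them to the quant you downloaded and the VRAM you have.

```python
# Minimal sketch: load a GGUF quant of miquliz-120b-v2.0 with llama-cpp-python.
# Assumptions: the file name below is hypothetical, and -1 offloads all layers.
from llama_cpp import Llama

llm = Llama(
    model_path="miquliz-120b-v2.0.Q2_K.gguf",  # assumed name of a downloaded quant file
    n_ctx=32768,       # the model card quotes 32K (32768 tokens) max context
    n_gpu_layers=-1,   # offload every layer; lower this if it does not fit in VRAM
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello, Miquliz!"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```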