---
base_model: cognitivecomputations/Samantha-1.1-70b
datasets:
- ehartford/samantha-data
language:
- en
library_name: transformers
license: llama2
quantized_by: mradermacher
---
## About

weighted/imatrix quants of https://huggingface.co/cognitivecomputations/Samantha-1.1-70b

The weights were calculated using 164k semi-random English tokens.

<!-- provided-files -->
## Usage

If you are unsure how to use GGUF files, refer to one of [TheBloke's
READMEs](https://huggingface.co/TheBloke/KafkaLM-70B-German-V0.1-GGUF) for
more details, including how to concatenate multi-part files.
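
As a rough illustration of the concatenation step, here is a minimal Python sketch. It assumes the `.partXofY` files are plain byte-wise splits that only need to be appended in order; the function name `concatenate_parts` is made up for this example and is not part of any library.

```python
import shutil
from pathlib import Path


def concatenate_parts(parts, output):
    """Append each part file, in the order given, to a single output file.

    Assumption: the parts are raw byte-wise splits of one file, so joining
    them in order restores the original GGUF.
    """
    output = Path(output)
    with output.open("wb") as dst:
        for part in parts:
            with Path(part).open("rb") as src:
                # Stream in chunks so large parts never have to fit in RAM.
                shutil.copyfileobj(src, dst)
    return output
```

For the Q6_K entry in this repository, the call would look like `concatenate_parts(["Samantha-1.1-70b.i1-Q6_K.gguf.part1of2", "Samantha-1.1-70b.i1-Q6_K.gguf.part2of2"], "Samantha-1.1-70b.i1-Q6_K.gguf")`, with the paths adjusted to wherever the parts were downloaded.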

## Provided Quants

(sorted by size, not necessarily quality. IQ-quants are often preferable to similarly sized non-IQ quants)

| Link | Type | Size/GB | Notes |
|:-----|:-----|--------:|:------|
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ1_S.gguf) | i1-IQ1_S | 15.0 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ1_M.gguf) | i1-IQ1_M | 16.4 | for the desperate |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ2_XXS.gguf) | i1-IQ2_XXS | 18.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ2_XS.gguf) | i1-IQ2_XS | 20.8 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ2_S.gguf) | i1-IQ2_S | 21.8 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ2_M.gguf) | i1-IQ2_M | 23.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q2_K.gguf) | i1-Q2_K | 25.9 | IQ3_XXS probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ3_XXS.gguf) | i1-IQ3_XXS | 27.4 | lower quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ3_XS.gguf) | i1-IQ3_XS | 28.6 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q3_K_XS.gguf) | i1-Q3_K_XS | 28.7 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ3_S.gguf) | i1-IQ3_S | 30.3 | beats Q3_K* |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q3_K_S.gguf) | i1-Q3_K_S | 30.3 | IQ3_XS probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ3_M.gguf) | i1-IQ3_M | 31.4 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q3_K_M.gguf) | i1-Q3_K_M | 33.7 | IQ3_S probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q3_K_L.gguf) | i1-Q3_K_L | 36.6 | IQ3_M probably better |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ4_XS.gguf) | i1-IQ4_XS | 37.2 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-IQ4_NL.gguf) | i1-IQ4_NL | 39.4 | slightly worse than Q4_K_S |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q4_0.gguf) | i1-Q4_0 | 39.4 | fast, low quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q4_K_S.gguf) | i1-Q4_K_S | 39.7 | optimal size/speed/quality |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q4_K_M.gguf) | i1-Q4_K_M | 41.8 | fast, recommended |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q5_K_S.gguf) | i1-Q5_K_S | 47.9 |  |
| [GGUF](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q5_K_M.gguf) | i1-Q5_K_M | 49.2 |  |
| [PART 1](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q6_K.gguf.part1of2) [PART 2](https://huggingface.co/mradermacher/Samantha-1.1-70b-i1-GGUF/resolve/main/Samantha-1.1-70b.i1-Q6_K.gguf.part2of2) | i1-Q6_K | 57.0 | practically like static Q6_K |
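
If you want a quick way to shortlist a quant from the table above, the following Python sketch picks the largest file that fits a size budget. The sizes are copied from the table; the helper name `largest_quant_within` and the idea of selecting purely by file size are assumptions made for illustration. Size is not the same as quality (see the Notes column), and the file size does not include the extra memory needed for context at run time.

```python
# (type, size in GB) pairs copied from the Provided Quants table.
QUANTS = [
    ("i1-IQ1_S", 15.0), ("i1-IQ1_M", 16.4), ("i1-IQ2_XXS", 18.7),
    ("i1-IQ2_XS", 20.8), ("i1-IQ2_S", 21.8), ("i1-IQ2_M", 23.7),
    ("i1-Q2_K", 25.9), ("i1-IQ3_XXS", 27.4), ("i1-IQ3_XS", 28.6),
    ("i1-Q3_K_XS", 28.7), ("i1-IQ3_S", 30.3), ("i1-Q3_K_S", 30.3),
    ("i1-IQ3_M", 31.4), ("i1-Q3_K_M", 33.7), ("i1-Q3_K_L", 36.6),
    ("i1-IQ4_XS", 37.2), ("i1-IQ4_NL", 39.4), ("i1-Q4_0", 39.4),
    ("i1-Q4_K_S", 39.7), ("i1-Q4_K_M", 41.8), ("i1-Q5_K_S", 47.9),
    ("i1-Q5_K_M", 49.2), ("i1-Q6_K", 57.0),
]


def largest_quant_within(budget_gb, quants=QUANTS):
    """Return the (type, size) pair with the largest size <= budget_gb.

    Returns None when nothing fits. On equal sizes, the entry listed
    first in the table wins.
    """
    fitting = [q for q in quants if q[1] <= budget_gb]
    return max(fitting, key=lambda q: q[1]) if fitting else None
```

Treat the result as a starting point only, and check the Notes column before downloading, since several entries point to a similarly sized alternative that is probably better.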


Here is a handy graph by ikawrakow comparing some lower-quality quant
types (lower is better):

![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)

And here are Artefact2's thoughts on the matter:
https://gist.github.com/Artefact2/b5f810600771265fc1e39442288e8ec9

## FAQ / Model Request

See https://huggingface.co/mradermacher/model_requests for answers to
questions you might have, or to request that another model be quantized.

## Thanks

I thank my company, [nethype GmbH](https://www.nethype.de/), for letting
me use its servers and providing upgrades to my workstation to enable
this work in my free time.

<!-- end -->