---
tags:
- merge
- mergekit
- lazymergekit
- abideen/AlphaMonarch-dora
base_model:
- abideen/AlphaMonarch-dora
license: cc-by-nc-4.0
language:
- de
- en
---

# Spaetzle-v60-7b

This is a progressive merge (mostly dare-ties, with some slerp steps, among others) intended as a reasonable compromise between English and German local tasks.

Spaetzle-v60-7b is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [abideen/AlphaMonarch-dora](https://huggingface.co/abideen/AlphaMonarch-dora)
* [cstr/Spaetzle-v58-7b](https://huggingface.co/cstr/Spaetzle-v58-7b)

## Benchmarks
Performance looks reasonable so far: for example, on EQ-Bench (v2_de) the model scores 65.08 (171.0 parseable).

From the [Occiglot Euro LLM Leaderboard](https://huggingface.co/spaces/occiglot/euro-llm-leaderboard):
| Model                                                  | DE    | EN    | ARC EN | TruthfulQA EN | Belebele EN | HellaSwag EN | MMLU EN | ARC DE | TruthfulQA DE | Belebele DE | HellaSwag DE | MMLU DE |
|--------------------------------------------------------|-------|-------|--------|---------------|-------------|--------------|---------|--------|---------------|-------------|--------------|---------|
| mistral-community/Mixtral-8x22B-v0.1                   | 66.81 | 72.87 | 70.56  | 52.29         | 93.89       | 70.41        | 77.17   | 63.9   | 29.31         | 92.44       | 77.9         | 70.49   |
| **cstr/Spaetzle-v60-7b**                                   | 60.95 | 71.65 | 69.88  | 66.24         | 90.11       | 68.43        | 63.59   | 58     | 37.31         | 84.22       | 70.09        | 55.11   |
| VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct         | 60.07 | 74.71 | 74.49  | 66.19         | 91.67       | 74.55        | 66.65   | 59.37  | 29.57         | 88.56       | 66.43        | 56.44   |
| occiglot/occiglot-7b-de-en-instruct                    | 56.65 | 61.7  | 60.41  | 49.38         | 81.22       | 60.43        | 57.06   | 54.49  | 31.09         | 77.22       | 68.84        | 51.59   |
| occiglot/occiglot-7b-de-en                             | 54.01 | 58.78 | 55.63  | 42.33         | 79.11       | 59.99        | 56.84   | 50.56  | 26.27         | 74.33       | 67.42        | 51.46   |
| meta-llama/Meta-Llama-3-8B                             | 53.89 | 63.08 | 58.02  | 43.87         | 86.44       | 61.75        | 65.3    | 46.45  | 24.24         | 81.11       | 62.48        | 55.18   |
| mistralai/Mistral-7B-Instruct-v0.2                     | 53.52 | 67.63 | 63.74  | 66.81         | 82.44       | 65.96        | 59.2    | 48.59  | 37.69         | 68.89       | 62.24        | 50.2    |
| occiglot/occiglot-7b-eu5-instruct                      | 53.15 | 57.78 | 55.89  | 44.9          | 74.67       | 59.92        | 53.51   | 52.95  | 28.68         | 66.78       | 68.52        | 48.82   |
| clibrain/lince-mistral-7b-it-es                        | 52.98 | 62.43 | 62.46  | 43.32         | 82.44       | 63.86        | 60.06   | 49.44  | 28.17         | 75          | 61.64        | 50.64   |
| mistralai/Mistral-7B-v0.1                              | 52.8  | 62.73 | 61.26  | 42.62         | 84.44       | 62.89        | 62.46   | 47.65  | 28.43         | 73.89       | 61.06        | 52.96   |
| LeoLM/leo-mistral-hessianai-7b                         | 51.78 | 56.11 | 52.22  | 42.92         | 73.67       | 57.86        | 53.88   | 47.48  | 25.25         | 69.11       | 68.21        | 48.83   |

For the int4-inc quantized version, results from the [Low-bit Quantized Open LLM Leaderboard](https://huggingface.co/spaces/Intel/low_bit_open_llm_leaderboard):

| Type | Model                                     | Average ⬆️ | ARC-c | ARC-e | Boolq | HellaSwag | Lambada | MMLU  | Openbookqa | Piqa  | Truthfulqa | Winogrande | #Params (B) | #Size (G) |
|------|-------------------------------------------|------------|-------|-------|-------|-----------|---------|-------|------------|-------|------------|------------|-------------|-----------|
| πŸ’   | Intel/SOLAR-10.7B-Instruct-v1.0-int4-inc  | 68.49      | 60.49 | 82.66 | 88.29 | 68.29     | 73.36   | 62.43 | 35.6       | 80.74 | 56.06      | 76.95      | 10.57       | 5.98      |
| πŸ’   | **cstr/Spaetzle-v60-7b-int4-inc**         | **68.01**  | **62.12** | **85.27** | **87.34** | **66.43** | **70.58** | **61.39** | **37**  | **82.26** | **50.18** | **77.51**  | **7.04**  | **4.16**  |
| πŸ”·   | TheBloke/SOLAR-10.7B-Instruct-v1.0-GGUF   | 66.6       | 60.41 | 83.38 | 88.29 | 67.73     | 52.42   | 62.04 | 37.2       | 82.32 | 56.3       | 75.93      | 10.73       | 6.07      |
| πŸ”·   | cstr/Spaetzle-v60-7b-Q4_0-GGUF            | 66.44      | 61.35 | 85.19 | 87.98 | 66.54     | 52.78   | 62.05 | 40.6       | 81.72 | 47         | 79.16      | 7.24        | 4.11      |
| πŸ’   | Intel/Mistral-7B-Instruct-v0.2-int4-inc   | 65.73      | 55.38 | 81.44 | 85.26 | 65.67     | 70.89   | 58.66 | 34.2       | 80.74 | 51.16      | 73.95      | 7.04        | 4.16      |
| πŸ’   | Intel/Phi-3-mini-4k-instruct-int4-inc     | 65.09      | 57.08 | 83.33 | 86.18 | 59.45     | 68.14   | 66.62 | 38.6       | 79.33 | 38.68      | 73.48      | 3.66        | 2.28      |
| πŸ”·   | TheBloke/Mistral-7B-Instruct-v0.2-GGUF    | 63.52      | 53.5  | 77.9  | 85.44 | 66.9      | 50.11   | 58.45 | 38.8       | 77.58 | 53.12      | 73.4       | 7.24        | 4.11      |
| πŸ’   | Intel/Meta-Llama-3-8B-Instruct-int4-inc   | 62.93      | 51.88 | 81.1  | 83.21 | 57.09     | 71.32   | 62.41 | 35.2       | 78.62 | 36.35      | 72.14      | 7.2         | 5.4       |


Contamination check results (reference model: Mistral-7B-Instruct-v0.1):
- MMLU: result < 0.1, %:  0.19
- TruthfulQA: result < 0.1, %:  0.34
- GSM8k: result < 0.1, %:  0.39

## 🧩 Configuration

```yaml
models:
  - model: cstr/Spaetzle-v58-7b
    # no parameters necessary for base model
  - model: abideen/AlphaMonarch-dora
    parameters:
      density: 0.60
      weight: 0.30
merge_method: dare_ties
base_model: cstr/Spaetzle-v58-7b
parameters:
  int8_mask: true
dtype: bfloat16
random_seed: 0
tokenizer_source: base

```
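The config above can also be run directly with [mergekit](https://github.com/arcee-ai/mergekit) outside the LazyMergekit notebook. Below is a minimal sketch using mergekit's Python API; the config filename and output path are placeholders, and option names may differ slightly between mergekit versions, so treat it as an illustration rather than the exact recipe used here:

```python
# Minimal sketch: run the merge config above with mergekit's Python API.
# Assumes `pip install mergekit` and enough memory for two 7B models.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# The YAML from the Configuration section, saved alongside this script (placeholder filename).
with open("spaetzle-v60-config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./spaetzle-v60-7b",        # placeholder output directory
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU if one is available
        copy_tokenizer=True,             # take the tokenizer from the base model
        lazy_unpickle=True,              # reduce peak memory while loading shards
        low_cpu_memory=False,
    ),
)
```

The `mergekit-yaml` command-line entry point offers a roughly equivalent one-liner (`mergekit-yaml config.yaml ./output --copy-tokenizer --lazy-unpickle`).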

## πŸ’» Usage

```python
# Install dependencies first: pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "cstr/Spaetzle-v60-7b"
messages = [{"role": "user", "content": "What is a large language model?"}]

tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
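
The Q4_0 GGUF build listed in the benchmarks above can be run without transformers, e.g. via llama-cpp-python. A minimal sketch, assuming the repository `cstr/Spaetzle-v60-7b-Q4_0-GGUF` and a `*q4_0.gguf` filename pattern (the exact filename is an assumption; check the repo files):

```python
# Minimal sketch for the Q4_0 GGUF quantization via llama-cpp-python.
# pip install llama-cpp-python huggingface_hub
from llama_cpp import Llama

# Download the GGUF file from the Hub and load it.
llm = Llama.from_pretrained(
    repo_id="cstr/Spaetzle-v60-7b-Q4_0-GGUF",
    filename="*q4_0.gguf",  # assumed filename pattern; adjust to the actual file
    n_ctx=4096,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is a large language model?"}],
    max_tokens=256,
    temperature=0.7,
)
print(out["choices"][0]["message"]["content"])
```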