Spaetzle-v60-7b

This is a progressive merge (mostly dare-ties, but also slerp, among others) intended as a suitable compromise for English and German local tasks.

Spaetzle-v60-7b is a merge of the following models using LazyMergekit: cstr/Spaetzle-v58-7b (base model) and abideen/AlphaMonarch-dora.

Benchmarks

Performance looks reasonable so far. For example, on EQ-Bench (v2_de) the model scores 65.08 (Parseable: 171.0).

From the Occiglot Euro LLM Leaderboard:

Model	DE	EN	ARC EN	TruthfulQA EN	Belebele EN	HellaSwag EN	MMLU EN	ARC DE	TruthfulQA DE	Belebele DE	HellaSwag DE	MMLU DE
mistral-community/Mixtral-8x22B-v0.1	66.81	72.87	70.56	52.29	93.89	70.41	77.17	63.9	29.31	92.44	77.9	70.49
cstr/Spaetzle-v60-7b	60.95	71.65	69.88	66.24	90.11	68.43	63.59	58	37.31	84.22	70.09	55.11
VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct	60.07	74.71	74.49	66.19	91.67	74.55	66.65	59.37	29.57	88.56	66.43	56.44
occiglot/occiglot-7b-de-en-instruct	56.65	61.7	60.41	49.38	81.22	60.43	57.06	54.49	31.09	77.22	68.84	51.59
occiglot/occiglot-7b-de-en	54.01	58.78	55.63	42.33	79.11	59.99	56.84	50.56	26.27	74.33	67.42	51.46
meta-llama/Meta-Llama-3-8B	53.89	63.08	58.02	43.87	86.44	61.75	65.3	46.45	24.24	81.11	62.48	55.18
mistralai/Mistral-7B-Instruct-v0.2	53.52	67.63	63.74	66.81	82.44	65.96	59.2	48.59	37.69	68.89	62.24	50.2
occiglot/occiglot-7b-eu5-instruct	53.15	57.78	55.89	44.9	74.67	59.92	53.51	52.95	28.68	66.78	68.52	48.82
clibrain/lince-mistral-7b-it-es	52.98	62.43	62.46	43.32	82.44	63.86	60.06	49.44	28.17	75	61.64	50.64
mistralai/Mistral-7B-v0.1	52.8	62.73	61.26	42.62	84.44	62.89	62.46	47.65	28.43	73.89	61.06	52.96
LeoLM/leo-mistral-hessianai-7b	51.78	56.11	52.22	42.92	73.67	57.86	53.88	47.48	25.25	69.11	68.21	48.83
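
The DE and EN columns appear to be the unweighted mean of the five per-task scores for each language. This is inferred from the numbers in the table, not from the leaderboard's documentation, so treat it as an assumption. A minimal sketch that recomputes both averages for the cstr/Spaetzle-v60-7b row (the helper name `mean_score` is only illustrative):

```python
# Per-task scores copied from the cstr/Spaetzle-v60-7b row above.
# Order: ARC, TruthfulQA, Belebele, HellaSwag, MMLU.
en_scores = [69.88, 66.24, 90.11, 68.43, 63.59]
de_scores = [58.0, 37.31, 84.22, 70.09, 55.11]


def mean_score(scores):
    """Unweighted mean, rounded to two decimals like the table values."""
    return round(sum(scores) / len(scores), 2)


en_avg = mean_score(en_scores)
de_avg = mean_score(de_scores)
print(f"EN: {en_avg}  DE: {de_avg}")
```

The results agree with the 71.65 (EN) and 60.95 (DE) reported in the table, which supports reading the two summary columns as plain averages.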

For the quantized versions (int4-inc and Q4_0 GGUF), from the Low-bit Quantized Open LLM Leaderboard:

Type	Model	Average ⬆️	ARC-c	ARC-e	Boolq	HellaSwag	Lambada	MMLU	Openbookqa	Piqa	Truthfulqa	Winogrande	#Params (B)	#Size (G)
🍒	Intel/SOLAR-10.7B-Instruct-v1.0-int4-inc	68.49	60.49	82.66	88.29	68.29	73.36	62.43	35.6	80.74	56.06	76.95	10.57	5.98
🍒	cstr/Spaetzle-v60-7b-int4-inc	68.01	62.12	85.27	87.34	66.43	70.58	61.39	37	82.26	50.18	77.51	7.04	4.16
🔷	TheBloke/SOLAR-10.7B-Instruct-v1.0-GGUF	66.6	60.41	83.38	88.29	67.73	52.42	62.04	37.2	82.32	56.3	75.93	10.73	6.07
🔷	cstr/Spaetzle-v60-7b-Q4_0-GGUF	66.44	61.35	85.19	87.98	66.54	52.78	62.05	40.6	81.72	47	79.16	7.24	4.11
🍒	Intel/Mistral-7B-Instruct-v0.2-int4-inc	65.73	55.38	81.44	85.26	65.67	70.89	58.66	34.2	80.74	51.16	73.95	7.04	4.16
🍒	Intel/Phi-3-mini-4k-instruct-int4-inc	65.09	57.08	83.33	86.18	59.45	68.14	66.62	38.6	79.33	38.68	73.48	3.66	2.28
🔷	TheBloke/Mistral-7B-Instruct-v0.2-GGUF	63.52	53.5	77.9	85.44	66.9	50.11	58.45	38.8	77.58	53.12	73.4	7.24	4.11
🍒	Intel/Meta-Llama-3-8B-Instruct-int4-inc	62.93	51.88	81.1	83.21	57.09	71.32	62.41	35.2	78.62	36.35	72.14	7.2	5.4

Contamination check results (reference model: Mistral instruct 7b v0.1):

MMLU: result < 0.1, %: 0.19
TruthfulQA: result < 0.1, %: 0.34
GSM8k: result < 0.1, %: 0.39

🧩 Configuration

models:
  - model: cstr/Spaetzle-v58-7b
    # no parameters necessary for base model
  - model: abideen/AlphaMonarch-dora
    parameters:
      density: 0.60
      weight: 0.30
merge_method: dare_ties
base_model: cstr/Spaetzle-v58-7b
parameters:
  int8_mask: true
dtype: bfloat16
random_seed: 0
tokenizer_source: base
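
To make the `density` and `weight` parameters above more concrete, here is a simplified sketch of the DARE idea on a toy weight vector: each difference between the fine-tuned and base weights is kept with probability `density`, the surviving differences are rescaled by `1 / density`, and the result is added to the base scaled by `weight`. This is not mergekit's implementation. It omits the TIES sign-election step and any normalization, and the function name `dare_merge` and the toy values are hypothetical.

```python
import random


def dare_merge(base, finetuned, density, weight, seed=0):
    """Toy DARE-style merge of one fine-tuned vector into a base vector."""
    rng = random.Random(seed)  # fixed seed, analogous to random_seed in the config
    merged = []
    for b, f in zip(base, finetuned):
        delta = f - b                                 # task-vector entry
        keep = rng.random() < density                 # keep with probability `density`
        rescaled = delta / density if keep else 0.0   # rescale the survivors
        merged.append(b + weight * rescaled)
    return merged


# Hypothetical toy weights, using the density and weight from the config above.
base = [0.10, -0.20, 0.30, 0.00]
finetuned = [0.16, -0.20, 0.00, 0.12]
merged = dare_merge(base, finetuned, density=0.60, weight=0.30)
print(merged)
```

Each merged entry is either the unchanged base value (difference dropped) or the base value plus `0.30 * delta / 0.60`, so on average the merge moves the base 30% of the way toward the fine-tuned weights.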

💻 Usage

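# Notebook syntax; in a regular shell, run the command without the leading "!".
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "cstr/Spaetzle-v60-7b"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the chat template shipped with the tokenizer.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline; device_map="auto" relies on accelerate for weight placement.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens and print the generated text.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])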
