Update README.md

README.md CHANGED

Old version:

@@ -7,7 +7,7 @@ library_name: transformers
 pipeline_tag: text-generation
 ---
 
-![SauerkrautLM](images/
 ## VAGO solutions SauerkrautLM
 Introducing SauerkrautLM-v1 - Your German Language Powerhouse!
 
@@ -37,9 +37,9 @@ Data augmentation techniques were used to grant grammatical, syntactical correct
 
 **Merge Procedure:**
 
-SauerkrautLM-7b-HerO was merged on 1 A100 with mergekit.
-The merged model contains [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) and [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca)
-We used the gradient
 
 
 - **Model Type:** SauerkrautLM-7b-HerO is an auto-regressive language model based on the transformer architecture
@@ -111,8 +111,33 @@ Please tell me about how merged models can benefit from existent top-models.<|im
 | | |ter | 0.6463|± |0.0039|
 |xnli_de | 0|acc | 0.4547|± |0.0070|
 |xnli_en | 0|acc | 0.5595|± |0.0070|
 ```
 
 ## Disclaimer
 We must inform users that despite our best efforts in data cleansing, the possibility of some such content slipping through cannot be entirely ruled out.
 However, we cannot guarantee consistently appropriate behavior. Therefore, if you encounter any issues or come across inappropriate content, we kindly request that you inform us through the contact information provided.

New version:

@@ -7,7 +7,7 @@ library_name: transformers
 pipeline_tag: text-generation
 ---
 
+![SauerkrautLM](images/hero-multi.png "SauerkrautLM-7b-HerO-multilingual")
 ## VAGO solutions SauerkrautLM
 Introducing SauerkrautLM-v1 - Your German Language Powerhouse!
 
@@ -37,9 +37,9 @@ Data augmentation techniques were used to grant grammatical, syntactical correct
 
 **Merge Procedure:**
 
+SauerkrautLM-7b-HerO was merged on 1 A100 with [mergekit](https://github.com/cg123/mergekit).
+The merged model contains [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) and [Open-Orca/Mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca).
+We used the gradient SLERP method.
 
 
 - **Model Type:** SauerkrautLM-7b-HerO is an auto-regressive language model based on the transformer architecture
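
For readers unfamiliar with mergekit, a gradient SLERP merge of two Mistral-7B models can be described in a YAML config along the following lines. This is an illustrative sketch based on mergekit's documented config schema, not the exact configuration used for SauerkrautLM-7b-HerO; the layer ranges, interpolation gradients, and dtype here are assumptions.

```yaml
# Hypothetical mergekit config for a gradient SLERP merge (values are illustrative).
slices:
  - sources:
      - model: teknium/OpenHermes-2.5-Mistral-7B
        layer_range: [0, 32]
      - model: Open-Orca/Mistral-7B-OpenOrca
        layer_range: [0, 32]
merge_method: slerp
base_model: teknium/OpenHermes-2.5-Mistral-7B
parameters:
  t:
    # Per-layer interpolation gradient: 0 keeps the base model's weights,
    # 1 takes the other model's; intermediate layers are interpolated.
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```

A config like this would typically be run with `mergekit-yaml config.yml ./merged-model`, which fits on a single A100 for 7B models.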
@@ -111,8 +111,33 @@ Please tell me about how merged models can benefit from existent top-models.<|im
 | | |ter | 0.6463|± |0.0039|
 |xnli_de | 0|acc | 0.4547|± |0.0070|
 |xnli_en | 0|acc | 0.5595|± |0.0070|
+```
+**BBH**
+```
+| Task |Version| Metric |Value | |Stderr|
+|------------------------------------------------|------:|---------------------|-----:|---|-----:|
+|bigbench_causal_judgement | 0|multiple_choice_grade|0.6053|± |0.0356|
+|bigbench_date_understanding | 0|multiple_choice_grade|0.6992|± |0.0239|
+|bigbench_disambiguation_qa | 0|multiple_choice_grade|0.3721|± |0.0302|
+|bigbench_geometric_shapes | 0|multiple_choice_grade|0.1671|± |0.0197|
+| | |exact_str_match |0.1003|± |0.0159|
+|bigbench_logical_deduction_five_objects | 0|multiple_choice_grade|0.2540|± |0.0195|
+|bigbench_logical_deduction_seven_objects | 0|multiple_choice_grade|0.2043|± |0.0152|
+|bigbench_logical_deduction_three_objects | 0|multiple_choice_grade|0.4667|± |0.0289|
+|bigbench_movie_recommendation | 0|multiple_choice_grade|0.3700|± |0.0216|
+|bigbench_navigate | 0|multiple_choice_grade|0.4970|± |0.0158|
+|bigbench_reasoning_about_colored_objects | 0|multiple_choice_grade|0.6965|± |0.0103|
+|bigbench_ruin_names | 0|multiple_choice_grade|0.4152|± |0.0233|
+|bigbench_salient_translation_error_detection | 0|multiple_choice_grade|0.1443|± |0.0111|
+|bigbench_snarks | 0|multiple_choice_grade|0.6464|± |0.0356|
+|bigbench_sports_understanding | 0|multiple_choice_grade|0.6846|± |0.0148|
+|bigbench_temporal_sequences | 0|multiple_choice_grade|0.3150|± |0.0147|
+|bigbench_tracking_shuffled_objects_five_objects | 0|multiple_choice_grade|0.2168|± |0.0117|
+|bigbench_tracking_shuffled_objects_seven_objects| 0|multiple_choice_grade|0.1537|± |0.0086|
+|bigbench_tracking_shuffled_objects_three_objects| 0|multiple_choice_grade|0.4667|± |0.0289|
 ```
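
The Stderr column in these tables gives the standard error of each estimate, so an approximate 95% confidence interval for any score is value ± 1.96 · stderr (normal approximation). A minimal sketch, using the bigbench_causal_judgement row as an example:

```python
# Approximate 95% confidence interval from a reported value and standard error,
# using the normal approximation (value +/- 1.96 * stderr).
def ci95(value: float, stderr: float) -> tuple[float, float]:
    half_width = 1.96 * stderr
    return (round(value - half_width, 4), round(value + half_width, 4))

# bigbench_causal_judgement: 0.6053 +/- 0.0356
low, high = ci95(0.6053, 0.0356)
print(low, high)  # roughly 0.5355 to 0.6751
```

Intervals this wide are worth keeping in mind when comparing merged models on small benchmark subsets.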
 
+
 ## Disclaimer
 We must inform users that despite our best efforts in data cleansing, the possibility of some such content slipping through cannot be entirely ruled out.
 However, we cannot guarantee consistently appropriate behavior. Therefore, if you encounter any issues or come across inappropriate content, we kindly request that you inform us through the contact information provided.
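
The prompt excerpt visible in the third hunk's header ("Please tell me about how merged models can benefit from existent top-models.<|im…") follows the ChatML format used by OpenHermes-2.5-based models. A minimal helper that assembles such a prompt; the message contents are taken from that excerpt, while the helper itself is an illustrative sketch rather than the model card's own code:

```python
# Build a ChatML-style prompt (<|im_start|>role ... <|im_end|>) as used by
# OpenHermes-2.5-based models. Illustrative sketch, not code from the model card.
def to_chatml(messages: list[dict]) -> str:
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # The trailing assistant header cues the model to begin its reply.
    return "\n".join(parts) + "\n<|im_start|>assistant\n"

prompt = to_chatml([
    {"role": "user",
     "content": "Please tell me about how merged models can benefit from existent top-models."},
])
print(prompt)
```

In practice one would pass a string like this to the tokenizer of SauerkrautLM-7b-HerO (or use the tokenizer's built-in chat template, if one is provided) before generation.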