Update README.md
README.md CHANGED
@@ -19,6 +19,8 @@ LoRA trained in 4-bit with 2k context using [LumiOpen/Viking-7B](https://hugging
 
 Dataset used is [mpasila/Finnish-Alpaca-Small](https://huggingface.co/datasets/mpasila/Finnish-Alpaca-Small).
 
+Re-trained because I have no idea whether I originally used the fully trained or a partially trained checkpoint of Viking-7B, since the final model had apparently only just been released. (After re-training, the score dropped noticeably, so I wonder if I screwed something up.)
+
 ### Prompt format: Alpaca
 It uses Alpaca format but with a translated instruction at the start:
 ```
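The template itself sits in the fenced block that this hunk truncates. As a rough sketch of the idea (standard Alpaca sections preceded by a translated, Finnish-language preamble), a builder like the one below would produce compatible prompts. The `PREAMBLE` string is a placeholder assumption, not the exact wording from the README:

```python
# Minimal sketch of an Alpaca-style prompt with a translated instruction
# preamble at the start. PREAMBLE is an illustrative placeholder; the
# exact Finnish wording used in training is in the README's template block.
PREAMBLE = "<translated Alpaca preamble in Finnish>"  # placeholder assumption

def build_prompt(instruction: str, input_text: str = "") -> str:
    """Assemble an Alpaca-format prompt, with an optional input section."""
    if input_text:
        return (
            f"{PREAMBLE}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            f"### Response:\n"
        )
    return (
        f"{PREAMBLE}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n"
    )

print(build_prompt("Kerro lyhyesti Suomen historiasta."))
```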
@@ -32,8 +34,8 @@ It uses Alpaca format but with a translated instruction at the start:
 
 | Model | Size | Type | FIN-bench (score) |
 |-------|------|------|-------|
-| **mpasila/Finnish-Alpaca-Small-7B** | 7B | Instruct | |
-| [mpasila/Finnish-Alpaca-Tiny-V2-7B](https://huggingface.co/mpasila/Finnish-Alpaca-Tiny-V2-7B) | 7B | Instruct | 0.4654 |
+| **mpasila/Finnish-Alpaca-Small-7B** | 7B | Instruct | 0.3586 |
+| [mpasila/Finnish-Alpaca-Tiny-V2-7B](https://huggingface.co/mpasila/Finnish-Alpaca-Tiny-V2-7B) | 7B | Instruct | **0.4654** |
 | [mpasila/Alpacazord-Viking-7B](https://huggingface.co/mpasila/Alpacazord-Viking-7B) | 7B | Instruct | 0.4123 |
 | [mpasila/NordicAlpaca-Finnish-V1-7B](https://huggingface.co/mpasila/NordicAlpaca-Finnish-V1-7B) | 7B | Instruct | 0.3891 |
 | [mpasila/Finnish-Viking-Alpaca-V1-7B](https://huggingface.co/mpasila/Finnish-Viking-Alpaca-V1-7B) | 7B | Instruct | 0.3943 |
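The first hunk's header notes the LoRA was trained in 4-bit with a 2k context on top of Viking-7B, so 4-bit is also a sensible way to load the result for a quick try. Below is a minimal inference sketch with transformers + bitsandbytes, assuming the mpasila/Finnish-Alpaca-Small-7B repo hosts merged full weights (if it only held a LoRA adapter, it would need to be applied to LumiOpen/Viking-7B with peft). This is generic loading code, not the author's training setup:

```python
# Sketch: load the model in 4-bit for inference (assumption: the repo
# contains merged weights, not just a LoRA adapter).
# Requires transformers, accelerate, and bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mpasila/Finnish-Alpaca-Small-7B"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Alpaca-style prompt (the translated preamble is omitted here for brevity).
prompt = "### Instruction:\nKerro lyhyesti Suomen historiasta.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```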
@@ -47,6 +49,44 @@ It uses Alpaca format but with a translated instruction at the start:
 
 #### FIN-bench scores:
 
+| Task                                            |Version| Metric              |Value |   |Stderr|
+|-------------------------------------------------|------:|---------------------|-----:|---|-----:|
+|bigbench_analogies                               |      0|multiple_choice_grade|0.5923|±  |0.0433|
+|bigbench_arithmetic_1_digit_addition             |      0|multiple_choice_grade|0.2700|±  |0.0446|
+|bigbench_arithmetic_1_digit_division             |      0|multiple_choice_grade|0.4783|±  |0.1065|
+|bigbench_arithmetic_1_digit_multiplication       |      0|multiple_choice_grade|0.2600|±  |0.0441|
+|bigbench_arithmetic_1_digit_subtraction          |      0|multiple_choice_grade|0.2200|±  |0.0416|
+|bigbench_arithmetic_2_digit_addition             |      0|multiple_choice_grade|0.1700|±  |0.0378|
+|bigbench_arithmetic_2_digit_division             |      0|multiple_choice_grade|0.3600|±  |0.0482|
+|bigbench_arithmetic_2_digit_multiplication       |      0|multiple_choice_grade|0.2000|±  |0.0402|
+|bigbench_arithmetic_2_digit_subtraction          |      0|multiple_choice_grade|0.1300|±  |0.0338|
+|bigbench_arithmetic_3_digit_addition             |      0|multiple_choice_grade|0.3100|±  |0.0465|
+|bigbench_arithmetic_3_digit_division             |      0|multiple_choice_grade|0.2100|±  |0.0409|
+|bigbench_arithmetic_3_digit_multiplication       |      0|multiple_choice_grade|0.1600|±  |0.0368|
+|bigbench_arithmetic_3_digit_subtraction          |      0|multiple_choice_grade|0.2300|±  |0.0423|
+|bigbench_arithmetic_4_digit_addition             |      0|multiple_choice_grade|0.3900|±  |0.0490|
+|bigbench_arithmetic_4_digit_division             |      0|multiple_choice_grade|0.2300|±  |0.0423|
+|bigbench_arithmetic_4_digit_multiplication       |      0|multiple_choice_grade|0.2100|±  |0.0409|
+|bigbench_arithmetic_4_digit_subtraction          |      0|multiple_choice_grade|0.4500|±  |0.0500|
+|bigbench_arithmetic_5_digit_addition             |      0|multiple_choice_grade|0.4800|±  |0.0502|
+|bigbench_arithmetic_5_digit_division             |      0|multiple_choice_grade|0.0700|±  |0.0256|
+|bigbench_arithmetic_5_digit_multiplication       |      0|multiple_choice_grade|0.1700|±  |0.0378|
+|bigbench_arithmetic_5_digit_subtraction          |      0|multiple_choice_grade|0.5800|±  |0.0496|
+|bigbench_cause_and_effect_one_sentence           |      0|multiple_choice_grade|0.6275|±  |0.0684|
+|bigbench_cause_and_effect_one_sentence_no_prompt |      0|multiple_choice_grade|0.6667|±  |0.0667|
+|bigbench_cause_and_effect_two_sentences          |      0|multiple_choice_grade|0.5098|±  |0.0707|
+|bigbench_emotions                                |      0|multiple_choice_grade|0.3312|±  |0.0373|
+|bigbench_empirical_judgments                     |      0|multiple_choice_grade|0.3333|±  |0.0476|
+|bigbench_general_knowledge                       |      0|multiple_choice_grade|0.2857|±  |0.0544|
+|bigbench_hhh_alignment_harmless                  |      0|multiple_choice_grade|0.3793|±  |0.0643|
+|bigbench_hhh_alignment_helpful                   |      0|multiple_choice_grade|0.3559|±  |0.0629|
+|bigbench_hhh_alignment_honest                    |      0|multiple_choice_grade|0.3559|±  |0.0629|
+|bigbench_hhh_alignment_other                     |      0|multiple_choice_grade|0.5349|±  |0.0770|
+|bigbench_intent_recognition                      |      0|multiple_choice_grade|0.1546|±  |0.0138|
+|bigbench_misconceptions                          |      0|multiple_choice_grade|0.5448|±  |0.0432|
+|bigbench_paraphrase                              |      0|multiple_choice_grade|0.5300|±  |0.0354|
+|bigbench_sentence_ambiguity                      |      0|multiple_choice_grade|0.4333|±  |0.0645|
+|bigbench_similarities_abstraction                |      0|multiple_choice_grade|0.6974|±  |0.0530|
 
 # Uploaded model
 
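One sanity check worth recording: the headline 0.3586 in the comparison table matches (to four decimals) the plain unweighted mean of the 36 per-task multiple_choice_grade values added above. A quick check:

```python
# Quick check: the headline score 0.3586 is the unweighted mean of the
# 36 per-task multiple_choice_grade values from the FIN-bench table above.
scores = [
    0.5923, 0.2700, 0.4783, 0.2600, 0.2200,  # analogies, 1-digit arithmetic
    0.1700, 0.3600, 0.2000, 0.1300,          # 2-digit arithmetic
    0.3100, 0.2100, 0.1600, 0.2300,          # 3-digit arithmetic
    0.3900, 0.2300, 0.2100, 0.4500,          # 4-digit arithmetic
    0.4800, 0.0700, 0.1700, 0.5800,          # 5-digit arithmetic
    0.6275, 0.6667, 0.5098,                  # cause_and_effect
    0.3312, 0.3333, 0.2857,                  # emotions, empirical, general_knowledge
    0.3793, 0.3559, 0.3559, 0.5349,          # hhh_alignment
    0.1546, 0.5448, 0.5300, 0.4333, 0.6974,  # intent ... similarities_abstraction
]
assert len(scores) == 36
print(round(sum(scores) / len(scores), 4))  # -> 0.3586
```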