Update README.md
README.md

@@ -32,7 +32,7 @@ It uses Alpaca format but with a translated instruction at the start:
 
 | Model | Size | Type | FIN-bench (score) |
 |-------|------|------|-------|
-| **mpasila/Finnish-Alpaca-Small-7B** | 7B | Instruct |
+| **mpasila/Finnish-Alpaca-Small-7B** | 7B | Instruct | |
 | [mpasila/Finnish-Alpaca-Tiny-V2-7B](https://huggingface.co/mpasila/Finnish-Alpaca-Tiny-V2-7B) | 7B | Instruct | 0.4654 |
 | [mpasila/Alpacazord-Viking-7B](https://huggingface.co/mpasila/Alpacazord-Viking-7B) | 7B | Instruct | 0.4123 |
 | [mpasila/NordicAlpaca-Finnish-V1-7B](https://huggingface.co/mpasila/NordicAlpaca-Finnish-V1-7B) | 7B | Instruct | 0.3891 |
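The hunk context above notes that the dataset uses Alpaca format but with a translated instruction at the start. As a rough illustration, a prompt of that shape could be assembled as below; the Finnish preamble wording and the `build_alpaca_prompt` helper are assumptions for illustration, not taken from the repository.

```python
# Sketch of an Alpaca-style prompt with a translated preamble.
# TRANSLATED_PREAMBLE is a hypothetical Finnish rendering of the standard
# Alpaca instruction header, not the exact text used by the dataset.
TRANSLATED_PREAMBLE = (
    "Alla on ohje, joka kuvaa tehtävän. "
    "Kirjoita vastaus, joka täyttää pyynnön asianmukaisesti."
)

def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Assemble an Alpaca-format prompt with the translated preamble first."""
    if input_text:
        return (
            f"{TRANSLATED_PREAMBLE}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            f"### Response:\n"
        )
    return (
        f"{TRANSLATED_PREAMBLE}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Response:\n"
    )

print(build_alpaca_prompt("Käännä englanniksi.", "Hyvää huomenta!"))
```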
@@ -47,44 +47,6 @@ It uses Alpaca format but with a translated instruction at the start:
 
 #### FIN-bench scores:
 
-| Task |Version| Metric |Value | |Stderr|
-|------------------------------------------------|------:|---------------------|-----:|---|-----:|
-|bigbench_analogies | 0|multiple_choice_grade|0.5308|± |0.0439|
-|bigbench_arithmetic_1_digit_addition | 0|multiple_choice_grade|0.5000|± |0.0503|
-|bigbench_arithmetic_1_digit_division | 0|multiple_choice_grade|0.8261|± |0.0808|
-|bigbench_arithmetic_1_digit_multiplication | 0|multiple_choice_grade|0.4600|± |0.0501|
-|bigbench_arithmetic_1_digit_subtraction | 0|multiple_choice_grade|0.6000|± |0.0492|
-|bigbench_arithmetic_2_digit_addition | 0|multiple_choice_grade|0.3800|± |0.0488|
-|bigbench_arithmetic_2_digit_division | 0|multiple_choice_grade|0.5200|± |0.0502|
-|bigbench_arithmetic_2_digit_multiplication | 0|multiple_choice_grade|0.2800|± |0.0451|
-|bigbench_arithmetic_2_digit_subtraction | 0|multiple_choice_grade|0.5100|± |0.0502|
-|bigbench_arithmetic_3_digit_addition | 0|multiple_choice_grade|0.5600|± |0.0499|
-|bigbench_arithmetic_3_digit_division | 0|multiple_choice_grade|0.3800|± |0.0488|
-|bigbench_arithmetic_3_digit_multiplication | 0|multiple_choice_grade|0.2700|± |0.0446|
-|bigbench_arithmetic_3_digit_subtraction | 0|multiple_choice_grade|0.5400|± |0.0501|
-|bigbench_arithmetic_4_digit_addition | 0|multiple_choice_grade|0.5400|± |0.0501|
-|bigbench_arithmetic_4_digit_division | 0|multiple_choice_grade|0.4000|± |0.0492|
-|bigbench_arithmetic_4_digit_multiplication | 0|multiple_choice_grade|0.3300|± |0.0473|
-|bigbench_arithmetic_4_digit_subtraction | 0|multiple_choice_grade|0.6100|± |0.0490|
-|bigbench_arithmetic_5_digit_addition | 0|multiple_choice_grade|0.6500|± |0.0479|
-|bigbench_arithmetic_5_digit_division | 0|multiple_choice_grade|0.3300|± |0.0473|
-|bigbench_arithmetic_5_digit_multiplication | 0|multiple_choice_grade|0.3200|± |0.0469|
-|bigbench_arithmetic_5_digit_subtraction | 0|multiple_choice_grade|0.6500|± |0.0479|
-|bigbench_cause_and_effect_one_sentence | 0|multiple_choice_grade|0.5490|± |0.0704|
-|bigbench_cause_and_effect_one_sentence_no_prompt| 0|multiple_choice_grade|0.6471|± |0.0676|
-|bigbench_cause_and_effect_two_sentences | 0|multiple_choice_grade|0.4314|± |0.0700|
-|bigbench_emotions | 0|multiple_choice_grade|0.3500|± |0.0378|
-|bigbench_empirical_judgments | 0|multiple_choice_grade|0.3131|± |0.0468|
-|bigbench_general_knowledge | 0|multiple_choice_grade|0.2429|± |0.0516|
-|bigbench_hhh_alignment_harmless | 0|multiple_choice_grade|0.3793|± |0.0643|
-|bigbench_hhh_alignment_helpful | 0|multiple_choice_grade|0.3559|± |0.0629|
-|bigbench_hhh_alignment_honest | 0|multiple_choice_grade|0.3559|± |0.0629|
-|bigbench_hhh_alignment_other | 0|multiple_choice_grade|0.5581|± |0.0766|
-|bigbench_intent_recognition | 0|multiple_choice_grade|0.2240|± |0.0159|
-|bigbench_misconceptions | 0|multiple_choice_grade|0.5373|± |0.0432|
-|bigbench_paraphrase | 0|multiple_choice_grade|0.5000|± |0.0354|
-|bigbench_sentence_ambiguity | 0|multiple_choice_grade|0.4833|± |0.0651|
-|bigbench_similarities_abstraction | 0|multiple_choice_grade|0.7237|± |0.0516|
 
 # Uploaded model
 
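The model table reports a single FIN-bench number per model, while the removed rows above are per-task scores. As a minimal sketch of how such an aggregate could be formed, the snippet below averages a few of the subtask `multiple_choice_grade` values; whether the headline score is actually an unweighted mean over all tasks is an assumption, not confirmed by the repository.

```python
# A few of the per-task scores from the removed FIN-bench table.
task_scores = {
    "bigbench_analogies": 0.5308,
    "bigbench_arithmetic_1_digit_division": 0.8261,
    "bigbench_general_knowledge": 0.2429,
    "bigbench_similarities_abstraction": 0.7237,
}

# Unweighted mean over the selected subtasks (aggregation method assumed).
mean_score = sum(task_scores.values()) / len(task_scores)
print(f"{mean_score:.4f}")  # → 0.5809
```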