fedric95 committed on
Commit
c698b87
1 Parent(s): 4127b2b

Upload ./Qwen2-7B-Q3_K_S.mmlu.pro.txt with huggingface_hub

Files changed (1)
  1. Qwen2-7B-Q3_K_S.mmlu.pro.txt +2 -80
Qwen2-7B-Q3_K_S.mmlu.pro.txt CHANGED
@@ -1,80 +1,2 @@
- multiple_choice_score: there are 70 tasks in prompt
- multiple_choice_score: reading tasks......................................................................done
- multiple_choice_score: preparing task data......................................................................done
- multiple_choice_score : calculating TruthfulQA score over 70 tasks.
-
- task acc_norm
- 1 0.00000000
- 2 0.00000000
- 3 0.00000000
- 4 0.00000000
- 5 20.00000000
- 6 33.33333333
- 7 28.57142857
- 8 25.00000000
- 9 22.22222222
- 10 30.00000000
- 11 27.27272727
- 12 25.00000000
- 13 23.07692308
- 14 28.57142857
- 15 33.33333333
- 16 31.25000000
- 17 29.41176471
- 18 27.77777778
- 19 26.31578947
- 20 25.00000000
- 21 23.80952381
- 22 22.72727273
- 23 21.73913043
- 24 20.83333333
- 25 20.00000000
- 26 19.23076923
- 27 18.51851852
- 28 17.85714286
- 29 17.24137931
- 30 20.00000000
- 31 19.35483871
- 32 18.75000000
- 33 21.21212121
- 34 20.58823529
- 35 20.00000000
- 36 19.44444444
- 37 18.91891892
- 38 18.42105263
- 39 17.94871795
- 40 20.00000000
- 41 19.51219512
- 42 19.04761905
- 43 18.60465116
- 44 18.18181818
- 45 17.77777778
- 46 19.56521739
- 47 19.14893617
- 48 20.83333333
- 49 20.40816327
- 50 20.00000000
- 51 19.60784314
- 52 19.23076923
- 53 20.75471698
- 54 20.37037037
- 55 21.81818182
- 56 21.42857143
- 57 21.05263158
- 58 20.68965517
- 59 20.33898305
- 60 20.00000000
- 61 19.67213115
- 62 19.35483871
- 63 19.04761905
- 64 20.31250000
- 65 21.53846154
- 66 21.21212121
- 67 20.89552239
- 68 20.58823529
- 69 20.28985507
- 70 20.00000000
-
- Final result: 20.0000 +/- 4.8154
- Random chance: 10.0000 +/- 3.6116
-
+ multiple_choice_score: there are 12032 tasks in prompt
+ multiple_choice_score: reading tasksmultiple_choice_score: failed to read task 1 of 12032