leaderboard-pt-pr-bot committed on
Commit bd62fa7
1 Parent(s): 6800f29

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +167 -3
README.md CHANGED
@@ -1,17 +1,164 @@
  ---
+ language:
+ - pt
+ license: apache-2.0
  library_name: transformers
  tags:
  - Misral
  - Portuguese
  - 7b
- license: apache-2.0
  base_model: meta-llama/Llama-2-13b-chat-hf
  datasets:
  - pablo-moreira/gpt4all-j-prompt-generations-pt
  - rhaymison/superset
- language:
- - pt
  pipeline_tag: text-generation
+ model-index:
+ - name: Llama-portuguese-13b-Luana-v0.2
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: ENEM Challenge (No Images)
+       type: eduagarcia/enem_challenge
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 36.95
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Llama-portuguese-13b-Luana-v0.2
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BLUEX (No Images)
+       type: eduagarcia-temp/BLUEX_without_images
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 32.68
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Llama-portuguese-13b-Luana-v0.2
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: OAB Exams
+       type: eduagarcia/oab_exams
+       split: train
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc
+       value: 33.3
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Llama-portuguese-13b-Luana-v0.2
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 RTE
+       type: assin2
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 65.83
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Llama-portuguese-13b-Luana-v0.2
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Assin2 STS
+       type: eduagarcia/portuguese_benchmark
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: pearson
+       value: 42.81
+       name: pearson
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Llama-portuguese-13b-Luana-v0.2
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: FaQuAD NLI
+       type: ruanchaves/faquad-nli
+       split: test
+       args:
+         num_few_shot: 15
+     metrics:
+     - type: f1_macro
+       value: 40.44
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Llama-portuguese-13b-Luana-v0.2
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HateBR Binary
+       type: ruanchaves/hatebr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 83.62
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Llama-portuguese-13b-Luana-v0.2
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: PT Hate Speech Binary
+       type: hate_speech_portuguese
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 54.62
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Llama-portuguese-13b-Luana-v0.2
+       name: Open Portuguese LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: tweetSentBR
+       type: eduagarcia-temp/tweetsentbr
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: f1_macro
+       value: 49.25
+       name: f1-macro
+     source:
+       url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=rhaymison/Llama-portuguese-13b-Luana-v0.2
+       name: Open Portuguese LLM Leaderboard
  ---
  
  # Mistral-portuguese-luana-7b
@@ -80,3 +227,20 @@ email: [email protected]
  <a href="https://github.com/rhaymisonbetini" target="_blank">
  <img src="https://img.shields.io/badge/GitHub-100000?style=for-the-badge&logo=github&logoColor=white">
  </a>
+
+ # [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/rhaymison/Llama-portuguese-13b-Luana-v0.2)
+
+ | Metric | Value |
+ |--------------------------|---------|
+ |Average |**48.83**|
+ |ENEM Challenge (No Images)| 36.95|
+ |BLUEX (No Images) | 32.68|
+ |OAB Exams | 33.30|
+ |Assin2 RTE | 65.83|
+ |Assin2 STS | 42.81|
+ |FaQuAD NLI | 40.44|
+ |HateBR Binary | 83.62|
+ |PT Hate Speech Binary | 54.62|
+ |tweetSentBR | 49.25|
+
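The Average row in the results table is consistent with an unweighted mean of the nine task scores. The short Python sketch below checks that arithmetic. The `scores` dictionary and variable names are illustrative, and the equal-weight averaging is an assumption inferred from the numbers; the PR itself does not state how the leaderboard computes the average.

```python
# Task scores copied from the results table added by this PR.
scores = {
    "ENEM Challenge (No Images)": 36.95,
    "BLUEX (No Images)": 32.68,
    "OAB Exams": 33.30,
    "Assin2 RTE": 65.83,
    "Assin2 STS": 42.81,
    "FaQuAD NLI": 40.44,
    "HateBR Binary": 83.62,
    "PT Hate Speech Binary": 54.62,
    "tweetSentBR": 49.25,
}

# Assumption: every task carries equal weight in the average.
average = round(sum(scores.values()) / len(scores), 2)

print(f"Average over {len(scores)} tasks: {average:.2f}")
```

Under that assumption the nine scores sum to 439.50, and dividing by 9 gives 48.83 after rounding to two decimals, which matches the reported Average.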