Commit 8c082fc by leaderboard-pt-pr-bot
1 parent: 5679cd8

Adding the Open Portuguese LLM Leaderboard Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard

The purpose of this PR is to add evaluation results from the Open Portuguese LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/eduagarcia-temp/portuguese-leaderboard-results-to-modelcard/discussions

Files changed (1)
  1. README.md +138 -0
README.md CHANGED
@@ -1,6 +1,127 @@
 ---
 language:
 - pt
+model-index:
+- name: sabia-7b
+  results:
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: ENEM Challenge (No Images)
+      type: eduagarcia/enem_challenge
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 55.07
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=maritaca-ai/sabia-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: BLUEX (No Images)
+      type: eduagarcia-temp/BLUEX_without_images
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 47.71
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=maritaca-ai/sabia-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: OAB Exams
+      type: eduagarcia/oab_exams
+      split: train
+      args:
+        num_few_shot: 3
+    metrics:
+    - type: acc
+      value: 41.41
+      name: accuracy
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=maritaca-ai/sabia-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: Assin2 RTE
+      type: assin2
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 46.68
+      name: f1-macro
+    - type: pearson
+      value: 1.89
+      name: pearson
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=maritaca-ai/sabia-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: FaQuAD NLI
+      type: ruanchaves/faquad-nli
+      split: test
+      args:
+        num_few_shot: 15
+    metrics:
+    - type: f1_macro
+      value: 58.34
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=maritaca-ai/sabia-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: HateBR Binary
+      type: eduagarcia/portuguese_benchmark
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 61.93
+      name: f1-macro
+    - type: f1_macro
+      value: 64.13
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=maritaca-ai/sabia-7b
+      name: Open Portuguese LLM Leaderboard
+  - task:
+      type: text-generation
+      name: Text Generation
+    dataset:
+      name: tweetSentBR
+      type: eduagarcia-temp/tweetsentbr
+      split: test
+      args:
+        num_few_shot: 25
+    metrics:
+    - type: f1_macro
+      value: 46.64
+      name: f1-macro
+    source:
+      url: https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=maritaca-ai/sabia-7b
+      name: Open Portuguese LLM Leaderboard
 ---
 
 Sabiá-7B is Portuguese language model developed by [Maritaca AI](https://www.maritaca.ai/).
@@ -117,3 +238,20 @@ Please use the following bibtex to cite our paper:
 isbn="978-3-031-45392-2"
 }
 ```
+
+# [Open Portuguese LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/maritaca-ai/sabia-7b)
+
+|          Metric          |  Value  |
+|--------------------------|---------|
+|Average                   |**47.09**|
+|ENEM Challenge (No Images)|    55.07|
+|BLUEX (No Images)         |    47.71|
+|OAB Exams                 |    41.41|
+|Assin2 RTE                |    46.68|
+|Assin2 STS                |     1.89|
+|FaQuAD NLI                |    58.34|
+|HateBR Binary             |    61.93|
+|PT Hate Speech Binary     |    64.13|
+|tweetSentBR               |    46.64|
+
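
The Average row in the results table can be cross-checked against the nine per-task scores. The sketch below assumes the leaderboard average is the unweighted arithmetic mean rounded to two decimals; the PR does not state the formula, but the reported numbers are consistent with it. The `scores` dictionary and the `leaderboard_average` helper are illustrative names, not part of the bot or the leaderboard code.

```python
from statistics import mean

# Per-task scores copied from the results table in this PR.
scores = {
    "ENEM Challenge (No Images)": 55.07,
    "BLUEX (No Images)": 47.71,
    "OAB Exams": 41.41,
    "Assin2 RTE": 46.68,
    "Assin2 STS": 1.89,
    "FaQuAD NLI": 58.34,
    "HateBR Binary": 61.93,
    "PT Hate Speech Binary": 64.13,
    "tweetSentBR": 46.64,
}


def leaderboard_average(task_scores):
    """Unweighted mean of all task scores, rounded to two decimals.

    Assumption: every task carries equal weight in the average.
    """
    return round(mean(task_scores.values()), 2)


if __name__ == "__main__":
    print(leaderboard_average(scores))  # 47.09, matching the Average row
```

Under this assumption the near-zero Assin2 STS score (1.89) pulls the mean down noticeably: dropping that single task would raise the average of the remaining eight to roughly 52.7.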