This is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.2, trained with Unsloth on a Portuguese instruction dataset in an attempt to improve the model's performance in Portuguese.
No benchmarks have been executed yet.
The original prompt format was used:
`<s>[INST] {Prompt goes here} [/INST]`
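As a minimal sketch of how this template could be filled in, the helper below wraps a user message in the `[INST]` markers. The function name `build_prompt` and the `include_bos` parameter are hypothetical and not part of the model or any library. The flag exists because many tokenizers prepend the `<s>` token on their own, in which case the literal one should probably be left out to avoid duplicating it.

```python
def build_prompt(user_message: str, include_bos: bool = True) -> str:
    """Wrap a user message in the Mistral instruct template shown above.

    `include_bos` is an illustrative option: set it to False when the
    tokenizer already adds the beginning-of-sequence token `<s>` itself.
    """
    bos = "<s>" if include_bos else ""
    # Strip stray whitespace so the spacing around the markers stays fixed.
    return f"{bos}[INST] {user_message.strip()} [/INST]"


if __name__ == "__main__":
    # Example with a Portuguese instruction.
    print(build_prompt("Explique o que é aprendizado de máquina."))
```

The resulting string can then be passed to whichever inference stack loads the model; the generated answer is expected to follow the closing `[/INST]` marker.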
Open Portuguese LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Average | 64.7 |
ENEM Challenge (No Images) | 58.08 |
BLUEX (No Images) | 48.68 |
OAB Exams | 37.08 |
Assin2 RTE | 90.31 |
Assin2 STS | 76.55 |
FaQuAD NLI | 58.84 |
HateBR Binary | 79.21 |
PT Hate Speech Binary | 68.87 |
tweetSentBR | 64.71 |
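The reported average is consistent with an unweighted mean of the nine task scores in the table. The short check below assumes that this is how the leaderboard aggregates, which the card itself does not state; the dictionary name `scores` is illustrative.

```python
# Task scores copied from the table above.
scores = {
    "ENEM Challenge (No Images)": 58.08,
    "BLUEX (No Images)": 48.68,
    "OAB Exams": 37.08,
    "Assin2 RTE": 90.31,
    "Assin2 STS": 76.55,
    "FaQuAD NLI": 58.84,
    "HateBR Binary": 79.21,
    "PT Hate Speech Binary": 68.87,
    "tweetSentBR": 64.71,
}

# Assumption: the "Average" row is a plain arithmetic mean over all tasks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```

Note that the tasks use different metrics (accuracy, macro F1, Pearson correlation), so the average is a rough summary rather than a single comparable quantity.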
Evaluation results
- ENEM Challenge (No Images): accuracy 58.08 (Open Portuguese LLM Leaderboard)
- BLUEX (No Images): accuracy 48.68 (Open Portuguese LLM Leaderboard)
- OAB Exams: accuracy 37.08 (Open Portuguese LLM Leaderboard)
- Assin2 RTE, test set: f1-macro 90.31 (Open Portuguese LLM Leaderboard)
- Assin2 STS, test set: pearson 76.55 (Open Portuguese LLM Leaderboard)
- FaQuAD NLI, test set: f1-macro 58.84 (Open Portuguese LLM Leaderboard)
- HateBR Binary, test set: f1-macro 79.21 (Open Portuguese LLM Leaderboard)
- PT Hate Speech Binary, test set: f1-macro 68.87 (Open Portuguese LLM Leaderboard)
- tweetSentBR, test set: f1-macro 64.71 (Open Portuguese LLM Leaderboard)