leaderboard-pr-bot committed
Commit b2f64c0
1 Parent(s): 7ddc381
Adding Evaluation Results
This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr
The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.
If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions
README.md CHANGED
@@ -72,4 +72,17 @@ For additional information or inquiries about FinOPT-Lincoln, please contact the
 FinOPT-Lincoln is an AI language model trained by Maya Philippines. It is provided "as is" without warranty of any kind, express or implied. The model developers and Maya Philippines shall not be liable for any direct or indirect damages arising from the use of this model.
 
 ## Acknowledgments
-The development of FinOPT-Lincoln was made possible by Maya Philippines and the curation and creation of the financial question-answering dataset.
+The development of FinOPT-Lincoln was made possible by Maya Philippines and the curation and creation of the financial question-answering dataset.
+# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MayaPH__FinOPT-Lincoln)
+
+| Metric | Value |
+|-----------------------|---------------------------|
+| Avg. | 25.2 |
+| ARC (25-shot) | 26.71 |
+| HellaSwag (10-shot) | 25.6 |
+| MMLU (5-shot) | 23.0 |
+| TruthfulQA (0-shot) | 50.59 |
+| Winogrande (5-shot) | 49.72 |
+| GSM8K (5-shot) | 0.0 |
+| DROP (3-shot) | 0.76 |
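The `Avg.` row is consistent with an unweighted mean of the seven benchmark scores in the table. The PR does not state how the average was computed, so the sketch below is only a sanity check under that assumption. The `scores` dictionary and the `average` variable are illustrative names, not part of the leaderboard tooling.

```python
# Scores copied from the evaluation table added to README.md in this PR.
scores = {
    "ARC (25-shot)": 26.71,
    "HellaSwag (10-shot)": 25.6,
    "MMLU (5-shot)": 23.0,
    "TruthfulQA (0-shot)": 50.59,
    "Winogrande (5-shot)": 49.72,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 0.76,
}

# Assumption: "Avg." is the plain arithmetic mean of the seven benchmarks.
average = sum(scores.values()) / len(scores)

# Round to one decimal place to compare against the reported 25.2.
print(round(average, 1))
```

Under this assumption the mean is about 25.197, which rounds to the reported 25.2.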