Merge pull request #15 from miragecoa/main

Update README.md with contribution guidelines.

Files changed:
- README.md (+44 −25)
- src/about.py (+6 −8)

README.md (changed):
The Open Financial LLM Leaderboard aims to set a new standard in evaluating the capabilities of language models in the financial domain, offering a specialized, real-world-focused benchmarking solution.

# Contribute to OFLL

To make the leaderboard more accessible for external contributors, we offer clear guidelines for adding tasks, updating result files, and other maintenance activities.
1. **Primary Files**:
   - `src/env.py`: Modify variables like repository paths for customization.
   - `src/about.py`: Update task configurations here to add new datasets.
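For orientation, here is a minimal sketch of the kind of repository-path variables `src/env.py` centralizes. Apart from the `TheFinAI/results` dataset referenced later in this README, every name below is an assumption for illustration, not the repo's actual content:

```python
import os

# Hypothetical sketch of src/env.py-style configuration; variable names
# and repo paths (other than TheFinAI/results) are assumptions.
OWNER = "TheFinAI"
RESULTS_REPO = f"{OWNER}/results"      # dataset holding the result JSON files
QUEUE_REPO = f"{OWNER}/requests"       # assumed name for the eval-queue dataset
TOKEN = os.environ.get("HF_TOKEN")     # write token, read from the environment
```

Keeping these in one module means a fork only needs to edit a single file to point the Space at its own datasets.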
2. **Adding New Tasks**:
   - Navigate to `src/about.py` and specify new tasks in the `Tasks` enum section.
   - Each task requires details such as `benchmark`, `metric`, `col_name`, and `category`. For example:

     ```python
     taskX = Task("DatasetName", "MetricType", "ColumnName", category="Category")
     ```
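Concretely, the enum pattern above can be sketched as follows; the `Task` dataclass and the sample entries are illustrative assumptions rather than the exact definitions in `src/about.py`:

```python
from dataclasses import dataclass
from enum import Enum

# Illustrative stand-in for the Task class; the field names follow the
# guideline above (benchmark, metric, col_name, category).
@dataclass(frozen=True)
class Task:
    benchmark: str
    metric: str
    col_name: str
    category: str = "General"

class Tasks(Enum):
    # Hypothetical entries -- replace with real dataset and metric names.
    task0 = Task("DatasetName", "MetricType", "ColumnName", category="Category")
    task1 = Task("AnotherDataset", "acc", "AnotherColumn", category="Forecasting")

# The leaderboard can then derive its display columns by iterating the enum,
# which preserves definition order.
columns = [t.value.col_name for t in Tasks]
```

Because each member wraps a `Task` instance, adding a dataset is a one-line change and downstream table-building code picks it up automatically.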
3. **Updating Results Files**:
   - Results files should be in JSON format and structured as follows:

     ```json
     {
       "config": {
         "model_dtype": "torch.float16",
         "model_name": "path of the model on the hub: org/model",
         "model_sha": "revision on the hub"
       },
       "results": {
         "task_name": {
           "metric_name": score
         },
         "task_name2": {
           "metric_name": score
         }
       }
     }
     ```
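As a sketch of producing a file in this shape — the model name, revision, task names, and scores below are placeholders, not real results:

```python
import json

# Placeholder results following the structure shown above; every value
# here is illustrative.
results = {
    "config": {
        "model_dtype": "torch.float16",
        "model_name": "org/model",   # path of the model on the hub
        "model_sha": "main",         # revision on the hub
    },
    "results": {
        "task_name": {"metric_name": 0.42},
        "task_name2": {"metric_name": 0.87},
    },
}

# Serialize and re-parse as a quick structural sanity check before uploading.
payload = json.dumps(results, indent=2)
loaded = json.loads(payload)
assert set(loaded) == {"config", "results"}
assert all(isinstance(scores, dict) for scores in loaded["results"].values())
```

Running a round-trip check like this before pushing to the results dataset catches malformed files early, since the leaderboard parses every JSON file it finds.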
4. **Updating Leaderboard Data**:
   - When a new task is added, ensure that the results JSON files reflect this update. This process will be automated in future releases.
   - Access the current results at [Hugging Face Datasets](https://huggingface.co/datasets/TheFinAI/results/tree/main/demo-leaderboard).
5. **Useful Links**:
   - [Hugging Face Leaderboard Documentation](https://huggingface.co/docs/leaderboards/en/leaderboards/building_page)
   - [OFLL Demo on Hugging Face](https://huggingface.co/spaces/finosfoundation/Open-Financial-LLM-Leaderboard)
If you encounter problems on the Space, don't hesitate to restart it to remove the eval-queue, eval-queue-bk, eval-results, and eval-results-bk folders it creates.

# Code logic for more complex edits
src/about.py (changed):
CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
CITATION_BUTTON_TEXT = r"""
@article{Xie2024FinBen,
    title={FinBen: A Holistic Financial Benchmark for Large Language Models},
    author={Qianqian Xie and Weiguang Han and Zhengyu Chen and Ruoyu Xiang and Xiao Zhang and Yueru He and Mengxi Xiao and Dong Li and Yongfu Dai and Duanyu Feng and Yijing Xu and Haoqiang Kang and Ziyan Kuang and Chenhan Yuan and Kailai Yang and Zheheng Luo and Tianlin Zhang and Zhiwei Liu and Guojun Xiong and Zhiyang Deng and Yuechen Jiang and Zhiyuan Yao and Haohang Li and Yangyang Yu and Gang Hu and Jiajia Huang and Xiao-Yang Liu and Alejandro Lopez-Lira and Benyou Wang and Yanzhao Lai and Hao Wang and Min Peng and Sophia Ananiadou and Jimin Huang},
    journal={NeurIPS, Special Track on Datasets and Benchmarks},
    year={2024},
}
"""