Update README.md
README.md
CHANGED
@@ -201,8 +201,10 @@ and music knowledge & music summary data) from MusicPile. Check our [paper](http
 ## Evaluation

 1. Music understanding abilities are evaluated on the [MusicTheoryBench](https://huggingface.co/datasets/m-a-p/MusicTheoryBench). The following figure is zero-shot accuracy on MusicTheoryBench.
-We included GPT-3.5, GPT-4, LLaMA2-7B-Base, ChatMusician-Base, and ChatMusician. The blue bar represents the performance on the music knowledge metric, and the red bar represents the music reasoning metric. The dashed line corresponds to a random baseline, with a score of 25
-
+We included GPT-3.5, GPT-4, LLaMA2-7B-Base, ChatMusician-Base, and ChatMusician. The blue bar represents the performance on the music knowledge metric, and the red bar represents the music reasoning metric. The dashed line corresponds to a random baseline, with a score of 25%.
+<!-- ![MusicTheoryBench_result](./MusicTheoryBench_result_plt.png) -->
+<img src="./MusicTheoryBench_result_plt.png" alt="drawing" width="200"/>
+3. General language abilities of ChatMusician are evaluated on the [Massive Multitask Language Understanding (MMLU) dataset](https://huggingface.co/datasets/lukaemon/mmlu).