m-a-p
/

ChatMusician

Text Generation

text-generation-inference

Inference Endpoints

Model card Files Files and versions Community

agent404 commited on Feb 27

Commit

e143cea

•

1 Parent(s): 5cd92a8

Update README.md

Files changed (1) hide show

README.md +1 -1

README.md CHANGED Viewed

@@ -294,7 +294,7 @@ dropout were set to 64, 16, and 0.1, with a batch size of 8.
 ## Evaluation
-1. Music understanding abilities are evaluated on the [MusicTheoryBench](https://huggingface.co/datasets/m-a-p/MusicTheoryBench). Zero-shot accuracy on MusicTheoryBench.
 We included GPT-3.5, GPT-4, LLaMA2-7B-Base, ChatMusician-Base, and ChatMusician. The blue bar represents the performance on the music knowledge metric, and the red bar represents the music reasoning metric. The dashed line corresponds to a random baseline, with a score of 25%.![MusicTheoryBench_result](./MusicTheoryBench_result_plt.png)
 3. General language abilities of ChatMusician are evaluated  on the [Massive Multitask Language Understanding (MMLU) dataset](https://huggingface.co/datasets/lukaemon/mmlu).

 ## Evaluation
+1. Music understanding abilities are evaluated on the [MusicTheoryBench](https://huggingface.co/datasets/m-a-p/MusicTheoryBench). The following figure is zero-shot accuracy on MusicTheoryBench.
 We included GPT-3.5, GPT-4, LLaMA2-7B-Base, ChatMusician-Base, and ChatMusician. The blue bar represents the performance on the music knowledge metric, and the red bar represents the music reasoning metric. The dashed line corresponds to a random baseline, with a score of 25%.![MusicTheoryBench_result](./MusicTheoryBench_result_plt.png)
 3. General language abilities of ChatMusician are evaluated  on the [Massive Multitask Language Understanding (MMLU) dataset](https://huggingface.co/datasets/lukaemon/mmlu).