The result for Llama 2 13B's GSM-8K (8-shot, CoT) is 77.4, which seems incorrect.
#19, opened by Hi-archer
The result for Llama 2 13B's GSM-8K (8-shot, CoT) is 77.4, which seems incorrect.
There does seem to be a problem: the reported score for Llama 2 70B is much lower than the one for Llama 2 13B. Could the two have been swapped?
Not exactly. These are the results reported in the Llama 2 paper:
Llama 2-7B 14.6
Llama 2-13B 28.7
Llama 2-34B 42.2
Llama 2-70B 56.8
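
For anyone who wants to check the numbers quickly, here is a rough sketch in Python. It uses only the figures quoted in this thread; the variable names are made up for illustration, and it assumes the paper's scores and the questioned 77.4 were measured under a comparable setup, which may not be the case.

```python
# GSM-8K scores quoted above from the Llama 2 paper.
paper_gsm8k = {
    "Llama 2-7B": 14.6,
    "Llama 2-13B": 28.7,
    "Llama 2-34B": 42.2,
    "Llama 2-70B": 56.8,
}

# Value questioned in this thread for Llama 2 13B (hypothetical variable name).
questioned_13b = 77.4

# How far the questioned value is from the paper's 13B score.
gap_to_paper_13b = round(questioned_13b - paper_gsm8k["Llama 2-13B"], 1)

# Whether the questioned value beats every score in the paper, including 70B.
exceeds_all_paper_scores = questioned_13b > max(paper_gsm8k.values())

# Whether the paper's scores rise strictly with model size, as listed.
scores = list(paper_gsm8k.values())
paper_is_monotonic = all(a < b for a, b in zip(scores, scores[1:]))

print(gap_to_paper_13b, exceeds_all_paper_scores, paper_is_monotonic)
```

If the questioned value is higher than even the paper's 70B score, a simple swap between the 13B and 70B entries would not by itself explain the mismatch with the paper.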