fix NormHead eval
#8, opened by kuaizhirui
I ran into this problem while running SFT on Baichuan2-7B-Base with DeepSpeed ZeRO stage 3. A similar case is reported at https://github.com/baichuan-inc/Baichuan2/issues/39#issuecomment-1710146497
Baichuan2-13B-Chat has already fixed this problem, so I synced its code here.
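For reviewers who want the idea without opening the diff, here is a rough sketch of the kind of eval-path logic involved. It is an assumption about the shape of the fix, not a copy of the synced code: it assumes the head caches row-normalized weights the first time it runs in eval mode, that the cached values should be written into the existing weight storage instead of rebinding the attribute to a new object, and that the cache flag should be re-armed whenever the head runs in training mode. It uses plain Python lists as a stand-in for tensors so it runs without PyTorch; `NormHeadSketch` and `normalize_rows` are hypothetical names.

```python
import math


def normalize_rows(weight, eps=1e-12):
    """L2-normalize each row of a 2-D list (stand-in for row-wise tensor normalization)."""
    out = []
    for row in weight:
        norm = max(math.sqrt(sum(x * x for x in row)), eps)
        out.append([x / norm for x in row])
    return out


class NormHeadSketch:
    """Hypothetical, list-based illustration of a normalized output head."""

    def __init__(self, weight):
        self.weight = weight      # shape: vocab_size x hidden_size
        self.training = True
        self.first_flag = True    # True means the eval cache is stale

    def forward(self, hidden):
        if self.training:
            # Training: normalize on the fly and mark the eval cache as stale.
            norm_weight = normalize_rows(self.weight)
            self.first_flag = True
        elif self.first_flag:
            # First eval call: overwrite the existing storage in place
            # (assumption: the weight object itself must not be replaced).
            self.first_flag = False
            self.weight[:] = normalize_rows(self.weight)
            norm_weight = self.weight
        else:
            # Later eval calls reuse the already-normalized weights.
            norm_weight = self.weight
        # Linear projection of one hidden vector onto every vocabulary row.
        return [sum(w * h for w, h in zip(row, hidden)) for row in norm_weight]


head = NormHeadSketch([[3.0, 4.0], [0.0, 2.0]])
train_logits = head.forward([1.0, 0.0])   # weights left untouched
head.training = False
eval_logits = head.forward([1.0, 0.0])    # weights normalized in place, once
```

In this sketch the training and eval calls give the same logits, and the weight object keeps its identity across the switch to eval, which is the property the assumed fix is meant to preserve.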