FFN is not the same as https://huggingface.co/meta-llama/Llama-2-7b-hf
#5 opened by Muennighoff
I am seeing tiny differences in the FFN weights, so the performance also does not match exactly with meta-llama/Llama-2-7b-hf.
What's the cause?
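For reference, a minimal sketch of how such a comparison could be reproduced. The repo id `NousResearch/Llama-2-7b-hf` is my assumption for this repo, and the module path follows the standard HF `LlamaForCausalLM` layout:

```python
# Rough sketch: load both checkpoints in FP32 and diff one FFN (MLP) projection.
# "NousResearch/Llama-2-7b-hf" is an assumed repo id for this repo.
import torch
from transformers import AutoModelForCausalLM

a = AutoModelForCausalLM.from_pretrained("NousResearch/Llama-2-7b-hf", torch_dtype=torch.float32)
b = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", torch_dtype=torch.float32)

# Compare the gate projection of the first decoder layer's FFN.
wa = a.model.layers[0].mlp.gate_proj.weight
wb = b.model.layers[0].mlp.gate_proj.weight
print("max abs diff (layer 0 gate_proj):", (wa - wb).abs().max().item())
```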
I just checked the checkpoint hashes and they all seem to match. The meta-llama repo updated the PyTorch binaries with FP16 variants, but I matched the ones here (FP32) against the older commits (a sketch of the hash check follows the list):
PyTorch bins
ckpt 1 - Nous / ckpt 1 - meta
ckpt 2 - Nous / ckpt 2 - meta
ckpt 3 - Nous / ckpt 3 - meta
Safetensors
ckpt 1 - Nous / ckpt 1 - meta
ckpt 2 - Nous / ckpt 2 - meta
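For what it's worth, here is a rough sketch of that kind of hash check using `huggingface_hub`. The shard filename and the pinned revision are placeholders, not the exact commits compared above:

```python
# Download the same shard from both repos and compare SHA-256 digests.
# The shard name and "<older-fp32-commit>" revision are placeholders.
import hashlib
from huggingface_hub import hf_hub_download

def sha256(path, chunk=1 << 20):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

shard = "pytorch_model-00001-of-00003.bin"
nous_path = hf_hub_download("NousResearch/Llama-2-7b-hf", shard)
meta_path = hf_hub_download("meta-llama/Llama-2-7b-hf", shard, revision="<older-fp32-commit>")
print(sha256(nous_path) == sha256(meta_path))
```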
I'd perhaps recommend swapping the misc files (such as the JSON files) with the official ones in the meta repo.
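Something like the following could do that, assuming the usual Llama-2 file names and access to the gated meta-llama repo (the file list is illustrative, not exhaustive):

```python
# Pull the non-weight files from the official meta repo so they match exactly.
# Filenames are the typical ones in a Llama-2 HF repo; adjust to what is actually there.
from huggingface_hub import hf_hub_download

for fname in ["config.json", "generation_config.json",
              "tokenizer_config.json", "special_tokens_map.json"]:
    local_path = hf_hub_download("meta-llama/Llama-2-7b-hf", fname)
    print(fname, "->", local_path)
```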