405B version
#5, opened by nonetrix
It would be nice if you made a 405B version with the same data, training, etc. I think it would more than likely be quite good.
The model looks good. I tried to evaluate it on standard benchmarks: it is head-to-head with the llama-3.1-70b model on benchmarks such as BBH and MMLU Pro, and it regresses a bit on a couple of datasets. I was expecting numbers higher than those of llama-3.1-70b. Do you have eval numbers on these standard benchmarks, so I can compare against them and make sure I am not making a mistake in my evals? Thanks, NVIDIA team.