why two versions in a day?

#1
by supercharge19 - opened

What is the difference between them? Or did you make a mistake training the first version?

Owner

Both models are trained on the same preference dataset. The only difference is that v1 has a learning rate of 5e-5 and v2 has a learning rate of 5e-7.

Here's the axolotl configuration I used, if you're interested: https://gist.github.com/mlabonne/0f781fd9eb47b7d5e4778d285f4a6aee

You can find the results here: https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard
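To make the difference concrete, here is a minimal sketch of two run settings that differ only in learning rate. It is not the linked configuration: the dictionary layout and the `base_model` and `rl` keys are assumptions for illustration, and only the two learning-rate values come from this thread.

```python
# Illustrative only: the real settings live in the linked axolotl gist.
# Key names other than the learning-rate values are assumptions.
shared = {
    "base_model": "mlabonne/OmniBeagle-7B",  # assumed identical for both runs
    "rl": "dpo",                             # assumed key for preference training
}

v1 = {**shared, "learning_rate": 5e-5}
v2 = {**shared, "learning_rate": 5e-7}

# Collect every setting whose value differs between the two runs.
diff = {key: (v1[key], v2[key]) for key in v1 if v1[key] != v2[key]}

# How many times larger the v1 learning rate is than the v2 one.
ratio = v1["learning_rate"] / v2["learning_rate"]

print(diff)
print(round(ratio))
```

Under these assumptions the only differing key is `learning_rate`, with v1 using a value two orders of magnitude larger than v2.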

Hi,

So "NeuralOmniBeagle-7B-v2" is based on mlabonne/OmniBeagle-7B?

Thanks

Yes, it applies DPO to this model.
