why two versions in a day?

by supercharge19 - opened Feb 3

Discussion

supercharge19

Feb 3

What is the difference between them? Or did you make a mistake training the first version?

mlabonne

Owner Feb 4

Both models are trained on the same preference dataset. The only difference is that v1 has a learning rate of 5e-5 and v2 has a learning rate of 5e-7.

Here's the axolotl configuration I used if you're interested: https://gist.github.com/mlabonne/0f781fd9eb47b7d5e4778d285f4a6aee

You can find results here: https://huggingface.co/spaces/mlabonne/Yet_Another_LLM_Leaderboard
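To put the gap between the two runs in perspective, here is a minimal sketch that compares the two learning rates quoted above. Only the values 5e-5 and 5e-7 come from this thread; the dictionary and the `lr_ratio` helper are hypothetical names used for illustration and are not part of the linked configuration.

```python
# Learning rates quoted in the thread; everything else here is illustrative.
LEARNING_RATES = {"v1": 5e-5, "v2": 5e-7}


def lr_ratio(a: str, b: str) -> float:
    """Return how many times larger run a's learning rate is than run b's."""
    return LEARNING_RATES[a] / LEARNING_RATES[b]


if __name__ == "__main__":
    # v1 uses a learning rate two orders of magnitude larger than v2.
    print(f"v1 / v2 learning-rate ratio: {lr_ratio('v1', 'v2'):.0f}x")
```

In other words, v2 takes much smaller optimization steps than v1 on the same preference data.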

Stark2008

Jun 16

Hi,

So "NeuralOmniBeagle-7B-v2" is based on mlabonne/OmniBeagle-7B?

Thanks

mlabonne

Owner Jun 16

Yes, it applies DPO to this model.
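For readers unfamiliar with DPO (Direct Preference Optimization), the sketch below shows the standard per-pair DPO loss in plain Python. It is an illustration of the general objective, not the training code used for these models: the function name, the scalar log-probability inputs, and the default `beta=0.1` are assumptions made for this example.

```python
import math


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_margin - rejected_margin))).

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an assumed default, not a value taken from the thread.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # Numerically stable -log(sigmoid(logits)).
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


if __name__ == "__main__":
    # Policy identical to the reference: the loss equals log(2).
    print(dpo_loss(-1.0, -2.0, -1.0, -2.0))
    # Policy favors the chosen response more than the reference: lower loss.
    print(dpo_loss(-0.5, -3.0, -1.0, -2.0))
```

The loss falls as the policy raises the likelihood of the preferred response relative to the rejected one, measured against the reference model (here, the base model that DPO is applied to).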
