SFT_D1chosenThenDPO_D2a_Instruct_argilla_math_results / model-00003-of-00004.safetensors

Commit History

Trained with Unsloth
11a9cdf
verified

SongTonyLi committed on

Trained with Unsloth
a9c6a03
verified

SongTonyLi committed on