EVA-UNIT-01/EVA-Qwen2.5-14B-v0.0 · training help and suggestions

Oct 5

Hi there,

I am trying hard to reproduce something like what you have done with EVA, but in Italian. Based on my experience, I don't think building the dataset will be a problem, but I have some limits on the training side, as I am not very skilled in that area. Could I ask you to share (privately, if you prefer) the training params and notebook/script, or alternatively point me to some useful and effective links, considering that the ChatML template is a must for me? I am using Unsloth with LoRA; I was using Axolotl, but I had several issues with tokens and the ChatML template. Many thanks in advance, and thanks for your hard work.

Kearm

EVA-UNIT-01 org Oct 5

So this was a Spectrum top 50% FFT; you can see my PR for the Qwen2.5 14B SNR scan. This run was very much an alpha v0.0 run, and v0.1 is finishing up training with a much improved methodology. I can do a more detailed writeup on lessons learned between v0.0 and v0.1 and share the Axolotl config if that run turns out as well as the charts/evals are indicating it will.
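For anyone unfamiliar with the "Spectrum top 50%" idea, here is a rough sketch of the selection step only: rank weight matrices by a signal-to-noise score within each module type, keep the top half trainable, and freeze the rest. This is not the config used for this run. The function name `select_top_fraction`, the parameter names, and the SNR values are all hypothetical, made up for illustration; a real SNR scan would supply the scores.

```python
import math
from collections import defaultdict


def select_top_fraction(snr_by_param, fraction=0.5):
    """Return the parameter names to keep trainable.

    Parameters are grouped by module type (the name with the
    "model.layers.<idx>." prefix stripped), and the top `fraction`
    of each group by SNR is selected. Everything else would be frozen.
    """
    groups = defaultdict(list)
    for name, snr in snr_by_param.items():
        # "model.layers.3.mlp.down_proj" -> "mlp.down_proj"
        module_type = name.split(".", 3)[-1]
        groups[module_type].append((snr, name))

    selected = []
    for items in groups.values():
        keep = math.ceil(len(items) * fraction)
        items.sort(reverse=True)  # highest SNR first
        selected.extend(name for _, name in items[:keep])
    return sorted(selected)


# Hypothetical SNR scores, NOT taken from any real scan.
example_snr = {
    "model.layers.0.mlp.down_proj": 1.2,
    "model.layers.1.mlp.down_proj": 3.4,
    "model.layers.2.mlp.down_proj": 0.7,
    "model.layers.3.mlp.down_proj": 2.9,
    "model.layers.0.self_attn.o_proj": 5.0,
    "model.layers.1.self_attn.o_proj": 4.1,
    "model.layers.2.self_attn.o_proj": 6.3,
    "model.layers.3.self_attn.o_proj": 0.2,
}

trainable = select_top_fraction(example_snr, fraction=0.5)
for name in trainable:
    print(name)
```

The resulting list of names is the kind of thing that would then be handed to the trainer as the set of unfrozen parameters, with all other weights left frozen.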

Kearm changed discussion status to closed Oct 5