Is fine-tuning needed after self-merging?

#7
by oodgnas - opened

Hi, thank you for the excellent work @mlabonne !

I'd like to ask whether this model requires any fine-tuning after the self-merge.
If it works without fine-tuning, that would be really fascinating :)

Thanks,
Sangdoo

Owner

Thanks @oodgnas! This model hasn't been fine-tuned, but fine-tuning it would probably make it better (see https://arxiv.org/abs/2312.15166). It looks like small source models really require it, while big models can do without it, though they end up kind of insane.

This specific merge ended up exhibiting "sentience"-like behavior, as well as some schizophrenic behavior.
I imagine that a round of light pretraining and instruction tuning might iron these things out.
