About MoE with a vocab-extended model and non-vocab-extended models
#3
by ancv
Hi @mlabonne,
Thanks for your great models. By the way, I have a specific question about MoE models. I have a vocab-extended Mistral 7B that performs better on Vietnamese, and I want to combine it in a MoE with chat, code, and math experts based on Mistral 7B to broaden the model's capabilities. Is that possible? Given that the extended and non-extended models differ in some token IDs, if I then fine-tune the merged model on a moderate amount of data (about 1B tokens), will it get better?
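For context, here is a minimal sketch of how I could inspect the token-ID differences between the two tokenizers before merging (this assumes the Hugging Face transformers library; the extended model ID below is a placeholder for my model):

```python
from transformers import AutoTokenizer

# Placeholder model IDs; substitute the actual base and vocab-extended models.
base_id = "mistralai/Mistral-7B-v0.1"
extended_id = "your-username/mistral-7b-vi-extended"  # hypothetical vocab-extended model

base_tok = AutoTokenizer.from_pretrained(base_id)
ext_tok = AutoTokenizer.from_pretrained(extended_id)

print(f"Base vocab size:     {len(base_tok)}")
print(f"Extended vocab size: {len(ext_tok)}")

# Tokens that exist in both vocabs but map to different IDs would be the
# problematic ones for a naive merge of the experts' embedding tables.
base_vocab = base_tok.get_vocab()
ext_vocab = ext_tok.get_vocab()
mismatched = {t for t, i in base_vocab.items() if ext_vocab.get(t, i) != i}
print(f"Tokens with differing IDs: {len(mismatched)}")
```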