Explain the rationale for your density values

#1
by Joseph717171 - opened

Nice to see that the TIES merge of the instruct model with the base is catching on. However, I noticed that some of your density values differ slightly from one another. Why do they differ, and why aren’t they all set to 1? I’m curious because I’m currently exploring this merge as well. 😋
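
For context, here is a rough sketch of how I understand density in the TIES trimming step: it is the fraction of each task vector (fine-tuned weights minus base weights) that is kept, ranked by magnitude, with the rest zeroed. The function name `trim_by_density` and the toy numbers are hypothetical, and this plain-Python version is only an illustration, not how any merge tool actually implements it.

```python
def trim_by_density(delta, density):
    """Keep the largest-magnitude `density` fraction of a task vector; zero the rest."""
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be between 0 and 1")
    # Number of entries to keep (hypothetical rounding choice for this sketch).
    k = round(density * len(delta))
    # Indices of the k entries with the largest absolute value.
    ranked = sorted(range(len(delta)), key=lambda i: abs(delta[i]), reverse=True)
    keep = set(ranked[:k])
    return [value if i in keep else 0.0 for i, value in enumerate(delta)]


# Toy task vector: instruct weights minus base weights (made-up values).
delta = [1.0, -4.0, 0.5, 3.0]
half = trim_by_density(delta, 0.5)   # only the two largest-magnitude entries survive
full = trim_by_density(delta, 1.0)   # nothing is trimmed
```

With a density of 1 nothing is dropped, which is why I would have expected 1 everywhere when there is only one fine-tuned model being merged onto the base.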

Also, I find it odd that you didn't replace the merge's .json config files (excluding 'model.safetensors.index.json') with the instruct model's. Or perhaps you replaced them with the ones from arcee-ai/Llama-3.1-SuperNova-Lite? I would have thought you'd want Hermes-3's for its Jinja template.
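
To be concrete about what I mean by replacing the config files, something along these lines would do it. This is a minimal sketch that assumes both the instruct model and the merge output are already downloaded to local folders; the function name `copy_json_configs` and the directory paths in the usage comment are hypothetical.

```python
from pathlib import Path
import shutil


def copy_json_configs(src_dir, dst_dir, exclude=("model.safetensors.index.json",)):
    """Copy top-level .json files from src_dir into dst_dir, skipping excluded names."""
    src, dst = Path(src_dir), Path(dst_dir)
    copied = []
    for path in sorted(src.glob("*.json")):
        if path.name in exclude:
            # The weight index describes the merge's own shards, so leave it alone.
            continue
        shutil.copy2(path, dst / path.name)  # overwrites a same-named file in dst
        copied.append(path.name)
    return copied


# Hypothetical usage (paths are placeholders):
# copy_json_configs("./instruct-model", "./merged-model")
```

The index file is skipped because it maps tensors to the merge's own shard files, so overwriting it with another model's copy could point at shards that don't exist.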

Regardless, I look forward to discussing this more with you. 🤓
