---
base_model:
- Qwen/Qwen2.5-14B-Instruct
- Qwen/Qwen2.5-14B
library_name: transformers
tags:
- mergekit
- merge
---
# qwenselfbaseinstruct

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

Re-injected the base model into the instruct model in the intermediate layers while keeping the input and output layers the same (sophosympatheia gradient). While this degraded the model's overall EQ-Bench score relative to instruct (76.9195 down to 73.8068), it removed the instruct model's issue with misspelling some of the emotion responses, and the score remains notably higher than the base model's (60.1027, although the base model produced no syntax errors at all). The merge did throw one "didn't match reference" syntax error that was not a misspelling; I presume it replaced the emotion entirely or used a similar, grammatically correct one.

Taken as research evidence, this suggests the instruct model picked up something specifically in its intermediate layers that occasionally hurts spelling. I don't know whether this merge offers any other gain over using one or both of its components; it was done out of curiosity. It might still be useful as more compact merge material if you wanted both base and instruct anyway.

## Merge Details

### Merge Method

This model was merged using the SLERP merge method (a sketch of the interpolation it performs follows the configuration below).

### Models Merged

The following models were included in the merge:
* [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
* [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Qwen/Qwen2.5-14B
merge_method: slerp
base_model: Qwen/Qwen2.5-14B-Instruct
parameters:
  t:
    - value: [0, 0, 0.3, 0.4, 0.5, 0.6, 0.5, 0.4, 0.3, 0, 0]
dtype: bfloat16
```
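For intuition, here is a minimal NumPy sketch of the spherical linear interpolation that a SLERP merge applies to each pair of weight tensors. In this configuration, `t = 0` keeps the `base_model` weights (the instruct model) untouched, which is why the input and output layers stay the same, while the peak `t = 0.6` pulls the middle layers 60% of the angular distance toward the base model. Treat this as an illustration under those assumptions, not mergekit's actual implementation, which handles tensor shapes and edge cases itself:

```python
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors."""
    a_n = a / (np.linalg.norm(a) + eps)
    b_n = b / (np.linalg.norm(b) + eps)
    dot = float(np.clip(np.dot(a_n, b_n), -1.0, 1.0))
    omega = np.arccos(dot)  # angle between the two weight directions
    if omega < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        return (1 - t) * a + t * b
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

# Toy example with random stand-ins for one layer's weights: t = 0 returns
# the instruct weights unchanged; t = 0.6 moves 60% of the way toward base.
instruct_layer = np.random.randn(4096)
base_layer = np.random.randn(4096)
merged_layer = slerp(0.6, instruct_layer, base_layer)
```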
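Since the merge keeps the instruct model's input and output layers, it should behave as a drop-in chat model. A minimal inference sketch with transformers, assuming the merged weights are available locally or on the Hub (the repo id below is a placeholder, not a published model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/qwenselfbaseinstruct"  # placeholder: point at the merged weights
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="bfloat16", device_map="auto"
)

# The instruct model's input/output layers are preserved, so the standard
# Qwen2.5 chat template applies as usual.
messages = [{"role": "user", "content": "How are you feeling today?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```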