---
base_model: grimjim/kukulemon-v3-soul_mix-32k-7B
library_name: transformers
quanted_by: grimjim
license: cc-by-nc-4.0
pipeline_tag: text-generation
---
# kukulemon-v3-soul_mix-32k-7B

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

We explore merging at an extremely low weight as an alternative to fine-tuning. The additional model was applied at a weight of 10e-5 (i.e., 1e-4), which was selected to be comparable in effect to a few epochs of training. The low weight also effectively flattens the contribution of the additional model, though it is technically not sparsified.

- [Full weights](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B)
- [GGUF quants](https://huggingface.co/grimjim/kukulemon-v3-soul_mix-32k-7B-GGUF)

## Merge Details

### Merge Method

This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with [grimjim/kukulemon-32K-7B](https://huggingface.co/grimjim/kukulemon-32K-7B) as the base.

### Models Merged

The following model was included in the merge:
* [grimjim/rogue-enchantress-32k-7B](https://huggingface.co/grimjim/rogue-enchantress-32k-7B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: grimjim/kukulemon-32K-7B
dtype: bfloat16
merge_method: task_arithmetic
slices:
- sources:
  - layer_range: [0, 32]
    model: grimjim/kukulemon-32K-7B
  - layer_range: [0, 32]
    model: grimjim/rogue-enchantress-32k-7B
    parameters:
      weight: 10e-5
```
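
For intuition, task arithmetic merges by adding a scaled task vector, the elementwise difference between each donor tensor and the corresponding base tensor, to the base weights. Below is a minimal PyTorch sketch of that per-tensor update, assuming `base_sd` and `donor_sd` are hypothetical state dicts with identical keys loaded from the two models; mergekit applies the equivalent operation per layer slice.

```python
import torch

def task_arithmetic_merge(base_sd: dict, donor_sd: dict, weight: float = 10e-5) -> dict:
    """Merge one donor into a base via task arithmetic:
    merged = base + weight * (donor - base).
    """
    merged = {}
    for name, base_t in base_sd.items():
        # The task vector is the donor's elementwise delta from the base.
        task_vector = donor_sd[name].float() - base_t.float()
        # At weight = 10e-5 the donor perturbs the base only slightly,
        # which is what makes the merge comparable to a brief fine-tune.
        merged[name] = (base_t.float() + weight * task_vector).to(base_t.dtype)
    return merged
```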
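
To reproduce the merge locally, the configuration above can be saved to a file (e.g., `config.yaml`) and passed to mergekit's `mergekit-yaml` entry point, as in `mergekit-yaml config.yaml ./output-model-directory`; the filename and output path here are illustrative.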