---
base_model:
- anthracite-org/magnum-v3-9b-customgemma2
- nbeerbower/gemma2-gutenberg-9B
- grimjim/Magnolia-v1-Gemma2-8k-9B
- UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
- BeaverLegacy/Smegmma-Deluxe-9B-v1
- ifable/gemma-2-Ifable-9B
library_name: transformers
tags:
- mergekit
- merge
---

# Aster
This is a merge of pre-trained language models created using mergekit.
## Merge Details

### Merge Method
This model was merged in two stages: first, the SLERP method was used to create an intermediate model; then, the Model Stock merge method used that SLERP model as its base to mix in several more models. The idea was to make a nice, smart base model and then add in a few pinches of spice.

For some reason mergekit wouldn't let me use any other merge method for the second stage: it threw ModelReference errors about my intermediary model for every method except Model Stock. I'll see if I can fix that and upload my intended task-arithmetic version as a v2.

This is the only one of my roughly 700 merges that I think uses something novel or interesting enough in its creation to merit an upload.
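As a toy illustration of the SLERP half of this recipe (not mergekit's actual implementation, which operates tensor-by-tensor over real checkpoints), spherical linear interpolation walks along the great-circle arc between two parameter vectors instead of the straight line between them:

```python
import math

def slerp(t, a, b):
    """Spherical linear interpolation between two weight vectors.

    Toy sketch of the SLERP merge idea: blend along the arc between
    vectors a and b, weighting by sines of the angle between them.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * x + t * y for x, y in zip(a, b)]
    s = math.sin(theta)
    w_a = math.sin((1 - t) * theta) / s
    w_b = math.sin(t * theta) / s
    return [w_a * x + w_b * y for x, y in zip(a, b)]

# Halfway between two orthogonal unit vectors:
# both components come out ≈ 0.7071 (= sin(pi/4)), so the result
# stays on the unit sphere instead of shrinking toward the midpoint.
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

The per-filter `t` schedules in the config below vary this blend ratio across layers, so attention and MLP blocks lean toward different parents at different depths.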
### Models Merged

The following models were included in the merge:

* anthracite-org/magnum-v3-9b-customgemma2
* nbeerbower/gemma2-gutenberg-9B
* UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
* BeaverLegacy/Smegmma-Deluxe-9B-v1
* ifable/gemma-2-Ifable-9B
* grimjim/Magnolia-v1-Gemma2-8k-9B
### Configuration
The following YAML configuration was used to produce this model:
```yaml
# THIS YAML CONFIGURATION WAS USED TO CREATE THE INTERMEDIARY MODEL.
# slices:
#   - sources:
#       - model: anthracite-org/magnum-v3-9b-customgemma2
#         layer_range: [0, 42]
#       - model: nbeerbower/gemma2-gutenberg-9B
#         layer_range: [0, 42]
# merge_method: slerp
# base_model: nbeerbower/gemma2-gutenberg-9B
# parameters:
#   t:
#     - filter: self_attn
#       value: [0.2, 0.5, 0.4, 0.7, 1]
#     - filter: mlp
#       value: [1, 0.5, 0.3, 0.4, 0.2]
#     - value: 0.5
# dtype: float16

# THIS YAML CONFIGURATION WAS USED TO CREATE ASTER. The E: model is the
# intermediate model created in the previous config.
models:
  - model: E:/models/mergekit/output/intermediate/
  - model: BeaverLegacy/Smegmma-Deluxe-9B-v1
    parameters:
      weight: 0.3
  - model: ifable/gemma-2-Ifable-9B
    parameters:
      weight: 0.3
  - model: UCLA-AGI/Gemma-2-9B-It-SPPO-Iter3
    parameters:
      weight: 0.15
  - model: grimjim/Magnolia-v1-Gemma2-8k-9B
    parameters:
      weight: 0.25
merge_method: model_stock
base_model: E:/models/mergekit/output/intermediate/
dtype: float16
```
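For the second stage, here's a toy sketch of the core Model Stock idea as I understand it from the paper (Jang et al., 2024): pick an interpolation ratio from the geometry of the fine-tuned models' task vectors, then move the base model toward their average. The function names are my own, and mergekit's real implementation works per-layer on actual checkpoints, so treat this as illustration only:

```python
def model_stock_t(cos_theta, n):
    # Interpolation ratio from the Model Stock paper:
    # t = n*cos(theta) / (1 + (n - 1)*cos(theta)),
    # where theta is the angle between the fine-tuned models' task vectors.
    # Small angle (cos near 1) -> trust the fine-tunes; large angle -> stay
    # closer to the base.
    return n * cos_theta / (1 + (n - 1) * cos_theta)

def model_stock_merge(base, finetuned, cos_theta):
    # Per-parameter merge: average the fine-tuned weight vectors, then
    # interpolate between the base and that average by ratio t.
    n = len(finetuned)
    t = model_stock_t(cos_theta, n)
    avg = [sum(col) / n for col in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(avg, base)]
```

In this merge the "base" is the SLERP intermediate, and the four `models:` entries above are the fine-tunes being pulled in.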
Alright, now back to smashing models together and seeing what happens...