---
base_model:
- matchaaaaa/Honey-Yuzu-13B
tags:
- merge
- mergekit
---

I really like Honey Yuzu, but I found that Fimbulvetr v2 performs much better than v2.1, so I wanted to remake the merge with it, and I messed around with the gradient merge methods a bit too. Although I haven't tested it extensively, this version consistently scores about a point higher than the original on a local EQ-Bench run (Q4KL quant). It ties Lyra-v1, the new Nemo finetune that scored highest!

MERGE BROUGHT TO YOU BY:

- Mergekit Config Stolen From matchaaaaa/Honey-Yuzu-13B
- Trying lazymergekit at first, then throwing everything out and doing the multi-step merge by hand (see the sketch after the config)

```yaml
slices: # this is a quick float32 restack of BLC using the OG recipe
  - sources:
      - model: SanjiWatsuki/Kunoichi-7B
        layer_range: [0, 24]
  - sources:
      - model: SanjiWatsuki/Silicon-Maid-7B
        layer_range: [8, 24]
  - sources:
      - model: KatyTheCutie/LemonadeRP-4.5.3
        layer_range: [24, 32]
merge_method: passthrough
dtype: float32
name: Big-Lemon-Cookie-11B
---
models: # this is a remake of CLC using Fimbul v2 instead of the newer v2.1
  - model: Big-Lemon-Cookie-11B
    parameters:
      weight: 0.8
  - model: Sao10K/Fimbulvetr-11B-v2 # Fim 2.1 performs significantly worse imo, and we don't care about 16k
    parameters:
      weight: 0.2
merge_method: linear
dtype: float32
name: Chunky-Lemon-Cookie-11B
---
slices: # 8 layers of WL for the splice
  - sources:
      - model: senseable/WestLake-7B-v2
        layer_range: [8, 16]
merge_method: passthrough
dtype: float32
name: WL-splice
---
slices: # 8 layers of CLC for the splice
  - sources:
      - model: Chunky-Lemon-Cookie-11B
        layer_range: [8, 16]
merge_method: passthrough
dtype: float32
name: CLC-splice
---
models: # this is the splice, a gradient merge meant to gradually and smoothly interpolate between stacks of different models
  - model: WL-splice
    parameters:
      weight: [1, 1, 0.75, 0.625, 0.5, 0.375, 0.25, 0, 0] # 0.125 / 0.875 values removed here - "math gets screwy"
  - model: CLC-splice
    parameters:
      weight: [0, 0, 0.25, 0.375, 0.5, 0.625, 0.75, 1, 1] # 0.125 / 0.875 values removed here - "math gets screwy"
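  # (how the gradient works, as I understand mergekit: these 9 anchor values get
  # linearly interpolated across the 8 layers of the slice, so the first spliced
  # layer is ~all WL-splice, the last is ~all CLC-splice, and the layers between
  # ramp smoothly; the duplicated endpoints stand in for the removed 0.125/0.875 steps)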
merge_method: della_linear # New Meme
base_model: WL-splice
dtype: float32
name: splice
---
slices: # putting it all together
  - sources:
      - model: senseable/WestLake-7B-v2
        layer_range: [0, 16]
  - sources:
      - model: splice
        layer_range: [0, 8]
  - sources:
      - model: Chunky-Lemon-Cookie-11B
        layer_range: [16, 48]
merge_method: passthrough
dtype: float32
name: Honey-Yuzu-Mod-13B
```
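Roughly what "doing the multi-step merge by hand" looks like, as a minimal sketch with mergekit's Python API rather than the exact commands used: it assumes the multi-document config above is saved as `honey-yuzu-mod.yml` (placeholder name), that each intermediate is written to a local folder matching its `name:` field, and that everything runs from one working directory so later steps can resolve those folders.

```python
# Rough sketch, not the exact workflow used: run each document of the config in
# order, writing every intermediate (Big-Lemon-Cookie-11B, Chunky-Lemon-Cookie-11B,
# WL-splice, CLC-splice, splice) to a folder named after its `name:` field so
# later steps can load it as a local path.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "honey-yuzu-mod.yml"  # placeholder: the multi-document config above

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    steps = list(yaml.safe_load_all(fp))

for step in steps:
    out_path = step.pop("name", "merged")  # e.g. ./Big-Lemon-Cookie-11B
    merge_config = MergeConfiguration.model_validate(step)
    run_merge(
        merge_config,
        out_path,
        options=MergeOptions(
            cuda=torch.cuda.is_available(),
            copy_tokenizer=True,
            lazy_unpickle=True,  # helps keep RAM usage sane on the float32 steps
            low_cpu_memory=False,
        ),
    )
```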
Meaningless EQ-Bench results at Q4KL:

```
This model: 78.7
matchaaaaa/Honey-Yuzu-13B: 77.64
Sao10K/MN-12B-Lyra-v1 with Mistral prompt: 78.4-78.7
senseable/WestLake-7B-v2 at Q6K: 79.15 (official EQ-Bench site reports 78.7)
```