God this sucked to make, I'm getting burned out on this fucker so I'm releasing what I have as a partial update. I'll be focusing on L3.1 for the time being and taking a break from this project for a bit. What's done is... fine, I'm not super happy about it but it's objectively better than v0.1.
Quants
GGUFs by BackyardAI
Details & Recommended Settings
(Still testing; subject to change)
Very experimental, expect bugs. Thrives at more story-heavy and narrative RP, yet still excels with the basics like usual. Way more horny this time, no idea why. Slightly worse instruct following compared to v0.1. Really needs examples to function, otherwise it'll easily spit out garbled output. I also tried to curb L3's tendency to double line (leaving a blank line between paragraphs), to mild success. Calmer writing style.
Has a certain tendency to speak for the {user}, but that's easily negated with a few instructs.
Rec. Settings:
Template: Plain Text or L3
Temperature: 1.3
Min P: 0.1
Repeat Penalty: 1.05
Repeat Penalty Tokens: 256
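For anyone wondering what the Temperature and Min P values above actually do to the token distribution, here's a minimal sketch in plain Python. It assumes temperature is applied before the Min P cutoff (backends differ on sampler order) and it skips the repeat penalty entirely. The function names `min_p_filter` and `sample_token` are made up for illustration, not taken from any backend.

```python
import math
import random


def min_p_filter(logits, temperature=1.3, min_p=0.1):
    """Return {token_index: probability} for tokens that survive Min P.

    Assumption: temperature scaling happens first, then the cutoff.
    """
    # Temperature > 1 flattens the distribution, < 1 sharpens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract peak for stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Min P keeps tokens with probability >= min_p * (top token probability).
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}

    # Renormalize so the surviving probabilities sum to 1.
    kept_total = sum(kept.values())
    return {i: p / kept_total for i, p in kept.items()}


def sample_token(logits, temperature=1.3, min_p=0.1, rng=random):
    """Draw one token index from the filtered distribution."""
    kept = min_p_filter(logits, temperature, min_p)
    indices = list(kept)
    weights = [kept[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [5.0, 4.0, 0.0]  # hypothetical three-token vocabulary
    print(min_p_filter(toy_logits))
    print(sample_token(toy_logits, rng=random.Random(0)))
```

With the toy logits the third token falls under the cutoff and is never sampled, which is the point of pairing a fairly high temperature with Min P.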
Models Merged & Merge Theory
The following models were included in the merge:
- ResplendentAI/Nymph_8B
- TheDrummer/Llama-3SOME-8B-v2
- nothingiisreal/L3-8B-Instruct-Abliterated-DWP
- Sao10K/L3-8B-Niitama-v1
- OEvortex/Emotional-llama-8B
- ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- nothingiisreal/L3-8B-Celeste-V1.2
can't be asked rn
Config
slices:
- sources:
  - layer_range: [0, 6] #6
    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- sources:
  - layer_range: [4, 8] #10
    model: nothingiisreal/L3-8B-Celeste-V1
    parameters:
      scale:
      - filter: q_proj
        value: [1, 0.89]
- sources:
  - layer_range: [8, 12] #14
    model: nothingiisreal/L3-8B-Instruct-Abliterated-DWP
    parameters:
      scale:
      - filter: k_proj
        value: [0.89, 1]
- sources:
  - layer_range: [6, 12] #20
    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- sources:
  - layer_range: [10, 14] #24
    model: nothingiisreal/L3-8B-Celeste-V1
- sources:
  - layer_range: [12, 16] #28
    model: ArliAI/ArliAI-Llama-3-8B-Formax-v1.0
- sources:
  - layer_range: [14, 28] #42
    model: nothingiisreal/L3-8B-Instruct-Abliterated-DWP
- sources:
  - layer_range: [21, 31] #52
    model: nothingiisreal/L3-8B-Celeste-V1
- sources:
  - layer_range: [28, 32] #56
    model: nothingiisreal/L3-8B-Instruct-Abliterated-DWP
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: anterosv2.b
---
slices:
- sources:
  - layer_range: [0, 4] #4
    model: ResplendentAI/Nymph_8B
- sources:
  - layer_range: [2, 10] #12
    model: Sao10K/L3-8B-Niitama-v1
- sources:
  - layer_range: [8, 12] #16
    model: TheDrummer/Llama-3SOME-8B-v2
- sources:
  - layer_range: [10, 16] #22
    model: Sao10K/L3-8B-Niitama-v1
- sources:
  - layer_range: [12, 20] #30
    model: ResplendentAI/Nymph_8B
- sources:
  - layer_range: [14, 18] #34
    model: OEvortex/Emotional-llama-8B
- sources:
  - layer_range: [16, 20] #38
    model: ResplendentAI/Nymph_8B
- sources:
  - layer_range: [18, 22] #42
    model: OEvortex/Emotional-llama-8B
- sources:
  - layer_range: [20, 24] #46
    model: TheDrummer/Llama-3SOME-8B-v2
- sources:
  - layer_range: [23, 31] #54
    model: ResplendentAI/Nymph_8B
- sources:
  - layer_range: [30, 32] #56
    model: Sao10K/L3-8B-Niitama-v1
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: anterosv2.c
---
slices:
- sources:
  - layer_range: [2, 17]
    model: anterosv2.b
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 2-17av2b.sl
---
slices:
- sources:
  - layer_range: [2, 17]
    model: anterosv2.c
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 2-17av2c.sl
---
models:
- model: 2-17av2c.sl
  parameters:
    weight: [0.1, 0.5, 0.3]
- model: 2-17av2b.sl
  parameters:
    weight: [0.9, 0.5, 0.7]
merge_method: dare_linear
base_model: 2-17av2b.sl
parameters:
  normalize: false
  int8_mask: true
dtype: float32
out_dtype: bfloat16
name: 2-17aav2.sl
---
slices:
- sources:
  - layer_range: [17, 42]
    model: anterosv2.b
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 17-42av2b.sl
---
slices:
- sources:
  - layer_range: [17, 42]
    model: anterosv2.c
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 17-42av2c.sl
---
models:
- model: 17-42av2b.sl
  parameters:
    weight: [0.8, 0.5, 0.4, 0.3, 0.15]
    density: 0.65
    epsilon: 0.07
    lambda: 0.12
- model: 17-42av2c.sl
  parameters:
    weight: [0.2, 0.5, 0.6, 0.7, 0.85]
    density: 0.7
    epsilon: 0.05
    lambda: 0.1
merge_method: della
base_model: 17-42av2c.sl
parameters:
  normalize: false
  int8_mask: true
dtype: float32
out_dtype: bfloat16
name: 17-42aav2.sl
---
slices:
- sources:
  - layer_range: [42, 52]
    model: anterosv2.b
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 42-52av2b.sl
---
slices:
- sources:
  - layer_range: [42, 52]
    model: anterosv2.c
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: 42-52av2c.sl
---
models:
- model: 42-52av2c.sl
  parameters:
    weight: [0.9, 0.65, 0.9]
- model: 42-52av2b.sl
  parameters:
    weight: [0.1, 0.35, 0.1]
merge_method: dare_linear
base_model: 42-52av2c.sl
parameters:
  normalize: false
  int8_mask: true
dtype: float32
out_dtype: bfloat16
name: 42-52aav2.sl
---
slices:
- sources:
  - layer_range: [0, 2]
    model: anterosv2.b
- sources:
  - layer_range: [0, 15]
    model: 2-17aav2.sl
- sources:
  - layer_range: [0, 25]
    model: 17-42aav2.sl
- sources:
  - layer_range: [0, 10]
    model: 42-52aav2.sl
- sources:
  - layer_range: [52, 56]
    model: anterosv2.c
parameters:
  int8_mask: true
merge_method: passthrough
dtype: float32
out_dtype: bfloat16
name: anterosv0.1.5
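The `#N` comments next to each `layer_range` in the two passthrough stacks read like running layer totals, and the final config reassembles a 56-layer model from the merged slices. Here's a small sketch that checks that arithmetic. It assumes `layer_range` is half-open (`[start, end)`), which is the reading the comments agree with. The constant and function names are just for illustration.

```python
from itertools import accumulate

# layer_range pairs copied from the configs above, read as [start, end).
ANTEROS_V2_B = [(0, 6), (4, 8), (8, 12), (6, 12), (10, 14),
                (12, 16), (14, 28), (21, 31), (28, 32)]
ANTEROS_V2_C = [(0, 4), (2, 10), (8, 12), (10, 16), (12, 20), (14, 18),
                (16, 20), (18, 22), (20, 24), (23, 31), (30, 32)]
FINAL_STACK = [(0, 2), (0, 15), (0, 25), (0, 10), (52, 56)]


def running_layer_counts(ranges):
    """Cumulative number of layers after each slice is stacked."""
    return list(accumulate(end - start for start, end in ranges))


if __name__ == "__main__":
    for name, ranges in [("anterosv2.b", ANTEROS_V2_B),
                         ("anterosv2.c", ANTEROS_V2_C),
                         ("final stack", FINAL_STACK)]:
        print(name, running_layer_counts(ranges))
```

Both intermediate stacks and the final stack end at 56 layers, so the `[2, 17]`, `[17, 42]`, `[42, 52]` and `[52, 56]` cuts line up on either side.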
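The list-valued `weight` entries in the `dare_linear` and `della` steps are gradients: mergekit interpolates the listed values across the layers of the slice. Every pair of lists in the config is complementary (0.1 with 0.9, 0.5 with 0.5, and so on), so even with `normalize: false` the two weights add up to 1 at every layer. Here's a rough sketch of that. It assumes plain linear interpolation with evenly spaced anchors, and the exact layer-to-position mapping mergekit uses may differ. `gradient_value` is a hypothetical helper, not a mergekit function.

```python
def gradient_value(anchors, layer_index, num_layers):
    """Linearly interpolate a gradient list at one layer.

    Assumption: anchors are spread evenly from the first layer to the last.
    """
    if len(anchors) == 1 or num_layers == 1:
        return float(anchors[0])
    # Map the layer onto the anchor axis, e.g. 0..2 for three anchors.
    position = layer_index / (num_layers - 1) * (len(anchors) - 1)
    low = min(int(position), len(anchors) - 2)
    frac = position - low
    return anchors[low] * (1 - frac) + anchors[low + 1] * frac


if __name__ == "__main__":
    # Weights from the 2-17 dare_linear step, which covers 15 layers.
    weights_c = [0.1, 0.5, 0.3]
    weights_b = [0.9, 0.5, 0.7]
    for layer in range(15):
        c = gradient_value(weights_c, layer, 15)
        b = gradient_value(weights_b, layer, 15)
        print(f"layer {layer:2d}: c={c:.3f} b={b:.3f} sum={c + b:.3f}")
```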