Llama-3.2-Kapusta-3B-v8
Small and useful.
This is an interesting merge of 14 cool models, created using mergekit. Enjoy exploring :)
Merge Details
Method
This model was merged using a multistep process, then remerged with several model variations for the best result.
Models
The following models were included in the merge:
- bunnycore/Llama-3.2-3B-Creative
- bunnycore/Llama-3.2-3B-Mix
- bunnycore/Llama-3.2-3B-Pure-RP
- bunnycore/Llama-3.2-3B-Stock
- bunnycore/Llama-3.2-3B-TitanFusion-v2
- CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct
- Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
- Hastagaras/L3.2-JametMini-3B-MK.III
- huihui-ai/Llama-3.2-3B-Instruct-abliterated
- Lyte/Llama-3.2-3B-Overthinker
- passing2961/Thanos-3B
- SaisExperiments/Evil-Alpaca-3B-L3.2
- ValiantLabs/Llama3.2-3B-Enigma
- ValiantLabs/Llama3.2-3B-ShiningValiant2
Configuration
The following YAML configurations were used to produce this model:
# A-3B-v1
models:
- model: Hastagaras/L3.2-JametMini-3B-MK.III
- model: huihui-ai/Llama-3.2-3B-Instruct-abliterated
merge_method: model_stock
base_model: Lyte/Llama-3.2-3B-Overthinker
dtype: bfloat16
# B-3B-v1
models:
- model: ValiantLabs/Llama3.2-3B-ShiningValiant2
- model: bunnycore/Llama-3.2-3B-Stock
merge_method: model_stock
base_model: Lyte/Llama-3.2-3B-Overthinker
dtype: bfloat16
# C-3B-v1
models:
- model: bunnycore/Llama-3.2-3B-Pure-RP
- model: CarrotAI/Llama-3.2-Rabbit-Ko-3B-Instruct
merge_method: model_stock
base_model: ValiantLabs/Llama3.2-3B-ShiningValiant2
dtype: bfloat16
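The three intermediate merges above all use `model_stock`. As a rough illustration of the idea, and not mergekit's actual implementation, the sketch below applies the Model Stock interpolation ratio t = k·cosθ / (1 + (k−1)·cosθ) to toy weight vectors, where θ is the angle between the fine-tuned models' offsets from the base. The function name `model_stock_merge`, the plain-list representation, and the averaging of pairwise cosines for more than two models are assumptions made for this example.

```python
import math
from itertools import combinations


def model_stock_merge(base, finetuned):
    """Toy Model Stock merge of k >= 2 fine-tuned weight vectors around a base."""
    k = len(finetuned)
    if k < 2:
        raise ValueError("need at least two fine-tuned models")

    # Task vectors: offset of each fine-tuned model from the base.
    deltas = [[w - b for w, b in zip(model, base)] for model in finetuned]

    def cosine(u, v):
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
        return sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0

    # Average pairwise cosine between task vectors (assumption for k > 2).
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(cosine(u, v) for u, v in pairs) / len(pairs)

    # Interpolation ratio between the base and the fine-tuned average.
    denom = 1 + (k - 1) * cos_theta
    t = k * cos_theta / denom if abs(denom) > 1e-12 else 0.0

    average = [sum(col) / k for col in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(average, base)]
```

When the offsets point in unrelated directions the ratio shrinks toward 0 and the result stays near the base; when they agree it approaches the plain average of the fine-tuned models.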
# Llama-3.2-Kapusta-3B-v1
models:
- model: A-3B-v1
  parameters:
    density: [0.8, 0.5, 0.2]
    weight: 0.8
- model: B-3B-v1
  parameters:
    density: [0.2, 0.8, 0.2]
    weight: 0.25
- model: C-3B-v1
  parameters:
    density: [0.2, 0.5, 0.8]
    weight: 0.6
merge_method: ties
base_model: bunnycore/Llama-3.2-3B-Mix
dtype: bfloat16
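The list-valued `density` settings above are gradients: mergekit spreads such a list across the layer stack and interpolates between the anchor values. A minimal sketch of that interpolation follows, assuming evenly spaced anchors and a hypothetical helper `expand_gradient`. The 28-layer count is an assumption about Llama 3.2 3B, and mergekit's exact per-tensor mapping may differ.

```python
def expand_gradient(anchors, num_layers):
    """Linearly interpolate evenly spaced anchor values across num_layers layers."""
    if len(anchors) == 1 or num_layers == 1:
        return [float(anchors[0])] * num_layers
    values = []
    for layer in range(num_layers):
        # Position of this layer in anchor-index space, from 0 to len(anchors) - 1.
        pos = layer / (num_layers - 1) * (len(anchors) - 1)
        lo = min(int(pos), len(anchors) - 2)
        frac = pos - lo
        values.append(anchors[lo] * (1 - frac) + anchors[lo + 1] * frac)
    return values


# Density of A-3B-v1 per layer, assuming 28 transformer layers.
densities = expand_gradient([0.8, 0.5, 0.2], 28)
```

Read this way, A-3B-v1 contributes most densely to the early layers, B-3B-v1 to the middle, and C-3B-v1 to the late layers.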
# Llama-3.2-Kapusta-3B-v2
models:
- model: SaisExperiments/Evil-Alpaca-3B-L3.2
- model: bunnycore/Llama-3.2-3B-TitanFusion-v2
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v1
dtype: bfloat16
# Llama-3.2-Kapusta-3B-v3
models:
- model: F:/3b/Llama-3.2-Kapusta-3B-v2
  parameters:
    weight: [0.5, 0.6, 0.4, 0.7, 0.3, 0.8, 0.2, 0.9, 0.1, 0.9, 0.1, 0.9, 0.1, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5]
    density: [0.2, 0.8, 0.2]
merge_method: della
parameters:
  epsilon: 0.1
  lambda: 0.5
base_model: F:/3b/Llama-3.2-Kapusta-3B-v1
dtype: bfloat16
# Llama-3.2-Kapusta-3B-v4A | della
models:
- model: F:/3b/Llama-3.2-Kapusta-3B-v1
  parameters:
    weight: 0.6
    density: 0.5
- model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
  parameters:
    weight: 0.5
    density: [0.4, 0.3, 0.3, 0.3]
- model: bunnycore/Llama-3.2-3B-Creative
  parameters:
    weight: 0.3
    density: [0.3, 0.4, 0.3, 0.3]
- model: ValiantLabs/Llama3.2-3B-Enigma
  parameters:
    weight: 0.3
    density: [0.3, 0.3, 0.4, 0.3]
- model: passing2961/Thanos-3B
  parameters:
    weight: 0.3
    density: [0.3, 0.3, 0.3, 0.4]
merge_method: della
parameters:
  epsilon: 0.2
  lambda: 0.5
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16
# Llama-3.2-Kapusta-3B-v4B | breadcrumbs
models:
- model: F:/3b/Llama-3.2-Kapusta-3B-v1
  parameters:
    weight: 0.6
    density: 0.5
- model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
  parameters:
    weight: 0.5
    density: [0.4, 0.3, 0.3, 0.3]
- model: bunnycore/Llama-3.2-3B-Creative
  parameters:
    weight: 0.3
    density: [0.3, 0.4, 0.3, 0.3]
- model: ValiantLabs/Llama3.2-3B-Enigma
  parameters:
    weight: 0.3
    density: [0.3, 0.3, 0.4, 0.3]
- model: passing2961/Thanos-3B
  parameters:
    weight: 0.3
    density: [0.3, 0.3, 0.3, 0.4]
merge_method: breadcrumbs
parameters:
  gamma: 0.02
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16
# Llama-3.2-Kapusta-3B-v4C | dare_ties
models:
- model: F:/3b/Llama-3.2-Kapusta-3B-v1
  parameters:
    weight: 0.6
    density: 0.5
- model: Devarui379/VersatiLlama-Llama-3.2-3B-Instruct-Abliterated
  parameters:
    weight: 0.5
    density: [0.4, 0.3, 0.3, 0.3]
- model: bunnycore/Llama-3.2-3B-Creative
  parameters:
    weight: 0.3
    density: [0.3, 0.4, 0.3, 0.3]
- model: ValiantLabs/Llama3.2-3B-Enigma
  parameters:
    weight: 0.3
    density: [0.3, 0.3, 0.4, 0.3]
- model: passing2961/Thanos-3B
  parameters:
    weight: 0.3
    density: [0.3, 0.3, 0.3, 0.4]
merge_method: dare_ties
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16
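The v4C variant reuses the same models and weights as v4A and v4B but sparsifies each delta the DARE way: entries are dropped at random and the survivors are rescaled by 1/density, so the expected delta is unchanged. A toy sketch of that drop-and-rescale step is shown below, assuming a hypothetical helper `dare_sparsify` that works on plain lists. The TIES sign-election step that follows in `dare_ties` is omitted, and this is not mergekit's own code.

```python
import random


def dare_sparsify(delta, density, seed=0):
    """Keep each delta entry with probability `density` and rescale survivors by 1/density."""
    if not 0 < density <= 1:
        raise ValueError("density must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the toy example is reproducible
    return [d / density if rng.random() < density else 0.0 for d in delta]


# With density 0.5 roughly half of the entries survive, each doubled.
sparse = dare_sparsify([0.1, -0.2, 0.3, 0.05], density=0.5)
```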
# Llama-3.2-Kapusta-3B-v5
models:
- model: F:/3b/Llama-3.2-Kapusta-3B-v4A
- model: F:/3b/Llama-3.2-Kapusta-3B-v4B
- model: F:/3b/Llama-3.2-Kapusta-3B-v4C
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v3
dtype: bfloat16
# Llama-3.2-Kapusta-3B-v6
models:
- model: F:/3b/Llama-3.2-Kapusta-3B-v3
merge_method: slerp
base_model: F:/3b/Llama-3.2-Kapusta-3B-v5
dtype: bfloat16
parameters:
  t: [0.5, 0.6, 0.4, 0.7, 0.3, 0.8, 0.2, 0.8, 0.2, 0.7, 0.3, 0.6, 0.4, 0.5]
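The v6 step interpolates between v5 (the base) and v3 with `slerp`, where `t` sets how far to move from the base toward the other model at each depth. The sketch below shows spherical linear interpolation on toy vectors. It assumes non-zero inputs and a hypothetical standalone `slerp` function, and it is an illustration of the formula and not mergekit's implementation.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation from v0 (t=0) to v1 (t=1); inputs must be non-zero."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Angle between the two vectors, computed from their normalized dot product.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    omega = math.acos(dot)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * omega) / sin_omega
    s1 = math.sin(t * omega) / sin_omega
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]


halfway = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

The alternating `t` values in the configuration swing between favoring v5 and favoring v3 from one depth range to the next, instead of applying a single blend everywhere.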
# Llama-3.2-Kapusta-3B-v7
models:
- model: F:/3b/Llama-3.2-Kapusta-3B-v1
- model: F:/3b/Llama-3.2-Kapusta-3B-v2
- model: F:/3b/Llama-3.2-Kapusta-3B-v3
- model: F:/3b/Llama-3.2-Kapusta-3B-v4A
- model: F:/3b/Llama-3.2-Kapusta-3B-v4B
- model: F:/3b/Llama-3.2-Kapusta-3B-v4C
- model: F:/3b/Llama-3.2-Kapusta-3B-v5
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v6
dtype: bfloat16
# Llama-3.2-Kapusta-3B-v8
models:
- model: F:/3b/Llama-3.2-Kapusta-3B-v1
- model: F:/3b/Llama-3.2-Kapusta-3B-v3
- model: F:/3b/Llama-3.2-Kapusta-3B-v5
merge_method: model_stock
base_model: F:/3b/Llama-3.2-Kapusta-3B-v7
dtype: bfloat16
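Because each configuration consumes the output of earlier ones, the steps have to be run in dependency order. The sketch below records which intermediate merges each step reads (taken from the `models` and `base_model` entries above, with the `Llama-3.2-Kapusta-3B-` prefix shortened for readability) and derives a valid build order. The `STEPS` table and `build_order` helper are illustrative names and not part of mergekit.

```python
from graphlib import TopologicalSorter

# Step -> earlier merges it depends on. Models pulled straight from
# Hugging Face are left out because they need no prior merge step.
STEPS = {
    "A-3B-v1": [],
    "B-3B-v1": [],
    "C-3B-v1": [],
    "v1": ["A-3B-v1", "B-3B-v1", "C-3B-v1"],
    "v2": ["v1"],
    "v3": ["v1", "v2"],
    "v4A": ["v1", "v3"],
    "v4B": ["v1", "v3"],
    "v4C": ["v1", "v3"],
    "v5": ["v3", "v4A", "v4B", "v4C"],
    "v6": ["v3", "v5"],
    "v7": ["v1", "v2", "v3", "v4A", "v4B", "v4C", "v5", "v6"],
    "v8": ["v1", "v3", "v5", "v7"],
}


def build_order(steps):
    """Return the steps in an order where every dependency comes first."""
    return list(TopologicalSorter(steps).static_order())


order = build_order(STEPS)
print(" -> ".join(order))
```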
My thanks to the authors of the original models. Your work is incredible. Have a good time 🖤