---
base_model:
- nothingiisreal/L3.1-8B-Celeste-V1.5
- Sao10K/Llama-3.1-8B-Stheno-v3.4
- Sao10K/L3.1-8B-Niitama-v1.1
- arcee-ai/Llama-3.1-SuperNova-Lite
- akjindal53244/Llama-3.1-Storm-8B
- arcee-ai/Llama-Spark
- grimjim/Llama-3-Instruct-abliteration-LoRA-8B
- crestf411/sunfall-peft
tags:
- llama
- merge
- llama3
- mixtral
library_name: transformers
---

# Llama-3.1-Celestial-Stone-2x8B (BF16)

* *Mixture of Experts (~14B parameters).*

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/lBrXRa3sVRinE3cabs-oQ.png)

Both experts are used in tandem for every generated token.

---

*The first expert* is an Instruct-405B-distillation + RP-vector merge (SuperNova-Lite, Niitama 1.1, Storm).

*The second expert* is an ERP/Reddit-data merge (Celeste 1.5, Stheno 3.4, Storm).

---

*The base model* is Sao10K/Llama-3.1-8B-Stheno-v3.4 with the Sunfall v0.6.1 LoRA applied, so that it better understands SillyTavern prompts and storywriting.

---

# Prompt Template:

```bash
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{output}<|eot_id|>
```

# Recipe (I'm sorry...):

*Step 1: nearswap (Niitama 1.1 + Storm), producing L3.1-Niitorm-8B-t0.0001 (used below):*

```yaml
slices:
  - sources:
      - model: Sao10K/L3.1-8B-Niitama-v1.1+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
        layer_range: [0, 32]
      - model: akjindal53244/Llama-3.1-Storm-8B
        layer_range: [0, 32]
merge_method: nearswap
base_model: Sao10K/L3.1-8B-Niitama-v1.1+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
parameters:
  t:
    - value: 0.0001
dtype: bfloat16
out_type: float16 #oops
```

*Step 2: slerp (Stheno 3.4 abliterated + Storm), producing L3.1-Sthenorm-8B (used below):*

```yaml
slices:
  - sources:
      - model: v000000/Llama-3.1-8B-Stheno-v3.4-abliterated
        layer_range: [0, 32]
      - model: akjindal53244/Llama-3.1-Storm-8B
        layer_range: [0, 32]
merge_method: slerp
base_model: v000000/Llama-3.1-8B-Stheno-v3.4-abliterated
parameters:
  t:
    - filter: self_attn
      value: [0.1, 0.6, 0.3, 0.8, 0.5]
    - filter: mlp
      value: [0.9, 0.4, 0.7, 0.2, 0.5]
    - value: 0.5
dtype: float32
```

*Step 3: task_arithmetic (SuperNova-Lite + a scaled Niitorm task vector):*

```yaml
models:
  - model: arcee-ai/Llama-3.1-SuperNova-Lite
    parameters:
      weight: 1.0
  - model: v000000/L3.1-Niitorm-8B-t0.0001
    parameters:
      weight: 0.4
merge_method: task_arithmetic
base_model: arcee-ai/Llama-3.1-SuperNova-Lite
parameters:
  normalize: false
dtype: float16
```

*Step 4: task_arithmetic (the Niitorm task vector alone, over-weighted at 1.25):*

```yaml
models:
  - model: arcee-ai/Llama-3.1-SuperNova-Lite
    parameters:
      weight: 0.0
  - model: v000000/L3.1-Niitorm-8B-t0.0001
    parameters:
      weight: 1.25
merge_method: task_arithmetic
base_model: arcee-ai/Llama-3.1-SuperNova-Lite
parameters:
  normalize: false
dtype: float16
```

*Step 5: slerp of the two task-arithmetic tests:*

```yaml
models:
  - model: v000000/L3.1-8B-RP-Test-003-Task_Arithmetic
merge_method: slerp
base_model: v000000/L3.1-8B-RP-Test-002-Task_Arithmetic+grimjim/Llama-3-Instruct-abliteration-LoRA-8B # This model needed some abliteration^
parameters:
  t:
    - value: [0, 0, 0.3, 0.4, 0.5, 0.6, 0.5, 0.4, 0.3, 0, 0]
dtype: float16
```

*Step 6: task_arithmetic (Celeste 1.5 + Sthenorm):*

```yaml
base_model: nothingiisreal/L3.1-8B-Celeste-V1.5+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
dtype: bfloat16
merge_method: task_arithmetic
parameters:
  normalize: false
slices:
  - sources:
      - layer_range: [0, 32]
        model: nothingiisreal/L3.1-8B-Celeste-V1.5+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
        parameters:
          weight: 0.7
      - layer_range: [0, 32]
        model: v000000/L3.1-Sthenorm-8B
        parameters:
          weight: 0.2
      - layer_range: [0, 32]
        model: nothingiisreal/L3.1-8B-Celeste-V1.5
        parameters:
          weight: 0.2
```
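If you don't speak mergekit, the three merge methods above boil down to fairly simple per-tensor arithmetic. Below is a rough numpy sketch based on the mergekit documentation and the original NearSwap write-up; it is an illustration of the idea, not mergekit's actual code, and all function names here are mine.

```python
import numpy as np

def task_arithmetic(base, models, weights):
    # merged = base + sum_i w_i * (model_i - base)
    # With `normalize: false` the weights are applied as-is, so steps 3/4
    # add a scaled "task vector" (Ilharco et al.) on top of the base model.
    return base + sum(w * (m - base) for w, m in zip(weights, models))

def slerp(a, b, t, eps=1e-8):
    # Spherical linear interpolation between two flattened weight tensors.
    an = a / (np.linalg.norm(a) + eps)
    bn = b / (np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(np.dot(an, bn), -1.0, 1.0))
    if omega < eps:  # nearly parallel tensors: plain lerp is fine
        return (1 - t) * a + t * b
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

def nearswap(base, secondary, t):
    # Keep the base weights, but where the two models are already close,
    # interpolate toward the secondary model; t is the "sameness" threshold,
    # so the tiny t=0.0001 in step 1 only nudges near-identical weights.
    diff = np.abs(base - secondary)
    lweight = np.ones_like(diff)
    np.divide(t, diff, out=lweight, where=diff > 0.0)  # untouched where diff == 0
    lweight = np.clip(lweight, 0.0, 1.0)
    return (1 - lweight) * base + lweight * secondary

# Toy demo on a fake 8-element "tensor":
rng = np.random.default_rng(0)
base = rng.normal(size=8)
other = base + rng.normal(scale=1e-3, size=8)
print(task_arithmetic(base, [other], [1.25]))  # step 4's over-weighted vector
print(slerp(base, other, t=0.5))
print(nearswap(base, other, t=0.0001))
```

Note how the v-shaped `t` curve in step 5 (`[0, 0, 0.3, ..., 0, 0]`) slerps the middle layers hardest while the outermost layers stay with the base model, and how step 2's per-filter curves interpolate the attention and MLP weights on different schedules.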
*Step 7: assemble the MoE (mergekit-moe):*

```yaml
base_model: crestf411/L3.1-8B-sunfall-stheno-v0.6.1
experts_per_token: 2
local_experts: 2
gate_mode: random
dtype: bfloat16
experts:
  - source_model: v000000/L3.1-Storniitova-8B
  - source_model: x0000001/l3.1-part_aaa
```

`gate_mode: random` initializes the router with random weights, which is harmless here: with `experts_per_token: 2` and only two experts, every token passes through both experts regardless of the gate's scores, which is why both experts run in tandem.
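For completeness, a minimal inference sketch with transformers. The repo id is an assumption (point it at wherever this model is actually hosted), and the prompt string simply reproduces the template above:

```python
# Minimal inference sketch; the repo id below is assumed, adjust as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "v000000/L3.1-Celestial-Stone-2x8B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

# Reproduce the prompt template from this card verbatim.
prompt = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "You are a storyteller.<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "Open a space western in one paragraph.<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

# add_special_tokens=False because the template already has <|begin_of_text|>.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

Since both experts fire on every token, expect compute cost closer to a dense ~14B model than to a single 8B.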