LLaMa-3-Stheno-v3.2-15B - EXL2 8.08bpw

This is a 8bpw EXL2 quant of PJMixers/LLaMa-3-Stheno-v3.2-15B

This quant was made using exllamav2-0.0.21 with Bluemoon-light dataset for RP.

I tested this quant shortly in some random RPs (including one over 8k context - with RoPE scaling as recommended in webui, maybe with alpha_value a bit higher) and it seems to work fine.

During quanting I noticed that a lot of layers in the middle of the model had suspiciously low error values, this resulted in lower quant size as the script must have thought that these layers weren't important and used lower bpw for them. Despite this the model seems to work well, at leat for me.

Prompt Templates

Seems to use llama3 prompt template.

Original readme below

merge

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the task arithmetic merge method using elinas/Llama-3-15B-Instruct-zeroed as a base.

Models Merged

The following models were included in the merge:

Configuration

The following YAML configuration was used to produce this model:

merge_method: task_arithmetic
dtype: bfloat16
base_model: elinas/Llama-3-15B-Instruct-zeroed
models:
  - model: elinas/Llama-3-15B-Instruct-ft-v2
    parameters:
      weight: 1.0
  - model: PJMixers/LLaMa-3-Stheno-v3.2-Zeroed-15B
    parameters:
      weight: 1.0

DeusImperator
/

LLaMa-3-Stheno-v3.2-15B_exl2_8.05bpw_rpcal_mk2