---
base_model:
  - unsloth/Mistral-Small-Instruct-2409
  - TheDrummer/Cydonia-22B-v1.2
  - Gryphe/Pantheon-RP-Pure-1.6.2-22b-Small
  - anthracite-org/magnum-v4-22b
  - ArliAI/Mistral-Small-22B-ArliAI-RPMax-v1.1
  - spow12/ChatWaifu_v2.0_22B
  - rAIfle/Acolyte-22B
  - Envoid/Mistral-Small-NovusKyver
  - InferenceIllusionist/SorcererLM-22B
library_name: transformers
tags:
  - mergekit
  - merge
license: other
language:
  - en
---

Schisandra

Many thanks to the authors of the models used!

RPMax v1.1 | Pantheon-RP | Cydonia v1.2 | Magnum V4 | ChatWaifu v2.0 | SorcererLM | Acolyte | NovusKyver


The new version writes better and doesn't mispronounce names anymore!

https://huggingface.co/Nohobby/MS-Schisandra-22B-v0.2


Overview

Main uses: RP, Storywriting

A merge of 8 Mistral Small finetunes in total, which was then merged back into the original model to make it less stupid. It worked, somehow: it's definitely smarter than my previous MS merge, and maybe than some finetunes. It adheres strongly to the writing style of the previous output, so you'll need either a good character card or an existing chat to get better replies.


Quants

Static

Imatrix


Settings

Prompt format: Mistral-V3 Tekken
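
For reference, a V3-Tekken prompt wraps each user turn in [INST]...[/INST] tags with no surrounding spaces. A minimal sketch using transformers, assuming the repo's bundled chat template matches that format (the repo id is illustrative):

```python
# Minimal sketch: render a V3-Tekken-style prompt from a chat history.
# Assumes the bundled chat template matches the format named above;
# the repo id is illustrative.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Nohobby/MS-Schisandra-22B-v0.1")
messages = [{"role": "user", "content": "Continue the scene in the tavern."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # roughly: <s>[INST]Continue the scene in the tavern.[/INST]
```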

Samplers: These or These


Merge Details

Merging steps

QCmix

base_model: InferenceIllusionist/SorcererLM-22B
parameters:
  int8_mask: true
  rescale: true
  normalize: false
dtype: bfloat16
tokenizer_source: base
merge_method: della
models:
  - model: Envoid/Mistral-Small-NovusKyver
    parameters:
      density: [0.35, 0.65, 0.5, 0.65, 0.35]
      epsilon: [0.1, 0.1, 0.25, 0.1, 0.1]
      lambda: 0.85
      weight: [-0.01891, 0.01554, -0.01325, 0.01791, -0.01458]
  - model: rAIfle/Acolyte-22B
    parameters:
      density: [0.6, 0.4, 0.5, 0.4, 0.6]
      epsilon: [0.15, 0.15, 0.25, 0.15, 0.15]
      lambda: 0.85
      weight: [0.01768, -0.01675, 0.01285, -0.01696, 0.01421]
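
Each block in this section is a mergekit config. As a sketch of how one step could be run, assuming the config above is saved as qcmix.yaml (this uses mergekit's documented Python entry point; the equivalent CLI call is mergekit-yaml qcmix.yaml ./QCmix):

```python
# Sketch: run the QCmix step with mergekit's Python API.
# "qcmix.yaml" and "./QCmix" are illustrative names; the later steps
# refer to this output directory by the short name in their configs.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("qcmix.yaml", encoding="utf-8") as f:
    config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    config,
    "./QCmix",
    options=MergeOptions(cuda=False, copy_tokenizer=True),
)
```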

Schisandra-vA

merge_method: della_linear
dtype: bfloat16
parameters:
  normalize: true
  int8_mask: true
tokenizer_source: union
base_model: TheDrummer/Cydonia-22B-v1.2
models:
  - model: ArliAI/Mistral-Small-22B-ArliAI-RPMax-v1.1
    parameters:
      density: 0.55
      weight: 1
  - model: Gryphe/Pantheon-RP-Pure-1.6.2-22b-Small
    parameters:
      density: 0.55
      weight: 1
  - model: spow12/ChatWaifu_v2.0_22B
    parameters:
      density: 0.55
      weight: 1
  - model: anthracite-org/magnum-v4-22b
    parameters:
      density: 0.55
      weight: 1
  - model: QCmix
    parameters:
      density: 0.55
      weight: 1
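
With normalize: true and every donor at weight: 1, each of the five models contributes an equal fifth of its density-pruned delta on top of Cydonia. A toy sketch of that arithmetic (real della_linear uses DELLA's magnitude-based stochastic pruning and rescaling, omitted here):

```python
# Toy illustration of della_linear's normalized weighted-delta average.
# Real mergekit prunes each delta stochastically by magnitude (density
# 0.55 above) and rescales the survivors; this only shows the blending.
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=8)                        # stand-in for Cydonia
deltas = [rng.normal(size=8) for _ in range(5)]  # finetune minus base
weights = np.ones(5)                             # weight: 1 per donor
weights /= weights.sum()                         # normalize: true -> 0.2 each

merged = base + sum(w * d for w, d in zip(weights, deltas))
```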

Schisandra

dtype: bfloat16
tokenizer_source: base
merge_method: della_linear
parameters:
  density: 0.5
base_model: Schisandra-vA
models:
  - model: unsloth/Mistral-Small-Instruct-2409
    parameters:
      weight:
        - filter: v_proj
          value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
        - filter: o_proj
          value: [1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1]
        - filter: up_proj
          value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
        - filter: gate_proj
          value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
        - filter: down_proj
          value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        - value: 0
  - model: Schisandra-vA
    parameters:
      weight:
        - filter: v_proj
          value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
        - filter: o_proj
          value: [0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0]
        - filter: up_proj
          value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        - filter: gate_proj
          value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
        - filter: down_proj
          value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
        - value: 1
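
The bracketed lists are layer gradients: mergekit interpolates the anchor values linearly across the model's depth, and the two sources use complementary masks (each pair of lists sums to 1), so for every projection one model supplies the middle layers and the other the outer ones. A rough sketch of that interpolation, assuming Mistral Small's 56 layers:

```python
# Sketch: expand a gradient list like [0, 0, 1, ...] into per-layer weights.
# mergekit interpolates anchors linearly across depth; 56 layers is an
# assumption matching Mistral-Small-Instruct-2409.
import numpy as np

def per_layer(anchors, num_layers=56):
    anchor_pos = np.linspace(0.0, 1.0, len(anchors))
    layer_pos = np.linspace(0.0, 1.0, num_layers)
    return np.interp(layer_pos, anchor_pos, anchors)

v_proj_instruct = per_layer([0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0])
# ~0 at the first and last layers, ramping to 1 through the middle
```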