---
license: other
license_name: yi-34b
license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
---

# Merged-Vicuna-RP-Stew-34B

This is a merge of pre-trained language models created using mergekit.

## Merge Details

Merge of 4 (technically 5) models that all use some variant of the Vicuna prompting template, for cohesion's sake. Besides being decent models on their own, Capybara was weighted highest for its general aptitude and for preserving the longer context length, Tess-1.5 contributes better character/lore understanding, Nontoxic-Bagel SLERPed with PiVoT-SUS-RP (done separately from the main merge) adds chat/RP and storytelling diversity, while Nyakura is in for even better chat/RP engagement.

It's not perfect, but at the very least I personally prefer using this over base Capybara or its RP version from the Doc during my run-throughs, so I figured it was worth uploading here for now. I would only use this for creative conversations or storytelling endeavors, not so much for coding or really tough math problems. The final merging recipe/percentages were chosen for stability after dozens of what I consider failed attempts during my private testing.

Big thanks to the original model creators, with special thanks to brucethemoose for some general ideas and for helping me troubleshoot mergekit, plus SanjiWatsuki for the original merging methodology used here as well!

## Prompt Format: Orca-Vicuna

```
SYSTEM: <ANY SYSTEM CONTEXT>
USER: 
ASSISTANT: 
```
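
Below is a minimal sketch of applying this template with the `transformers` library. The repository id, example prompts, and sampling settings are illustrative assumptions, not recommendations from testing.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ParasiticRogue/Merged-Vicuna-RP-Stew-34B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# Build an Orca-Vicuna style prompt: SYSTEM / USER / ASSISTANT on separate lines.
system = "You are a creative writing partner."
user = "Write the opening scene of a rainy-night detective story."
prompt = f"SYSTEM: {system}\nUSER: {user}\nASSISTANT:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.8)

# Print only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Note that `device_map="auto"` requires the `accelerate` package; adjust dtype and sampling to taste.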

## Models Merged

The following models were included in the merge:

- https://huggingface.co/migtissera/Tess-34B-v1.5b
- https://huggingface.co/NousResearch/Nous-Capybara-34B
- https://huggingface.co/jondurbin/nontoxic-bagel-34b-v0.2
- https://huggingface.co/maywell/PiVoT-SUS-RP
- https://huggingface.co/Sao10K/NyakuraV2-34B-Yi-Llama
- https://huggingface.co/chargoddard/Yi-34B-200K-Llama

## Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Tess-34B-v1.5b
    parameters:
      weight: 0.28
      density: 0.66
  - model: Nous-Capybara-34B-V1.9
    parameters:
      weight: 0.34
      density: 0.78
  - model: Nontoxic-PiVoT-Bagel-RP-34B
    parameters:
      weight: 0.22
      density: 0.54
  - model: NyakuraV2-34B-Yi-Llama
    parameters:
      weight: 0.16
      density: 0.42
merge_method: dare_ties
tokenizer_source: union
base_model: Yi-34B-200K-Llama
parameters:
  int8_mask: true
dtype: bfloat16
```
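
If you want to reproduce the merge, a rough sketch using mergekit's Python API is shown below. The config filename and output path are placeholders, and import paths or options may differ slightly between mergekit versions; the `mergekit-yaml` command-line tool is an equivalent alternative.

```python
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the YAML recipe above from disk (placeholder filename).
with open("rp-stew-merge.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the DARE-TIES merge and write the result to the output directory.
run_merge(
    merge_config,
    out_path="./Merged-Vicuna-RP-Stew-34B",
    options=MergeOptions(cuda=True, copy_tokenizer=True, lazy_unpickle=True),
)
```

The model names in the recipe are local directory names, so the source models need to be downloaded first (or the names replaced with their full Hugging Face ids).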