---
license: other
license_name: yi-license
license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
tags:
- merge
---
# Kyllene 34B v1.1

![image/png](https://huggingface.co/TeeZee/Kyllene-34B-v1.1/resolve/main/Kyllene_v1.1.jpg)


## Model Details

- The result of a new merge method provided by the [MergeMonster](https://github.com/Gryphe/MergeMonster/) tool, using an extended RPG preset.
- Models used for the merge:
  - [jondurbin/bagel-dpo-34b-v0.2](https://huggingface.co/jondurbin/bagel-dpo-34b-v0.2)
  - [NousResearch/Nous-Capybara-34B](https://huggingface.co/NousResearch/Nous-Capybara-34B)
  - [NousResearch/Nous-Hermes-2-Yi-34B](https://huggingface.co/NousResearch/Nous-Hermes-2-Yi-34B)
  - [SUSTech/SUS-Chat-34B](https://huggingface.co/SUSTech/SUS-Chat-34B)
- The method aims to maximize the probability of certain phrases and minimize the probability of others.
- The RPG preset was extended with examples of typical, nonsensical output that most models produce, such as 'unbreakable bond' and 'send shivers down her spine'.
- The resulting model has approximately 34 billion parameters.
- See [merge-config.yml](https://huggingface.co/TeeZee/Kyllene-34B-v1.1/resolve/main/merge-config.yml) for details on the merge method used and the RPG preset.
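
To illustrate the idea of favoring some phrases and suppressing others, here is a toy sketch in plain Python. It is not MergeMonster's code: the probability tables are made up, `GOOD_PHRASES` is a hypothetical example, and the names `phrase_score`, `pick_best` and `table_model` are invented for this illustration. A real run would query an actual model for phrase probabilities.

```python
from typing import Callable, Dict, List

# Phrases this card names as examples of output to suppress.
BAD_PHRASES: List[str] = ["unbreakable bond", "send shivers down her spine"]
# Hypothetical phrase to favor; NOT taken from the actual preset.
GOOD_PHRASES: List[str] = ["she shrugged"]

ProbFn = Callable[[str], float]


def phrase_score(prob_of: ProbFn, bad: List[str], good: List[str]) -> float:
    """Lower is better: bad-phrase probability counts against a candidate,
    good-phrase probability counts in its favor."""
    return sum(prob_of(p) for p in bad) - sum(prob_of(p) for p in good)


def pick_best(candidates: Dict[str, ProbFn], bad: List[str], good: List[str]) -> str:
    """Return the name of the candidate with the lowest phrase score."""
    return min(candidates, key=lambda name: phrase_score(candidates[name], bad, good))


def table_model(table: Dict[str, float]) -> ProbFn:
    """Stand-in for a real model: look up a made-up probability per phrase."""
    return lambda phrase: table.get(phrase, 0.0)


# Two imaginary merge candidates with invented phrase probabilities.
candidates = {
    "merge_a": table_model({
        "unbreakable bond": 0.30,
        "send shivers down her spine": 0.20,
        "she shrugged": 0.10,
    }),
    "merge_b": table_model({
        "unbreakable bond": 0.05,
        "send shivers down her spine": 0.02,
        "she shrugged": 0.25,
    }),
}

best = pick_best(candidates, BAD_PHRASES, GOOD_PHRASES)
print(best)
```

In this toy setup the candidate that assigns less probability to the unwanted phrases and more to the wanted one is selected.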

**Warning: This model can produce NSFW content!**

## Results

- Produces SFW and NSFW content without issues and switches context seamlessly.
- 200K context length.
- Good at following instructions.
- Different from [TeeZee/Kyllene-57B-v1.0](https://huggingface.co/TeeZee/Kyllene-57B-v1.0), but also surprisingly entertaining (more tests are needed).

## Side notes

- The [MergeMonster](https://github.com/Gryphe/MergeMonster/) method works, but the project would benefit greatly from some more love from its developers.
- In its current state MergeMonster consumes insane amounts of RAM (256GB+) or VRAM and takes a really long time to process model data; this merge took 24 hours on 1x ADA6000.
- MergeMonster is not a silver bullet; other experiments have shown that it can also produce incredibly stupid models.
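
For a sense of scale on the memory figure, here is a back-of-the-envelope calculation. It assumes 16-bit weights (2 bytes per parameter), which this card does not state, and it makes no claim about how MergeMonster actually loads or caches the models; it only shows how large the raw weights of four 34B source models are.

```python
PARAMS = 34e9          # approximate parameter count stated in this card
BYTES_PER_PARAM = 2    # assumption: 16-bit (fp16/bf16) weights
SOURCE_MODELS = 4      # number of models listed as merge inputs


def weights_gb(params: float, bytes_per_param: int) -> float:
    """Size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


per_model_gb = weights_gb(PARAMS, BYTES_PER_PARAM)
all_sources_gb = per_model_gb * SOURCE_MODELS

print(f"one model: ~{per_model_gb:.0f} GB, four models: ~{all_sources_gb:.0f} GB")
```

Under these assumptions the weights alone come to roughly 68 GB per model and 272 GB for all four.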

All comments are greatly appreciated. Download, test, and if you appreciate my work, consider buying me my fuel:
<a href="https://www.buymeacoffee.com/TeeZee" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>