|
--- |
|
base_model: |
|
- RozGrov/NemoDori-v0.2-12B-MN-BT |
|
- unsloth/Mistral-Nemo-Instruct-2407 |
|
- UsernameJustAnother/Nemo-12B-Marlin-v5 |
|
- crestf411/nemo-sunfall-v0.6.1 |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
pipeline_tag: text-generation |
|
--- |
|
# NemoDori-v0.2.1-12B-MN-BT |
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
This is the first child of [NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT).
|
|
|
**The purpose** is to find a way to improve v0.2's ability to stay **aware of past conversation** and to **follow instructions better**, especially the most recent one (depth 0),

while keeping its **creativity and capability to (E)RP**.

This model is one of the few children that try to fulfill that.
|
|
|
In my very short testing so far, I haven't found anything that differs from the parent and is worth mentioning. But I think this version is **slightly degraded** somehow

(I can't quite pin it down; it just felt that way). Anyway, try it as you may, but I think its parent (**v0.2**) is **better** than this one.
|
|
|
The other child ([**v0.2.2**](https://huggingface.co/RozGrov/NemoDori-v0.2.2-12B-MN-ties)) is out.

I have tested it more than this model, and it seems to improve on this one, but its response format is not very consistent.
|
|
|
Feel free to give me feedback on anything, or to guide me on how I can fulfill my-*ahem* its purpose while keeping it well below 70B.
|
<br> |
|
Fine-tuning is... pretty expensive for me, and I'm not ready for that (yet, though I'm interested).
|
|
|
<p style="font-size: 11px; margin-top: 11px" id="heya-im-a-bit-of-a-programmer"> |
|
(Listen, between you and me, I still don't get it. I'm still learning this new hobby of mine, and it's kind of refreshing in a way.

I'll be exploring other architectures in the future. Yet, this is about how randomly I pick my straw, just to see how lucky I am.)

<br>

(Although, I am interested in learning how to make a new merge method,

similar to when I'm crafting a solution for a specific problem, just like the good ol' days.

<span style="color: darkred">But hell, this LLM stuff is really expensive.</span>)
|
</p> |
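
If you want to try it anyway, below is a minimal loading sketch using `transformers`. The repo ID is assumed from this card's title, and the chat template comes from the tokenizer (Mistral-Nemo-Instruct ships one); treat it as a starting point, not a reference implementation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo ID assumed from this card's title.
model_id = "RozGrov/NemoDori-v0.2.1-12B-MN-BT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge was produced in bfloat16
    device_map="auto",
)

# Build a prompt with the tokenizer's chat template and generate a reply.
messages = [{"role": "user", "content": "Hi! Do you remember what we talked about earlier?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```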
|
|
|
|
|
## Merge Details |
|
|
|
### Merge Method |
|
|
|
This model was merged using the `breadcrumbs_ties` merge method, with [RozGrov/NemoDori-v0.2-12B-MN-BT](https://huggingface.co/RozGrov/NemoDori-v0.2-12B-MN-BT) as the base.
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* [unsloth/Mistral-Nemo-Instruct-2407](https://huggingface.co/unsloth/Mistral-Nemo-Instruct-2407) |
|
* [UsernameJustAnother/Nemo-12B-Marlin-v5](https://huggingface.co/UsernameJustAnother/Nemo-12B-Marlin-v5) |
|
* [crestf411/nemo-sunfall-v0.6.1](https://huggingface.co/crestf411/nemo-sunfall-v0.6.1) |
|
|
|
### Configuration |
|
|
|
The following YAML configuration was used to produce this model: |
|
|
|
```yaml |
|
|
|
models: |
|
- model: crestf411/nemo-sunfall-v0.6.1 |
|
parameters: |
|
weight: 0.33 |
|
- model: UsernameJustAnother/Nemo-12B-Marlin-v5 |
|
parameters: |
|
weight: 0.2 |
|
- model: unsloth/Mistral-Nemo-Instruct-2407 |
|
parameters: |
|
weight: 0.37 |
|
- model: RozGrov/NemoDori-v0.2-12B-MN-BT |
|
parameters: |
|
weight: 1 |
|
merge_method: breadcrumbs_ties |
|
base_model: RozGrov/NemoDori-v0.2-12B-MN-BT |
|
parameters: |
|
density: 0.93 |
|
gamma: 0.015 |
|
dtype: bfloat16 |
|
|
|
``` |
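
To reproduce the merge, save the config above as `config.yml` and feed it to mergekit (the `mergekit-yaml` CLI is the usual route). Below is a minimal sketch of mergekit's Python API, following the example in its README; the paths are placeholders.

```python
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the YAML config shown above (placeholder path).
with open("config.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path="./NemoDori-v0.2.1-12B-MN-BT",  # placeholder output directory
    options=MergeOptions(
        cuda=False,           # set True to merge on GPU
        copy_tokenizer=True,  # copy the base model's tokenizer into the output
    ),
)
```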