---
datasets:
- Mihaiii/OpenHermes-2.5-1k-longest-curated
language:
- en
inference: false
library_name: transformers
tags:
- code
---

# NinjaMouse-2.4B-32L-danube-exl2

Original model: [NinjaMouse-2.4B-32L-danube](https://huggingface.co/trollek/NinjaMouse-2.4B-32L-danube)

Model creator: [trollek](https://huggingface.co/trollek)

## Quants

- [4bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/main)
- [4.25bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/4.25bpw-h6)
- [4.65bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/4.65bpw-h6)
- [5bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/5bpw-h6)
- [6bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/6bpw-h6)
- [8bpw h8](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/8bpw-h8)
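
Each quant lives on its own branch of this repo, so a specific bitrate can be pulled by branch name. A minimal sketch with `huggingface_hub` (the destination folder is just an example):

```python
# Download one quant branch by using its name as the revision.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="cgus/NinjaMouse-2.4B-32L-danube-exl2",
    revision="4.25bpw-h6",                        # branch from the list above; "main" holds the 4bpw h6 quant
    local_dir="NinjaMouse-2.4B-32L-danube-exl2",  # example destination folder
)
```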

## Quantization notes

Made with exllamav2 0.0.15 using the default calibration dataset. I'm not sure about the real usable context size of this model: for me it breaks down somewhere past 3000 tokens of context, at around 3500 or so, both with these quants and with the creator's GGUF files. At first I thought I had a quantization issue, but it's probably just the model itself.
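
For reference, a quant like these could be reproduced with exllamav2's bundled `convert.py`. The sketch below is an assumption based on the usual flags in exllamav2 0.0.15, not the exact command used; leaving out a calibration file (`-c`) is what "default dataset" means here.

```python
# Hypothetical reproduction of one of these quants via exllamav2's convert.py
# (flag names from exllamav2 ~0.0.15; run `python convert.py -h` to confirm).
import subprocess

subprocess.run(
    [
        "python", "convert.py",
        "-i", "NinjaMouse-2.4B-32L-danube",        # unquantized source model (assumed local path)
        "-o", "exl2-work",                         # scratch dir for measurement/conversion state
        "-cf", "NinjaMouse-2.4B-32L-danube-exl2",  # output dir for the finished quant
        "-b", "4.25",                              # target bits per weight
        "-hb", "6",                                # head bits (the "h6" in the branch names)
        # no -c/--cal_dataset flag, so the default calibration dataset is used
    ],
    check=True,
)
```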

## How to run

This quantization format runs on GPU and requires an ExLlamaV2 loader, which is available in the following applications:

- [Text Generation Webui](https://github.com/oobabooga/text-generation-webui)
- [KoboldAI](https://github.com/henk717/KoboldAI)
- [ExUI](https://github.com/turboderp/exui)
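
The quant can also be loaded directly with the exllamav2 Python package. This is a rough sketch rather than a tested recipe: the model path is a placeholder, the prompt format is assumed from h2o-danube's chat template, and `max_seq_len` is capped near 3k because of the context issue noted above.

```python
# Rough sketch of running this EXL2 quant with the exllamav2 Python API (~0.0.15).
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "NinjaMouse-2.4B-32L-danube-exl2"  # placeholder: folder containing the downloaded quant
config.prepare()
config.max_seq_len = 3072  # stay under the ~3000-3500 token range where the model seems to break

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7
settings.top_p = 0.9

prompt = "<|prompt|>Write a haiku about a ninja mouse.</s><|answer|>"  # assumed danube chat format
print(generator.generate_simple(prompt, settings, 200))
```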

# Original model card

# Model Card for NinjaMouse-2.4B-32L-danube

A lanky version of [h2o-danube](https://huggingface.co/h2oai/h2o-danube-1.8b-chat)'s tiny language model, stretched from 24 layers to 32. I have done this in steps, adding 2 new layers per step and training them on different datasets. This seems to have made it a quick learner, and it easily fits on an 8GB GPU for finetuning, using Unsloth for optimizations. This model is designed to be a gateway into bigger language models.
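
The card doesn't say how the new layers were initialized, but a common way to do this kind of depth stretch is to splice copies of existing decoder blocks into the stack and then train them. A hypothetical sketch with transformers, with made-up insertion points, not the author's actual procedure:

```python
# Hypothetical depth up-scaling step: grow a 24-layer danube-style model by 2 layers.
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "h2oai/h2o-danube-1.8b-chat", torch_dtype=torch.bfloat16
)

new_layers = torch.nn.ModuleList()
for i, layer in enumerate(model.model.layers):
    new_layers.append(layer)
    if i in (7, 15):  # illustrative positions for the two new blocks
        new_layers.append(copy.deepcopy(layer))  # initialize each new block as a copy of its neighbour

model.model.layers = new_layers
model.config.num_hidden_layers = len(new_layers)
# Note: per-layer bookkeeping such as self_attn.layer_idx may also need renumbering.
model.save_pretrained("danube-26L-step1")  # train the new layers before repeating the step
```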