---
datasets:
- Mihaiii/OpenHermes-2.5-1k-longest-curated
language:
- en
inference: false
library_name: transformers
tags:
- code
---

# NinjaMouse-2.4B-32L-danube-exl2

Original model: [NinjaMouse-2.4B-32L-danube](https://huggingface.co/trollek/NinjaMouse-2.4B-32L-danube)

Model creator: [trollek](https://huggingface.co/trollek)

## Quants

- [4bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/main)
- [4.25bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/4.25bpw-h6)
- [4.65bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/4.65bpw-h6)
- [5bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/5bpw-h6)
- [6bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/6bpw-h6)
- [8bpw h8](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/8bpw-h8)
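
Each quant lives on its own branch of this repo, so a specific bitrate can be pulled by branch name. A minimal sketch with `huggingface_hub` (the destination folder is just an example):

```python
# Download one quant branch by using its name as the revision.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="cgus/NinjaMouse-2.4B-32L-danube-exl2",
    revision="4.25bpw-h6",                        # branch from the list above; "main" holds the 4bpw h6 quant
    local_dir="NinjaMouse-2.4B-32L-danube-exl2",  # example destination folder
)
```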

## Quantization notes

Made with exllamav2 0.0.15 using the default calibration dataset. I'm not sure about the real usable context size of this model: for me it breaks down somewhere past 3000 tokens of context, at around 3500 or so, both with these quants and with the creator's GGUF files. At first I thought I had a quantization issue, but it's probably just the model itself.
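
For reference, a quant like these could be reproduced with exllamav2's bundled `convert.py`. The sketch below is an assumption based on the usual flags in exllamav2 0.0.15, not the exact command used; leaving out a calibration file (`-c`) is what "default dataset" means here.

```python
# Hypothetical reproduction of one of these quants via exllamav2's convert.py
# (flag names from exllamav2 ~0.0.15; run `python convert.py -h` to confirm).
import subprocess

subprocess.run(
    [
        "python", "convert.py",
        "-i", "NinjaMouse-2.4B-32L-danube",        # unquantized source model (assumed local path)
        "-o", "exl2-work",                         # scratch dir for measurement/conversion state
        "-cf", "NinjaMouse-2.4B-32L-danube-exl2",  # output dir for the finished quant
        "-b", "4.25",                              # target bits per weight
        "-hb", "6",                                # head bits (the "h6" in the branch names)
        # no -c/--cal_dataset flag, so the default calibration dataset is used
    ],
    check=True,
)
```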

## How to run

This quantization format runs on GPU and requires an ExLlamaV2 loader, which is available in the following applications:

- [Text Generation Webui](https://github.com/oobabooga/text-generation-webui)
- [KoboldAI](https://github.com/henk717/KoboldAI)
- [ExUI](https://github.com/turboderp/exui)
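
The quant can also be loaded directly with the exllamav2 Python package. This is a rough sketch rather than a tested recipe: the model path is a placeholder, the prompt format is assumed from h2o-danube's chat template, and `max_seq_len` is capped near 3k because of the context issue noted above.

```python
# Rough sketch of running this EXL2 quant with the exllamav2 Python API (~0.0.15).
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "NinjaMouse-2.4B-32L-danube-exl2"  # placeholder: folder containing the downloaded quant
config.prepare()
config.max_seq_len = 3072  # stay under the ~3000-3500 token range where the model seems to break

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.7
settings.top_p = 0.9

prompt = "<|prompt|>Write a haiku about a ninja mouse.</s><|answer|>"  # assumed danube chat format
print(generator.generate_simple(prompt, settings, 200))
```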

# Original model card

# Model Card for NinjaMouse-2.4B-32L-danube

A lanky version of [h2o-danube](https://huggingface.co/h2oai/h2o-danube-1.8b-chat)'s tiny language model, stretched from 24 layers to 32. I have done this in steps, adding 2 new layers per step and training them on different datasets. This seems to have made it a quick learner, and it easily fits on an 8GB GPU for finetuning, using Unsloth for optimizations. This model is designed to be a gateway into bigger language models.
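
The card doesn't say how the new layers were initialized, but a common way to do this kind of depth stretch is to splice copies of existing decoder blocks into the stack and then train them. A hypothetical sketch with transformers, with made-up insertion points, not the author's actual procedure:

```python
# Hypothetical depth up-scaling step: grow a 24-layer danube-style model by 2 layers.
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "h2oai/h2o-danube-1.8b-chat", torch_dtype=torch.bfloat16
)

new_layers = torch.nn.ModuleList()
for i, layer in enumerate(model.model.layers):
    new_layers.append(layer)
    if i in (7, 15):  # illustrative positions for the two new blocks
        new_layers.append(copy.deepcopy(layer))  # initialize each new block as a copy of its neighbour

model.model.layers = new_layers
model.config.num_hidden_layers = len(new_layers)
# Note: per-layer bookkeeping such as self_attn.layer_idx may also need renumbering.
model.save_pretrained("danube-26L-step1")  # train the new layers before repeating the step
```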