---
datasets:
- Mihaiii/OpenHermes-2.5-1k-longest-curated
language:
- en
inference: false
library_name: transformers
tags:
- code
---
# NinjaMouse-2.4B-32L-danube-exl2
Original model: [NinjaMouse-2.4B-32L-danube](https://huggingface.co/trollek/NinjaMouse-2.4B-32L-danube)

Model creator: [trollek](https://huggingface.co/trollek)

## Quants
- [4bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/main)
- [4.25bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/4.25bpw-h6)
- [4.65bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/4.65bpw-h6)
- [5bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/5bpw-h6)
- [6bpw h6](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/6bpw-h6)
- [8bpw h8](https://huggingface.co/cgus/NinjaMouse-2.4B-32L-danube-exl2/tree/8bpw-h8)
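
Each bitrate lives on its own branch of the repo, so downloading a specific quant means pinning that revision. A minimal sketch with `huggingface_hub` (the `local_dir` name is just an example):

```python
from huggingface_hub import snapshot_download

# Each quant is a branch; pass it as the revision to download just that one.
snapshot_download(
    repo_id="cgus/NinjaMouse-2.4B-32L-danube-exl2",
    revision="4.25bpw-h6",
    local_dir="NinjaMouse-2.4B-32L-danube-exl2-4.25bpw",
)
```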

## Quantization notes
Made with exllamav2 0.0.15 using the default calibration dataset. I'm very unsure about this model's context size:
for me it breaks down past 3000 tokens of context, at about 3500 or so, both with these quants and the creator's GGUF files.
At first I thought I had a quantization issue, but it's probably just the model itself.
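
For reference, quants like these are produced with exllamav2's `convert.py` script. The sketch below shows the kind of invocation I'd expect with the ~0.0.15-era flags (`-i` input model, `-o` working dir, `-cf` compiled output, `-b` bits per weight, `-hb` head bits, the "h6"/"h8" in the branch names); treat the paths, bitrate, and exact flags as assumptions and check the exllamav2 repo:

```python
import subprocess

# Hedged sketch: run exllamav2's convert.py from inside its repo checkout.
# Paths and the 4.25bpw/h6 bitrate choice are placeholders.
subprocess.run(
    [
        "python", "convert.py",
        "-i", "NinjaMouse-2.4B-32L-danube",        # original FP16 model directory
        "-o", "exl2-work",                         # scratch/working directory
        "-cf", "NinjaMouse-2.4B-32L-danube-exl2",  # output directory for the quant
        "-b", "4.25",                              # target bits per weight
        "-hb", "6",                                # bits for the head layer
    ],
    check=True,
)
```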

## How to run
This quantization method runs on the GPU and requires the ExLlamaV2 loader, which is available in the following applications:

[Text Generation Webui](https://github.com/oobabooga/text-generation-webui)

[KoboldAI](https://github.com/henk717/KoboldAI)

[ExUI](https://github.com/turboderp/exui)
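
Any of the front ends above will load these quants directly. If you'd rather script it, here is a minimal sketch against the exllamav2 Python API as it looked around 0.0.15 (the prompt and sampler settings are arbitrary, and `model_dir` points at wherever you downloaded a quant):

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "NinjaMouse-2.4B-32L-danube-exl2"  # path to a downloaded quant
config.prepare()
config.max_seq_len = 3000  # stay under the ~3000-3500 breakdown point noted above

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocate the cache as layers load
model.load_autosplit(cache)               # split across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

print(generator.generate_simple("Write a haiku about ninjas:", settings, num_tokens=128))
```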

# Original model card
# Model Card for NinjaMouse-2.4B-32L-danube

A lanky version of [h2o-danube](https://huggingface.co/h2oai/h2o-danube-1.8b-chat)'s tiny language model, stretched from 24 layers to 32. I have done this in steps, adding 2 new layers per step and training them on different datasets. This seems to have made it a quick learner, and it easily fits on an 8GB GPU for finetuning, using Unsloth for optimizations. This model is designed to be a gateway into bigger language models.