macadeliccc committed
Commit dbddb2d
1 Parent(s): 8b881f7

Update README.md

Files changed (1)
  1. README.md +16 -5
README.md CHANGED
@@ -6,23 +6,34 @@ library_name: transformers

 ![laser_dolphin_image](./dolphin_moe.png)

- **New Version will be uploaded soon**
+ **New Version out now!**

 Credit to Fernando Fernandes and Eric Hartford for their project [laserRMT](https://github.com/cognitivecomputations/laserRMT)

+ ## Overview
+
 This model is a medium-sized MoE implementation based on [cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser)

- A 2x7b configuration offers better performance than a standard 7b model, even when loaded in 4-bit (~9 GB VRAM).

- Loaded in 4-bit, this 2x7b model scores 0.8270 on HellaSwag, which is higher than the base model achieves on its own in full precision.
+ ## Process
+
+ + The process is outlined in this [notebook](https://github.com/cognitivecomputations/laserRMT/blob/main/examples/laser-dolphin-mixtral-2x7b.ipynb)
+
+ + The mergekit_config is in the files.

- The process is outlined in this [notebook](https://github.com/cognitivecomputations/laserRMT/blob/main/examples/laser-dolphin-mixtral-2x7b.ipynb)
+ + The models used in the configuration are not lasered, but the final product is. This is an update from the last version.
+
+ + This process is experimental. Your mileage may vary.
+
+ ## Quantizations

 **These quants will result in unpredictable behavior; I am working on new quants now that the model has been updated**

 Quantizations provided by [TheBloke](https://huggingface.co/TheBloke/laser-dolphin-mixtral-2x7b-dpo-GGUF)

-
+ *Current [Quantizations](https://huggingface.co/macadeliccc/laser-dolphin-mixtral-2x7b-GGUF)*
+ - Q4_K_M
+ - Q5_K_M

 ## Code Example
 Switch to the commented model definition to run in 4-bit. It should work within 9 GB of VRAM and still exceed the single 7B model by roughly 5-6 points.
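
The example block itself falls outside this hunk. Below is a minimal sketch of the two definitions described above; the repo id `macadeliccc/laser-dolphin-mixtral-2x7b-dpo` is inferred from TheBloke's quant link and is an assumption, not something the diff confirms:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "macadeliccc/laser-dolphin-mixtral-2x7b-dpo"  # assumed repo id

# Full-precision definition -- comment this out to run in 4-bit:
# model = AutoModelForCausalLM.from_pretrained(
#     model_id, torch_dtype=torch.bfloat16, device_map="auto"
# )

# 4-bit definition -- should fit in roughly 9 GB of VRAM:
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("What is a mixture of experts model?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Dolphin models are ChatML-tuned, so wrapping the prompt in the ChatML template will generally give better results than the bare string used here.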
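
The Process section notes that the mergekit_config ships with the repo files. A quick way to pull it down and inspect it; both the repo id and the filename `mergekit_config.yml` are assumptions (the latter is what mergekit-produced repos commonly contain), so check the file listing if either differs:

```python
from huggingface_hub import hf_hub_download

# Repo id and filename are assumptions -- verify against the repo's file list.
config_path = hf_hub_download(
    repo_id="macadeliccc/laser-dolphin-mixtral-2x7b-dpo",
    filename="mergekit_config.yml",
)
with open(config_path) as f:
    print(f.read())
```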
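
For the Q4_K_M / Q5_K_M GGUF quants linked in the Quantizations section, here is a sketch using llama-cpp-python; the exact .gguf filename inside the quant repo is an assumption:

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama  # pip install llama-cpp-python

# The filename is an assumption -- list the repo files for the exact name.
gguf_path = hf_hub_download(
    repo_id="macadeliccc/laser-dolphin-mixtral-2x7b-GGUF",
    filename="laser-dolphin-mixtral-2x7b.Q4_K_M.gguf",
)

llm = Llama(model_path=gguf_path, n_ctx=4096)
result = llm("What is a mixture of experts model?", max_tokens=128)
print(result["choices"][0]["text"])
```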