macadeliccc committed
Commit 38bcb59
1 Parent(s): 269a2cb

Update README.md

Files changed (1):
  README.md +8 -0
@@ -45,6 +45,14 @@ Quantizations provided by [TheBloke](https://huggingface.co/TheBloke/laser-dolphi
 
 *Current AWQ [Quantizations](https://huggingface.co/macadeliccc/laser-dolphin-mixtral-2x7b-dpo-AWQ)
 
+## ExLlamaV2
+
+Thanks to user [bartowski](https://huggingface.co/bartowski), we now have ExLlamaV2 quantizations from 3.5 through 8.0 bpw. They are available here:
+
++ [bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2](https://huggingface.co/bartowski/laser-dolphin-mixtral-2x7b-dpo-exl2)
+
+His quantizations are the first for a ~13B model with GQA support. Check out his repo for more information!
+
 ## HF Spaces
 + GGUF chat available [here](https://huggingface.co/spaces/macadeliccc/laser-dolphin-mixtral-chat-GGUF)
 + 4-bit bnb chat available [here](https://huggingface.co/spaces/macadeliccc/laser-dolphin-mixtral-chat)