Crystalcareai committed on
Commit 314568f
1 Parent(s): 550faaf

Update README.md

Files changed (1)
  1. README.md +12 -3
README.md CHANGED
@@ -1,3 +1,12 @@
- <p align="center">
- <img src="https://huggingface.co/Crystalcareai/LlaMoE-Medium/resolve/main/resources/ddb-nye2T3C3vZwJJm1l6A.png" width="350" title="hover text">
- </p>
+ <p align="center"> <img src="https://huggingface.co/Crystalcareai/LlaMoE-Medium/resolve/main/resources/ddb-nye2T3C3vZwJJm1l6A.png" width="350" title="LlaMoE-Medium model image"> </p>
+
+ This is a 4x8B Llama Mixture of Experts (MoE) model, trained on the OpenHermes Resort subset of the Dolphin-2.9 dataset.
+
+ The model is a combination of four Llama fine-tunes using the DeepSpeed-MoE architecture. All experts are active for every token.
+
+ This is a VERY good model, somewhere between Llama 8B and Llama 70B in capability. Enjoy!
+
+ Thank you to:
+
+ - CrusoeEnergy, for sponsoring the compute for this project
+ - My collaborators Eric Hartford and Fernando (has too many names) Neto
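For context on the new card, a minimal usage sketch, assuming the checkpoint loads through the standard `transformers` causal-LM API. The repo id is inferred from the image URL in the README, and `trust_remote_code=True` is an assumption in case the DeepSpeed-MoE architecture ships custom modeling code; this is not an official snippet from the repo.

```python
# Minimal loading sketch (assumptions noted inline), not the repo's official code.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Crystalcareai/LlaMoE-Medium"  # inferred from the image URL above

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,  # assumption: custom DeepSpeed-MoE modeling code may ship with the repo
    device_map="auto",       # a 4x8B dense MoE is large; shard it across available GPUs
)

prompt = "Explain what a Mixture of Experts model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```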
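To illustrate the "all experts are active for every token" claim, here is a toy sketch of a dense MoE feed-forward layer in which a soft gate weights every expert's output instead of routing each token to a top-k subset. All names here are my own and this is not the model's actual DeepSpeed-MoE implementation; it only shows the dense-routing idea.

```python
# Toy dense-MoE block: every expert processes every token (no top-k routing).
import torch
import torch.nn as nn

class DenseMoE(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.gate = nn.Linear(d_model, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Soft gating: per-token weights over experts, but no expert is skipped.
        weights = torch.softmax(self.gate(x), dim=-1)               # (batch, seq, n_experts)
        outs = torch.stack([expert(x) for expert in self.experts])  # (n_experts, batch, seq, d_model)
        # Weighted sum over ALL experts at every token position.
        return torch.einsum("ebsd,bse->bsd", outs, weights)

# Quick shape check
layer = DenseMoE(d_model=64, d_ff=256, n_experts=4)
y = layer(torch.randn(2, 16, 64))
print(y.shape)  # torch.Size([2, 16, 64])
```

Compared with top-k routing, this dense formulation trades the usual MoE inference savings for a simpler, always-on combination of the four fine-tuned experts.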