PowerInfer/TurboSparse-Mixtral-Instruct

yixinsong committed on Jun 7

Commit 3ab53ca • 1 parent: 6ee8174

Update README.md

Files changed (1)

README.md +5 -0

@@ -27,6 +27,11 @@ We take ChatML as our chat template:
 As we merged the predictors for FFN neurons in models, you can finetune TurboSparse-Mixtral with any framework and algorithm.
+## Limitations
+* TurboSparse, having just undergone training with 150B tokens, may still exhibit performance gaps in certain tasks.
+* The TurboSparse model has only been trained on English-language datasets, hence its capabilities in other languages are still lacking.
+* The model may produce unexpected outputs due to its small size and probabilistic generation paradigm.
 ## License
 The model is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage.