Update README.md
README.md CHANGED
@@ -27,6 +27,11 @@ We take ChatML as our chat template:
 
 As we merged the predictors for FFN neurons in models, you can finetune TurboSparse-Mixtral with any framework and algorithm.
 
+## Limitations
+* TurboSparse, having just undergone training with 150B tokens, may still exhibit performance gaps in certain tasks.
+* The TurboSparse model has only been trained on English-language datasets, hence its capabilities in other languages are still lacking.
+* The model may produce unexpected outputs due to its small size and probabilistic generation paradigm.
+
 ## License
 
 The model is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage.