Commit e3fca02 (1 parent: 3b9e730), committed by jupyterjazz
readme
Signed-off-by: jupyterjazz <[email protected]>
README.md CHANGED
@@ -1,7 +1,12 @@
 Core implementation of Jina XLM-RoBERTa
 
-
+This implementation is adapted from [XLM-Roberta](https://huggingface.co/docs/transformers/en/model_doc/xlm-roberta). In contrast to the original implementation, this model uses Rotary positional encodings and supports flash-attention 2.
 
-
-
-
+### Models that use this implementation
+
+to be added soon
+
+
+### Converting weights
+
+Weights from an [original XLMRoberta model](https://huggingface.co/FacebookAI/xlm-roberta-large) can be converted using the `convert_roberta_weights_to_flash.py` script in the model repository.
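The README text added in this commit says the model uses rotary positional encodings. As a rough illustration of that idea only, here is a minimal pure-Python sketch. It is not taken from this repository: the function name `rotary_embed`, the `base` value of 10000, and the pairing of feature `i` with feature `i + half` are all assumptions made for the example. It rotates each feature pair by an angle proportional to the token position, so the dot product of two rotated vectors depends only on the distance between their positions.

```python
import math


def rotary_embed(vec, position, base=10000.0):
    """Rotate one feature vector by angles that depend on its position.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("feature dimension must be even")
    half = dim // 2
    out = [0.0] * dim
    for i in range(half):
        # Each feature pair (i, i + half) gets its own rotation frequency.
        angle = position * base ** (-i / half)
        c, s = math.cos(angle), math.sin(angle)
        a, b = vec[i], vec[i + half]
        # Standard 2D rotation of the pair (a, b) by `angle`.
        out[i] = a * c - b * s
        out[i + half] = a * s + b * c
    return out


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]

# Both pairs of positions are 2 apart, so the two scores should match.
score_a = dot(rotary_embed(q, 2), rotary_embed(k, 0))
score_b = dot(rotary_embed(q, 7), rotary_embed(k, 5))
```

Because the rotation encodes position directly in the query and key vectors, attention scores computed this way carry relative-position information without a separate learned position table, which is the usual motivation for this scheme.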