Update README.md
README.md
CHANGED
@@ -11,7 +11,6 @@ This model adapts T5 on the Arabic Language by pre-training T5 on :
|
|
11 |
|
12 |
The total size of the corpora is 17GB. This model uses an efficient implementation of T5, which reduces the fine-tuning cost and memory used [Link](https://arxiv.org/abs/2109.10686), and uses T5x for pre-training [Link](https://github.com/google-research/t5x).
|
13 |
|
14 |
-
|
15 |
## Pre-training Settings and Results on TyDi QA Development Dataset (model in this card is highlighted in bold)
|
16 |
|
17 |
| Model | Hidden Layer | Atten. head | Atten. Layers | Vocab | Hardware |Training Steps | Batch | Train x Batch Factor |Corpora |
|
@@ -49,10 +48,13 @@ You can download the full details of our grid search for all models in all tasks
|
|
49 |
|
50 |
For the XL-Sum task, we choose the best run for each model using the eval set. We use the official XL-Sum evaluation script, which applies a stemmer function, so our results may look better than those in papers that do not use a stemmer. The official XL-Sum paper also uses a stemmer function.
|
51 |
|
52 |
|
53 |
-
# Continual Pre-Training of ArabicT5 with T5x
|
54 |
-
If you want to continue pre-training ArabicT5 on your own data, we have uploaded the raw T5x checkpoint here: https://huggingface.co/sultan/ArabicT5-49GB-base/blob/main/arabict5_49GB_base_t5x.tar.gz
|
55 |
-
We will soon share a tutorial on how to do that for free with a Kaggle TPU.
|
56 |
|
57 |
|
58 |
|
@@ -62,6 +64,12 @@ We will soon share a tutorial on how you can do that for free with Kaggle TPU
|
|
62 |
|
63 |
[COLAB]: https://colab.research.google.com/assets/colab-badge.svg
|
64 |
|
65 |
## GitHub Page
|
66 |
|
67 |
https://github.com/salrowili/ArabicT5
|
|
|
11 |
|
12 |
The total size of the corpora is 17GB. This model uses an efficient implementation of T5, which reduces the fine-tuning cost and memory used [Link](https://arxiv.org/abs/2109.10686), and uses T5x for pre-training [Link](https://github.com/google-research/t5x).
|
13 |
|
|
|
14 |
## Pre-training Settings and Results on TyDi QA Development Dataset (model in this card is highlighted in bold)
|
15 |
|
16 |
| Model | Hidden Layer | Atten. head | Atten. Layers | Vocab | Hardware |Training Steps | Batch | Train x Batch Factor |Corpora |
|
|
|
48 |
|
49 |
For the XL-Sum task, we choose the best run for each model using the eval set. We use the official XL-Sum evaluation script, which applies a stemmer function, so our results may look better than those in papers that do not use a stemmer. The official XL-Sum paper also uses a stemmer function.
|
50 |
|
51 |
+
# Fine-Tuning our efficient ArabicT5-49GB-Small model with Torch on a 3070 laptop GPU
|
52 |
+
|
53 |
+
If you are running your code on a laptop GPU (e.g., a gaming laptop) or with limited GPU memory, we recommend using our ArabicT5-49GB-Small model, which was the only model from the list that we were able to run on a 3070 laptop GPU with a batch size of 8. We achieved an F1 score of 85.391 on the TyDi QA task (slightly better than with our FLAX code). See the notebook below for reference:
|
54 |
+
|
55 |
+
[![Open In Colab][COLAB]](https://colab.research.google.com/github/salrowili/ArabicT5/blob/main/ArabicT5_49GB_Small_on_3070_Laptop_GPU.ipynb)
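
As a rough companion to the notebook linked above, the sketch below shows one way the small model might be loaded with Torch and fed question-answer pairs in batches of 8. It is not taken from the notebook. The repo id `sultan/ArabicT5-49GB-small` is an assumption inferred from the naming of the base checkpoint, and the `question: ... context: ...` prompt format is the common T5 convention rather than something this card confirms. The helper names are hypothetical.

```python
from typing import Iterator, List, Sequence

# Assumption: repo id guessed from the naming of the base checkpoint; verify it on the Hub.
MODEL_ID = "sultan/ArabicT5-49GB-small"
BATCH_SIZE = 8  # batch size reported above as fitting on the 3070 laptop GPU


def format_qa_input(question: str, context: str) -> str:
    """Cast a QA pair into a T5-style text-to-text prompt (assumed format)."""
    return f"question: {question.strip()} context: {context.strip()}"


def iter_batches(items: Sequence, batch_size: int = BATCH_SIZE) -> Iterator[List]:
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model, placing the model on the GPU when one is available."""
    # Imported lazily so the helpers above can be used without torch installed.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id).to(device)
    return tokenizer, model, device
```

Calling `load_model()` downloads the weights, so it needs network access plus the `torch` and `transformers` packages. The two helpers above it have no dependencies.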
|
56 |
+
|
57 |
|
58 |
|
59 |
|
60 |
|
|
|
64 |
|
65 |
[COLAB]: https://colab.research.google.com/assets/colab-badge.svg
|
66 |
|
67 |
+
|
68 |
+
# Continual Pre-Training of ArabicT5 with T5x
|
69 |
+
If you want to continue pre-training ArabicT5 on your own data, we have uploaded the raw T5x checkpoint here: https://huggingface.co/sultan/ArabicT5-49GB-base/blob/main/arabict5_49GB_base_t5x.tar.gz
|
70 |
+
We will soon share a tutorial on how to do that for free with a Kaggle TPU.
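
Until that tutorial is available, here is a minimal sketch of fetching and unpacking the checkpoint archive linked above. It only covers the download step, not T5x training itself. The function names and the output directory `arabict5_t5x` are hypothetical, and the layout of the archive's contents is not documented here, so inspect the extracted files before pointing T5x at them.

```python
import tarfile
from pathlib import Path
from typing import Tuple
from urllib.parse import urlparse

# Checkpoint link as given in this README.
CHECKPOINT_URL = (
    "https://huggingface.co/sultan/ArabicT5-49GB-base/"
    "blob/main/arabict5_49GB_base_t5x.tar.gz"
)


def split_hub_url(url: str) -> Tuple[str, str, str]:
    """Split a Hub file URL into (repo_id, revision, filename)."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected layout: <user>/<repo>/blob/<revision>/<path inside the repo>
    if len(parts) < 5 or parts[2] not in ("blob", "resolve"):
        raise ValueError(f"not a Hub file URL: {url}")
    return "/".join(parts[:2]), parts[3], "/".join(parts[4:])


def fetch_checkpoint(url: str = CHECKPOINT_URL, out_dir: str = "arabict5_t5x") -> Path:
    """Download the archive via the Hub cache and extract it into `out_dir`."""
    # Imported lazily; requires `pip install huggingface_hub`.
    from huggingface_hub import hf_hub_download

    repo_id, revision, filename = split_hub_url(url)
    archive = hf_hub_download(repo_id=repo_id, filename=filename, revision=revision)
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(target)
    return target
```

Parsing the `blob` URL into a repo id and filename lets `hf_hub_download` resolve the actual file, since the `blob` link itself points at the web viewer rather than the raw archive.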
|
71 |
+
|
72 |
+
|
73 |
## GitHub Page
|
74 |
|
75 |
https://github.com/salrowili/ArabicT5
|