sultan committed
Commit 15a9455
1 parent: d3eaa97

Update README.md

Files changed (1)
  1. README.md +12 -4
README.md CHANGED
@@ -11,7 +11,6 @@ This model adapts T5 on the Arabic Language by pre-training T5 on :
 
 Total Corpora size is 17GB. This model uses an efficient implementation of T5 that reduces fine-tuning time and memory usage [Link](https://arxiv.org/abs/2109.10686) and uses T5x for pre-training [Link](https://github.com/google-research/t5x).
 
-
 ## Pre-training Settings and Results on TyDi QA Development Dataset (the model in this card is highlighted in bold)
 
 | Model | Hidden Layer | Atten. head | Atten. Layers | Vocab | Hardware | Training Steps | Batch | Train x Batch Factor | Corpora |
@@ -49,10 +48,13 @@ You can download the full details of our grid search for all models in all tasks
 
 For the XL-Sum task, we choose our best run for each model using the eval set. We use the official evaluation script from XL-Sum, which applies a stemmer function; this may show better results than papers that do not use the stemmer. The official XL-Sum paper does use a stemmer function.
 
- # Continual Pre-Training of ArabicT5 with T5x
- If you want to continue pre-training ArabicT5 on your own data, we have uploaded the raw T5x checkpoint to this link: https://huggingface.co/sultan/ArabicT5-49GB-base/blob/main/arabict5_49GB_base_t5x.tar.gz
- We will soon share a tutorial on how you can do that for free with a Kaggle TPU.
+ # Fine-tuning our efficient ArabicT5-49GB-Small model with PyTorch on a 3070 laptop GPU
+
+ If you are running your code on a laptop GPU (e.g., a gaming laptop) or have limited GPU memory, we recommend using our ArabicT5-49GB-Small model, which was the only model from the list that we were able to run on a 3070 laptop card with a batch size of 8. We managed to achieve an F1 score of 85.391 (slightly better than our FLAX code) on the TyDi QA task. See the notebook below for reference:
+
+ [![Open In Colab][COLAB]](https://colab.research.google.com/github/salrowili/ArabicT5/blob/main/ArabicT5_49GB_Small_on_3070_Laptop_GPU.ipynb)
+
 
 
@@ -62,6 +64,12 @@ We will soon share a tutorial on how you can do that for free with Kaggle TPU
 
 [COLAB]: https://colab.research.google.com/assets/colab-badge.svg
 
+
+ # Continual Pre-Training of ArabicT5 with T5x
+ If you want to continue pre-training ArabicT5 on your own data, we have uploaded the raw T5x checkpoint to this link: https://huggingface.co/sultan/ArabicT5-49GB-base/blob/main/arabict5_49GB_base_t5x.tar.gz
+ We will soon share a tutorial on how you can do that for free with a Kaggle TPU.
+
+
 ## GitHub Page
 
 https://github.com/salrowili/ArabicT5
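The XL-Sum note in the diff above points out that the official evaluation script applies a stemmer, which affects the reported ROUGE numbers. Below is a minimal sketch of that effect using the standard `rouge_score` package; the official XL-Sum script ships its own multilingual ROUGE with an Arabic stemmer, so this English toy example is only meant to show why the toggle changes the scores.

```python
# Illustrative only: enabling the stemmer merges inflected forms ("cats"/"cat",
# "running"/"runs") before overlap is counted, so ROUGE goes up.
# The official XL-Sum evaluation uses a multilingual ROUGE with an Arabic stemmer.
from rouge_score import rouge_scorer

prediction = "the cats were running across the fields"
reference = "a cat runs across the field"

for use_stemmer in (False, True):
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeLsum"],
                                      use_stemmer=use_stemmer)
    scores = scorer.score(reference, prediction)
    print(f"use_stemmer={use_stemmer}: rouge1={scores['rouge1'].fmeasure:.3f}")
```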
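The new fine-tuning section links to a Colab notebook with the full recipe. For rough orientation only (not taken from that notebook), here is a sketch of a single PyTorch fine-tuning step for generative TyDi QA with the batch size of 8 mentioned above, assuming the checkpoint is published on the Hub as `sultan/ArabicT5-49GB-Small`.

```python
# Minimal sketch of one fine-tuning step for generative question answering
# (TyDi QA style input: "question: ... context: ..." -> answer text).
# The Hub id and the prompt format are assumptions; adjust to the actual notebook.
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "sultan/ArabicT5-49GB-Small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# One toy batch of 8 (the batch size reported for the 3070 laptop GPU).
questions = ["question: من كتب النص؟ context: كتب النص سلطان."] * 8
answers = ["سلطان"] * 8

inputs = tokenizer(questions, padding=True, truncation=True, max_length=512,
                   return_tensors="pt").to(device)
labels = tokenizer(answers, padding=True, truncation=True, max_length=32,
                   return_tensors="pt").input_ids.to(device)
labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss

model.train()
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"loss: {loss.item():.4f}")
```

On a memory-limited laptop card, gradient accumulation or mixed precision can stretch the same 8-example batch further; the linked notebook is the reference for the exact settings behind the 85.391 F1 figure.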
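The continual pre-training note distributes the raw T5x checkpoint as a tarball on the model repo. A minimal sketch, assuming the `huggingface_hub` client, for downloading and unpacking it before pointing a T5x run at the extracted directory (the promised Kaggle TPU tutorial is still to come, so the training invocation itself is not shown here).

```python
# Download and unpack the raw T5x checkpoint referenced above.
# The extracted directory can then serve as the starting checkpoint for a
# T5x training run (see https://github.com/google-research/t5x).
import tarfile
from huggingface_hub import hf_hub_download

archive = hf_hub_download(
    repo_id="sultan/ArabicT5-49GB-base",
    filename="arabict5_49GB_base_t5x.tar.gz",
)

with tarfile.open(archive, "r:gz") as tar:
    tar.extractall("arabict5_49GB_base_t5x")  # T5x checkpoint directory

print("checkpoint extracted to ./arabict5_49GB_base_t5x")
```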