sultan
/

ArabicT5-17GB-large

Text2Text Generation

Transformers

PyTorch

text-generation-inference

Inference Endpoints


sultan committed on Nov 4, 2023

Commit

15a9455

•

1 Parent(s): d3eaa97

Update README.md


Files changed (1)

README.md +12 -4


@@ -11,7 +11,6 @@ This model adapts T5 on the Arabic Language by pre-training T5 on :
 Total Corpora size is 17GB. This model uses an efficient implementation of T5 which reduces the fine-tuning and memory used [Link](https://arxiv.org/abs/2109.10686) and uses T5x for pre-training [Link](https://github.com/google-research/t5x)
 ## Pre-training Settings and Results on TyDi QA Development Dataset ( Model in this card is highlighted in bold )
 |     Model        | Hidden Layer | Atten. head | Atten. Layers | Vocab | Hardware  |Training Steps | Batch  |  Train x Batch Factor |Corpora                 |
@@ -49,10 +48,13 @@ You can download the full details of our grid search for all models in all tasks
 For the XL-Sum task, we choose our best run for each model using the eval set. We use the official evaluation script from XL-Sum, which uses the stemmer function, which may show better results than papers that don't use the stemmer function. The official XL-Sum paper uses a stemmer function.
-# Continual Pre-Training of ArabicT5 with T5x
-If you want to continue pre-training ArabicT5 on your own data, we have uploaded the raw T5x checkpoint here: https://huggingface.co/sultan/ArabicT5-49GB-base/blob/main/arabict5_49GB_base_t5x.tar.gz
-We will soon share a tutorial on how you can do that for free with a Kaggle TPU.
@@ -62,6 +64,12 @@ We will soon share a tutorial on how you can do that for free with Kaggle TPU
 [COLAB]: https://colab.research.google.com/assets/colab-badge.svg
 ## GitHub Page
 https://github.com/salrowili/ArabicT5

 Total Corpora size is 17GB. This model uses an efficient implementation of T5 which reduces the fine-tuning and memory used [Link](https://arxiv.org/abs/2109.10686) and uses T5x for pre-training [Link](https://github.com/google-research/t5x)
 ## Pre-training Settings and Results on TyDi QA Development Dataset ( Model in this card is highlighted in bold )
 |     Model        | Hidden Layer | Atten. head | Atten. Layers | Vocab | Hardware  |Training Steps | Batch  |  Train x Batch Factor |Corpora                 |
 For the XL-Sum task, we choose our best run for each model using the eval set. We use the official evaluation script from XL-Sum, which uses the stemmer function, which may show better results than papers that don't use the stemmer function. The official XL-Sum paper uses a stemmer function.
+# Fine-Tuning our efficient ArabicT5-49GB-Small model with Torch on a 3070 laptop GPU
+If you are running your code on a laptop GPU (e.g., a gaming laptop) or with limited GPU memory, we recommend using our ArabicT5-49GB-Small model, which was the only model from the list that we were able to run on a 3070 laptop GPU with a batch size of 8. We managed to achieve an F1 score of 85.391 (slightly better than with our FLAX code) on the TyDi QA task. See the notebook below for reference:
+[![Open In Colab][COLAB]](https://colab.research.google.com/github/salrowili/ArabicT5/blob/main/ArabicT5_49GB_Small_on_3070_Laptop_GPU.ipynb)
 [COLAB]: https://colab.research.google.com/assets/colab-badge.svg
+# Continual Pre-Training of ArabicT5 with T5x
+If you want to continue pre-training ArabicT5 on your own data, we have uploaded the raw T5x checkpoint here: https://huggingface.co/sultan/ArabicT5-49GB-base/blob/main/arabict5_49GB_base_t5x.tar.gz
+We will soon share a tutorial on how you can do that for free with a Kaggle TPU.
 ## GitHub Page
 https://github.com/salrowili/ArabicT5
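
The checkpoint link in the README is a `/blob/` page URL, which opens the file viewer rather than the archive itself. For a scripted download, the Hugging Face Hub serves the raw file under `/resolve/` at the same path. The sketch below only rewrites the URL and wraps a plain standard-library download; the helper names `blob_to_resolve_url` and `download_checkpoint` are hypothetical and are not part of the ArabicT5 repository or of T5x.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlretrieve

# Page URL exactly as given in the README.
CHECKPOINT_PAGE = (
    "https://huggingface.co/sultan/ArabicT5-49GB-base/"
    "blob/main/arabict5_49GB_base_t5x.tar.gz"
)


def blob_to_resolve_url(page_url: str) -> str:
    """Turn a Hub file-viewer URL (/blob/) into a direct-download URL (/resolve/)."""
    parts = urlsplit(page_url)
    # Expected path layout: /<user>/<repo>/blob/<revision>/<file path...>
    segments = parts.path.split("/")
    if len(segments) < 6 or segments[3] != "blob":
        raise ValueError(f"not a Hub /blob/ URL: {page_url}")
    segments[3] = "resolve"
    return urlunsplit(parts._replace(path="/".join(segments)))


def download_checkpoint(dest: str = "arabict5_49GB_base_t5x.tar.gz") -> str:
    """Download the raw T5x checkpoint archive to `dest` and return the local path."""
    # Assumption: the file is public, so no authentication token is needed.
    local_path, _headers = urlretrieve(blob_to_resolve_url(CHECKPOINT_PAGE), dest)
    return local_path
```

Calling `download_checkpoint()` fetches the full archive, so it is left uncalled here; the archive still has to be unpacked and pointed to from your own T5x configuration.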