sultan committed on
Commit
44fa119
1 Parent(s): 2a61879

Update README.md

  1. README.md +17 -13
README.md CHANGED
@@ -20,19 +20,23 @@ This model adapt T5 on Arabic Language by pre-training T5 on ArabicWikipedia, Ma
 
  ## Results on TyDi QA, HARD, Sentiment Analysis, Sarcasm Detection ( Best Score is highlighted in bold )
 
- | Model | <center>TyDi QA (Dev) | <center>HARD (Hotel Review) | <center>ArSarcasm-v2 (Sentiment Analysis) | <center>ArSarcasm-v2 (Sarcasm Detection) |
- |----------------------|---------------|---------------------|-------------------------------------|----------------------------------|
- | AraT5-Base | <center>70.36/84.21 |<center>96.49|<center>69.7/72.63|<center>60.44|
- | AraT5-Base-MSA | <center>70.90/84.00 |<center>**96.52**|<center>70.03/72.73|<center>60.69|
- | AraT5-Base-Tweets | <center>65.14/79.00 |<center>96.26|<center>70.67/73.52|<center>61.11|
- | mT5-Base | <center>72.20/84.13 |<center>96.24|<center>67.33/68.78|<center>52.18|
- | ArabicT5-Base | <center>70.79/84.76 |<center>96.36|<center>68.93/71.20|<center>58.93|
- | ArabicT5-Large | <center>73.29/86.08 |<center>96.40|<center>70.4/73.01|<center>59.79|
- | ArabicT5-xLarge | <center>**75.46/87.12** |<center>96.50| <center>**72.23/75.17**|<center>**61.66**|
-
- Evaluation Metrics: TyDi QA (EM/F1), HARD (Accuracy), Sentiment Analysis (Accuracy / F1-PN positive-negative), Sarcasm Detection (F1-sarcastic).
-
- You can download the full details of our grid search for all models in all tasks above from this link : https://drive.google.com/file/d/1yR5sxwYZL-ugHGjeW0ueaiOIzZiOSPad/view?usp=sharing
+ | Model | <center>TyDi QA | <center>HARD | <center>ArSarcasm-v2-Sentiment | <center>ArSarcasm-v2-Sarcasm | <center>XL-Sum |
+ |----------------------|---------------|---------------------|-------------------------------------|----------------------------------|----------------------------------|
+ | AraT5-Base | <center>70.36/84.21 |<center>96.49|<center>69.7/72.63|<center>60.44|<center>30.31|
+ | AraT5-Base-MSA | <center>70.90/84.00 |<center>**96.52**|<center>70.03/72.73|<center>60.69|<center>27.36|
+ | AraT5-Base-Tweets | <center>65.14/79.00 |<center>96.26|<center>70.67/73.52|<center>61.11|<center>25.08|
+ | mT5-Base | <center>72.20/84.13 |<center>96.24|<center>67.33/68.78|<center>52.18|<center>25.68|
+ | ArabicT5-Base | <center>70.79/84.76 |<center>96.36|<center>68.93/71.20|<center>58.93|<center>29.19|
+ | ArabicT5-Large | <center>73.29/86.08 |<center>96.40|<center>70.4/73.01|<center>59.79|<center>30.30|
+ | ArabicT5-xLarge | <center>**75.46/87.12** |<center>96.50| <center>**72.23/75.17**|<center>**61.66**|<center>**31.70**|
+
+ Evaluation Metrics: TyDi QA (EM/F1), HARD (Accuracy), Sentiment Analysis (Accuracy / F1-PN positive-negative), Sarcasm Detection (F1-sarcastic), XL-Sum (ROUGE-L with stemmer).
+
+ You can download the full details of our grid search for all models in all tasks above from this link: https://github.com/salrowili/ArabicT5/raw/main/ArabicT5_Grid_Search.zip
+
+ For the XL-Sum task, we select the best run for each model on the eval set. We use the official XL-Sum evaluation script, which applies a stemmer, as the XL-Sum paper does; scores may therefore be higher than those in papers that do not apply a stemmer.
+
+ In our XL-Sum results, AraT5-Base (30.31) slightly exceeds our ArabicT5-Large (30.30); however, ArabicT5-Large scores higher in most runs, as the grid search file shows.
 
  # Speedup Results
 
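The XL-Sum note added in this diff says that scoring with a stemmer can give higher numbers than scoring without one. The sketch below illustrates why. It is not the official XL-Sum evaluation script: `rouge_l_f1` is a minimal LCS-based ROUGE-L F1 written for illustration, whitespace tokenization is an assumption, and `toy_stemmer` is a hypothetical stand-in that only strips the Arabic definite article, not the stemmer the official script uses.

```python
from typing import Callable, List, Optional


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(
    reference: str,
    candidate: str,
    stemmer: Optional[Callable[[str], str]] = None,
) -> float:
    """Illustrative ROUGE-L F1: harmonic mean of LCS precision and recall."""
    # Assumption: plain whitespace tokenization.
    ref = reference.split()
    cand = candidate.split()
    if stemmer is not None:
        # Stemming happens before matching, so surface variants can align.
        ref = [stemmer(tok) for tok in ref]
        cand = [stemmer(tok) for tok in cand]
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def toy_stemmer(token: str) -> str:
    """Hypothetical stemmer: strips a leading Arabic definite article only."""
    if token.startswith("ال") and len(token) > 3:
        return token[2:]
    return token


reference = "الكتاب جديد"
candidate = "كتاب جديد"

score_plain = rouge_l_f1(reference, candidate)
score_stemmed = rouge_l_f1(reference, candidate, stemmer=toy_stemmer)
print(score_plain, score_stemmed)
```

On this toy pair the unstemmed score counts only one matching token, while the stemmed score treats both tokens as matches. This is the effect to keep in mind when comparing the XL-Sum column against papers that report ROUGE-L without stemming.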