Update README.md
README.md
CHANGED
@@ -1,130 +1,23 @@
 ---
-language:
-- multilingual
-- af
-- am
-- ar
-- az
-- be
-- bg
-- bn
-- ca
-- ceb
-- co
-- cs
-- cy
-- da
-- de
-- el
-- en
-- eo
-- es
-- et
-- eu
+language:
 - fa
-
-- fil
-- fr
-- fy
-- ga
-- gd
-- gl
-- gu
-- ha
-- haw
-- hi
-- hmn
-- ht
-- hu
-- hy
-- ig
-- is
-- it
-- iw
-- ja
-- jv
-- ka
-- kk
-- km
-- kn
-- ko
-- ku
-- ky
-- la
-- lb
-- lo
-- lt
-- lv
-- mg
-- mi
-- mk
-- ml
-- mn
-- mr
-- ms
-- mt
-- my
-- ne
-- nl
-- no
-- ny
-- pa
-- pl
-- ps
-- pt
-- ro
-- ru
-- sd
-- si
-- sk
-- sl
-- sm
-- sn
-- so
-- sq
-- sr
-- st
-- su
-- sv
-- sw
-- ta
-- te
-- tg
-- th
-- tr
-- uk
-- und
-- ur
-- uz
-- vi
-- xh
-- yi
-- yo
-- zh
-- zu
-datasets:
-- mc4
-
-license: apache-2.0
+license: mit
 ---
+***This model is intended for non-commercial use only. If you wish to use it for commercial purposes, please make sure to refer to this LinkedIn address.
 
-
+"Viravirast" is an editor based on transformer algorithms. By visiting the Viravirast.com, you can use a Persian semantic and structural text editor.
 
-mT5 is pretrained on the [mC4](https://www.tensorflow.org/datasets/catalog/c4#c4multilingual) corpus, covering 101 languages:
 
-
+usage
 
-**Note**: mT5 was only pre-trained on mC4 excluding any supervised training. Therefore, this model has to be fine-tuned before it is useable on a downstream task.
 
-Pretraining Dataset: [mC4](https://www.tensorflow.org/datasets/catalog/c4#c4multilingual)
 
-
+n_epochs = 4
 
-
+train_batch_size = 8
 
-
+eval_batch_size = 4
 
+lr = 5e-4
 
-## Abstract
 
-The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We describe the design and modified training of mT5 and demonstrate its state-of-the-art performance on many multilingual benchmarks. All of the code and model checkpoints used in this work are publicly available.
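The four values added under "usage" in the new README (`n_epochs`, `train_batch_size`, `eval_batch_size`, `lr`) read like fine-tuning hyperparameters. The README does not say how they are consumed, so the following is only a minimal sketch of one way to group them. The `TrainConfig` class, the `total_train_steps` helper, and the dataset size of 1000 are hypothetical names and numbers introduced for illustration, not part of the model's code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from the "usage" section of the updated README.
    n_epochs: int = 4
    train_batch_size: int = 8
    eval_batch_size: int = 4
    lr: float = 5e-4


def total_train_steps(cfg: TrainConfig, n_train_examples: int) -> int:
    """Optimizer steps over all epochs.

    Assumes one optimizer step per batch (no gradient accumulation)
    and that the last, possibly smaller, batch of each epoch is kept.
    """
    steps_per_epoch = math.ceil(n_train_examples / cfg.train_batch_size)
    return steps_per_epoch * cfg.n_epochs


if __name__ == "__main__":
    cfg = TrainConfig()
    # 1000 is a made-up dataset size, used only to show the arithmetic.
    print(total_train_steps(cfg, n_train_examples=1000))
```

Keeping the values in a frozen dataclass makes them easy to pass to whichever training loop is used, and a step count like this one is typically what a learning-rate scheduler needs.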