Added links
README.md
CHANGED
@@ -10,13 +10,16 @@ datasets:
 - wiki40b
 ---
 
-# t5-base-japanese-web (with Byte-fallback)
+# t5-base-japanese-web (with Byte-fallback, 32K)
 
 ## Description
 
 [megagonlabs/t5-base-japanese-web](https://huggingface.co/megagonlabs/t5-base-japanese-web) is a T5 (Text-to-Text Transfer Transformer) model pre-trained on Japanese web texts.
 Training code is [available on GitHub](https://github.com/megagonlabs/t5-japanese).
 
+The vocabulary size of this model is 32K.
+[An 8K version is also available](https://huggingface.co/megagonlabs/t5-base-japanese-web-8k).
+
 ### Corpora
 
 We used the following corpora for pre-training.
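As a usage note for the model this hunk describes: the sketch below is a hypothetical example of loading the checkpoint with Hugging Face `transformers`. The model ID is taken from the link above; the class choice (`T5ForConditionalGeneration`) and the demo string are assumptions, not part of the card, and the raw pre-trained checkpoint still needs fine-tuning before generation is useful.

```python
# Hypothetical usage sketch (not from the card): load the pre-trained
# 32K byte-fallback checkpoint and inspect its tokenization.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "megagonlabs/t5-base-japanese-web"  # 32K byte-fallback variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# Byte-fallback tokenization: characters outside the 32K vocabulary are
# decomposed into UTF-8 byte tokens instead of collapsing to <unk>.
ids = tokenizer("こんにちは、世界!", return_tensors="pt").input_ids
print(tokenizer.convert_ids_to_tokens(ids[0]))
```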
@@ -28,7 +31,6 @@ We used following corpora for pre-training.
 - 828,236 articles (2,073,584 examples)
 - 2 GB in TFRecord format
 
-
 ### Tokenizer
 
 We used Japanese Wikipedia to train [SentencePiece](https://github.com/google/sentencepiece).
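As a tokenizer note: the sketch below is a hypothetical reconstruction of how a 32K byte-fallback SentencePiece model might be trained with the `sentencepiece` Python API. The corpus file name, model prefix, and auxiliary flags are assumptions, not the authors' actual command (their training code is in the GitHub repository linked above).

```python
# Hypothetical sketch of byte-fallback SentencePiece training; file names
# and auxiliary flags are assumptions, not the authors' command.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="jawiki_sentences.txt",  # assumed: Japanese Wikipedia, one sentence per line
    model_prefix="t5_ja_web_32k",
    vocab_size=32000,              # the 32K vocabulary this card describes
    byte_fallback=True,            # unknown pieces fall back to UTF-8 byte tokens
    character_coverage=0.9995,     # a common choice for Japanese corpora
)
```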
@@ -52,7 +54,6 @@ It took about 126 hours with TPU v3-8
 
 Apache License 2.0
 
-
 ## Citations
 
 - mC4