yhavinga committed on
Commit
7b1e22d
1 Parent(s): 454bda3

Autoupdate README.md

Files changed (1)
  1. README.md +1 -4
README.md CHANGED
@@ -127,14 +127,11 @@ Therefore, the model can have biased predictions. This bias will also affect all
 The `ul2-base-nl36-en-nl` T5 model was pre-trained simultaneously on a combination of several datasets,
 including the `full` config of the "mc4_nl_cleaned" dataset, which is a cleaned version of Common Crawl's web
 crawl corpus, Dutch books, the Dutch subset of Wikipedia (2022-03-20), and a subset of "mc4_nl_cleaned"
-containing only texts from Dutch and Belgian newspapers. This last dataset is oversampled to bias the model
-towards descriptions of events in the Netherlands and Belgium.
+containing only texts from Dutch newspapers.
 
 After pre-training, the model was
 fine-tuned on a translation dataset containing 13 million sentence and paragraph pairs
 sampled from books.
-
-
 
 ## Training procedure
 