Philip May committed on
Commit 051fdfe
1 Parent(s): 643acc2

Update README.md

Files changed (1)
  1. README.md +9 -9
README.md CHANGED
@@ -45,18 +45,18 @@ The MLSUM dataset has a special characteristic. In the text, the summary is ofte

  This model is trained on the following datasets:

- | Name | Language | Size | License
- |------|----------|------|--------
- | [CNN Daily - Train](https://github.com/abisee/cnn-dailymail) | en | 218,223 | The license is unclear. The data comes from CNN and Daily Mail. We assume that it may only be used for research purposes and not commercially.
- | [Extreme Summarization (XSum) - Train](https://github.com/EdinburghNLP/XSum) | en | 204,005 | The license is unclear. The data comes from BBC. We assume that it may only be used for research purposes and not commercially.
- | [MLSUM German - Train](https://github.com/ThomasScialom/MLSUM) | de | 218,043 | Usage of dataset is restricted to non-commercial research purposes only. Copyright belongs to the original copyright holders (see [here](https://github.com/ThomasScialom/MLSUM#mlsum)).
- | [SwissText 2019 - Train](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html) | de | 84,564 | The license is unclear. The data was published in the [German Text Summarization Challenge](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html). We assume that they may be used for research purposes and not commercially.
+ | Name | Language | License
+ |------|----------|--------
+ | [CNN Daily - Train](https://github.com/abisee/cnn-dailymail) | en | The license is unclear. The data comes from CNN and Daily Mail. We assume that it may only be used for research purposes and not commercially.
+ | [Extreme Summarization (XSum) - Train](https://github.com/EdinburghNLP/XSum) | en | The license is unclear. The data comes from BBC. We assume that it may only be used for research purposes and not commercially.
+ | [MLSUM German - Train](https://github.com/ThomasScialom/MLSUM) | de | Usage of dataset is restricted to non-commercial research purposes only. Copyright belongs to the original copyright holders (see [here](https://github.com/ThomasScialom/MLSUM#mlsum)).
+ | [SwissText 2019 - Train](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html) | de | The license is unclear. The data was published in the [German Text Summarization Challenge](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html). We assume that they may be used for research purposes and not commercially.

  | Language | Size
  |------|------
- | German | xxx
- | English | xxx
- | Total | xxx
+ | German | 302,607
+ | English | 422,228
+ | Total | 724,835

  ## Evaluation on MLSUM German Test Set (no beams)
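
The per-language figures that replace the `xxx` placeholders appear to be the sums of the per-dataset sizes from the removed `Size` column. The following is a minimal sketch that checks this arithmetic; the dictionary layout and the helper name `totals_by_language` are illustrative and not part of the repository.

```python
# Per-dataset training sizes, copied from the Size column removed in this commit.
# The dict layout and function name are hypothetical, chosen for illustration.
sizes = {
    "CNN Daily - Train": ("en", 218_223),
    "Extreme Summarization (XSum) - Train": ("en", 204_005),
    "MLSUM German - Train": ("de", 218_043),
    "SwissText 2019 - Train": ("de", 84_564),
}


def totals_by_language(sizes):
    """Sum the number of training examples per language code."""
    totals = {}
    for lang, count in sizes.values():
        totals[lang] = totals.get(lang, 0) + count
    return totals


totals = totals_by_language(sizes)
print(f"German:  {totals['de']:,}")
print(f"English: {totals['en']:,}")
print(f"Total:   {sum(totals.values()):,}")
```

Under these assumptions the sums match the values added in the second table, so the per-dataset sizes can still be recovered from the previous revision if needed.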