Philip May committed
Commit 051fdfe (1 parent: 643acc2)
Update README.md
README.md CHANGED
@@ -45,18 +45,18 @@ The MLSUM dataset has a special characteristic. In the text, the summary is ofte
 
 This model is trained on the following datasets:
 
-| Name | Language |
-
-| [CNN Daily - Train](https://github.com/abisee/cnn-dailymail) | en |
-| [Extreme Summarization (XSum) - Train](https://github.com/EdinburghNLP/XSum) | en |
-| [MLSUM German - Train](https://github.com/ThomasScialom/MLSUM) | de |
-| [SwissText 2019 - Train](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html) | de |
+| Name | Language | License
+|------|----------|--------
+| [CNN Daily - Train](https://github.com/abisee/cnn-dailymail) | en | The license is unclear. The data comes from CNN and Daily Mail. We assume that it may only be used for research purposes and not commercially.
+| [Extreme Summarization (XSum) - Train](https://github.com/EdinburghNLP/XSum) | en | The license is unclear. The data comes from BBC. We assume that it may only be used for research purposes and not commercially.
+| [MLSUM German - Train](https://github.com/ThomasScialom/MLSUM) | de | Usage of dataset is restricted to non-commercial research purposes only. Copyright belongs to the original copyright holders (see [here](https://github.com/ThomasScialom/MLSUM#mlsum)).
+| [SwissText 2019 - Train](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html) | de | The license is unclear. The data was published in the [German Text Summarization Challenge](https://www.swisstext.org/2019/shared-task/german-text-summarization-challenge.html). We assume that they may be used for research purposes and not commercially.
 
 | Language | Size
 |------|------
-| German |
-| English |
-| Total |
+| German | 302,607
+| English | 422,228
+| Total | 724,835
 
 ## Evaluation on MLSUM German Test Set (no beams)
 