amindada committed on
Commit
6162bec
1 Parent(s): 953d733

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -33,11 +33,11 @@ The pre-training dataset consists of documents from different domains:
  | Medical | Smaller public datasets | 253MB | 179,776 | 50M |
  | Medical | CC medical texts | 3.6GB | 2,000,000 | 682M |
  | Medical | Medical Dissertations | 1.4GB | 14,496 | 295M |
- | Medical | Pubmed abstracts | 8.5GB | 21,044,382 | 1.7B |
- | Medical | MIMIC III | 2.6GB | 24,221,834 | 695M |
- | Medical | PMC-Patients-ReCDS | 2.1GB | 1,743,344 | 414M |
+ | Medical | Pubmed abstracts (translated | 8.5GB | 21,044,382 | 1.7B |
+ | Medical | MIMIC III (translated) | 2.6GB | 24,221,834 | 695M |
+ | Medical | PMC-Patients-ReCDS (translated | 2.1GB | 1,743,344 | 414M |
  | Literature | German Fiction | 1.1GB | 3,219 | 243M |
- | Literature | English books | 7.1GB | 11,038 | 1.6B |
+ | Literature | English books (translated | 7.1GB | 11,038 | 1.6B |
  | - | Total | 167GB | 116,079,769 | 35.8B |