atzenhofer committed on
Commit
83aac35
1 Parent(s): eee9013

Update README.md

Files changed (1)
  1. README.md +85 -1
README.md CHANGED
@@ -1,5 +1,89 @@
1
  ---
2
  license: gpl-3.0
3
  language:
 
4
  - de
5
- ---
 
1
  ---
2
  license: gpl-3.0
3
  language:
4
+ - gmh
5
  - de
6
+ widget:
7
+ - text: >-
8
+ Wir Graf Hainreich von Schavnberg veriehen Offenlich an disem brief vnd , di
9
+ in sehent, Hornt oder lesent, Das fur vns vnd fur vnsern brueder chomen ist
10
+ der Erwirdig abpt datz wilhering vnd Her wernhart chelner da selben vnd der
11
+ Erwerig wolbeschaiden Peter vnd hat der selb Peter pope nach vnserm rat vnd
12
+ nach seiner besten vreunt rat vnd willen vnd wort aller seiner Erben dem vor
13
+ geschriben erwirdigen Herren abpt dvrch sein recht notdurft vnd durch sein
14
+ leibnar sein Hueb ze Strashaim , di Sein rechtz Erbaigen gewesen ist vnd di
15
+ in von seinen vodern also Herman vnd das Gotzhaus ze wilhering di
16
+ vorgenanten Hueb mit allen den rechten vnd gelegen sind, versuecht vnd
17
+ vnuersuecht, durch di lieb vnd durch ze wilhering den vorgeschriben peter
18
+ poppen begnat haben mit vntz an seinen tod in dem chloster vnd auch
19
+ Hausfrawen frawen Ofmein vnd Jansen seins Suns. wer auch, das iemand , So
20
+ geit der oft genant Peter Poppe Gotzhaus ze wilhering Sechtzich Phunt
21
+ wienner Phenning auf der oft des selben an aller der stat, ob der Ens, vnd
22
+ wort geschehen vnd vertaidingt an disem brief verschriben stet vnd lob
23
+ laisten, stat haben vnd ze volfuren geuarde vnd dar vber zu gib ich disen
24
+ brief dem oft geschriben Abbt Herman vnd dem Gotzhaus ze gnadigen Herren
25
+ Graf gundem Insigel vnd mit wernhart des Hager vnd der Alhartinger.
26
+ ---
27
+
28
+ # XLM-RoBERTa (base) Middle High German Charter Masked Language Model
29
+ This model is a fine-tuned version of xlm-roberta-base on Middle High German (gmh; ISO 639-2; c. 1050–1500) charters of the [monasterium.net](https://www.icar-us.eu/en/cooperation/online-portals/monasterium-net/) data set.
30
+
31
+ ## Model description
32
+ Please refer to this model card together with the [XLM-RoBERTa (base-sized model)](https://huggingface.co/xlm-roberta-base) card or the paper [Unsupervised Cross-lingual Representation Learning at Scale by Conneau et al.](https://arxiv.org/abs/1911.02116) for additional information.
33
+
34
+ ## Intended uses & limitations
35
+ This model can be used for masked-token prediction, i.e., fill-mask tasks.
36
+
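The fill-mask use mentioned above can be sketched as follows. This is a minimal example, not the author's own usage code: it assumes the `transformers` library is installed and that the checkpoint is available under the id from the citation URL (`atzenhofer/xlm-roberta-base-mhg-charter-mlm`). The helpers `mask_word`, `predict_masked`, and `demo` are hypothetical names introduced for illustration; `<mask>` is the mask token of XLM-RoBERTa tokenizers.

```python
MODEL_ID = "atzenhofer/xlm-roberta-base-mhg-charter-mlm"  # taken from the citation URL
MASK_TOKEN = "<mask>"  # mask token used by XLM-RoBERTa tokenizers


def mask_word(text: str, index: int, mask_token: str = MASK_TOKEN) -> str:
    """Replace the whitespace-separated word at `index` with the mask token."""
    words = text.split()
    words[index] = mask_token
    return " ".join(words)


def predict_masked(text: str, top_k: int = 5):
    """Return the top_k fill-mask candidates for a text containing one mask token."""
    # Imported lazily so mask_word stays usable without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    return fill_mask(text, top_k=top_k)


def demo() -> None:
    """Mask one word of the widget example and print the model's candidates."""
    sentence = "Wir Graf Hainreich von Schavnberg veriehen Offenlich an disem brief"
    masked = mask_word(sentence, 5)  # masks "veriehen"
    for candidate in predict_masked(masked):
        print(candidate["token_str"], round(candidate["score"], 3))
```

Calling `demo()` downloads the checkpoint on first use, so it needs network access; `mask_word` alone runs offline.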
37
+ ## Training and evaluation data
38
+ The model was fine-tuned using the Middle High German Monasterium charters.
39
+ It was trained on a Tesla V100-SXM2-16GB GPU.
40
+
41
+ ## Training hyperparameters
42
+ The following hyperparameters were used during training:
43
+ - num_train_epochs: 15
44
+ - learning_rate: 2e-5
45
+ - weight_decay: 0.01
46
+ - train_batch_size: 16
47
+ - eval_batch_size: 16
48
+ - num_proc: 4
49
+ - block_size: 256
50
+
51
+
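The names `num_proc` and `block_size` suggest the common preprocessing step of concatenating tokenized texts and cutting them into fixed-length blocks, as done in the Hugging Face language-modeling examples. The card does not confirm this, so the following is only an illustrative sketch under that assumption. `group_into_blocks` is a hypothetical helper, and the token ids are dummy values.

```python
from typing import List

# Values from the hyperparameter list above (decimal written with a point).
HYPERPARAMETERS = {
    "num_train_epochs": 15,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "num_proc": 4,
    "block_size": 256,
}


def group_into_blocks(token_ids: List[int], block_size: int) -> List[List[int]]:
    """Cut one long token sequence into blocks of exactly `block_size` tokens.

    A trailing remainder shorter than `block_size` is dropped.
    """
    usable = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, usable, block_size)]


# 600 dummy token ids give two full blocks of 256; the last 88 ids are dropped.
blocks = group_into_blocks(list(range(600)), HYPERPARAMETERS["block_size"])
print(len(blocks), len(blocks[0]))
```

Under this assumption, every training example has the same length of 256 tokens, which keeps batches of 16 uniform without padding.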
52
+ ## Training results
53
+
54
+ | Epoch | Training Loss | Validation Loss |
55
+ |-------|---------------|------------------|
56
+ | 1 | 2.423800 | 2.025645 |
57
+ | 2 | 1.876500 | 1.700380 |
58
+ | 3 | 1.702100 | 1.565900 |
59
+ | 4 | 1.582400 | 1.461868 |
60
+ | 5 | 1.506000 | 1.393849 |
61
+ | 6 | 1.407300 | 1.359359 |
62
+ | 7 | 1.385400 | 1.317869 |
63
+ | 8 | 1.336700 | 1.285630 |
64
+ | 9 | 1.301300 | 1.246812 |
65
+ | 10 | 1.273500 | 1.219290 |
66
+ | 11 | 1.245600 | 1.198312 |
67
+ | 12 | 1.225800 | 1.198695 |
68
+ | 13 | 1.214100 | 1.194895 |
69
+ | 14 | 1.209500 | 1.177452 |
70
+ | 15 | 1.200300 | 1.177396 |
71
+
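To read the table above programmatically, the following sketch copies its rows and finds the epoch with the lowest validation loss as well as any epoch where validation loss rose compared with the previous one. The variable names are illustrative and not part of the original training code.

```python
# (epoch, training loss, validation loss), copied from the table above.
RESULTS = [
    (1, 2.423800, 2.025645),
    (2, 1.876500, 1.700380),
    (3, 1.702100, 1.565900),
    (4, 1.582400, 1.461868),
    (5, 1.506000, 1.393849),
    (6, 1.407300, 1.359359),
    (7, 1.385400, 1.317869),
    (8, 1.336700, 1.285630),
    (9, 1.301300, 1.246812),
    (10, 1.273500, 1.219290),
    (11, 1.245600, 1.198312),
    (12, 1.225800, 1.198695),
    (13, 1.214100, 1.194895),
    (14, 1.209500, 1.177452),
    (15, 1.200300, 1.177396),
]

# Epoch with the lowest validation loss.
best_epoch, _, best_val = min(RESULTS, key=lambda row: row[2])

# Epochs whose validation loss is higher than in the preceding epoch.
regressions = [cur[0] for prev, cur in zip(RESULTS, RESULTS[1:]) if cur[2] > prev[2]]

print(best_epoch, best_val)
print(regressions)
```

The lowest validation loss is reached in the final epoch, and the only increase is a marginal one at epoch 12, so the curve has largely flattened by the end of training.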
72
+ Perplexity: 3.25
73
+
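The card does not say how the perplexity was computed. A common convention, assumed here, is to take the exponential of the mean cross-entropy loss on the validation set; applied to the final validation loss from the table, this reproduces the reported value.

```python
import math


def perplexity(cross_entropy_loss: float) -> float:
    """Perplexity as the exponential of a mean cross-entropy loss (in nats)."""
    return math.exp(cross_entropy_loss)


final_validation_loss = 1.177396  # epoch 15 in the results table
print(f"{perplexity(final_validation_loss):.2f}")
```

For a masked language model this is a pseudo-perplexity over the masked positions only, so it is not directly comparable to perplexities of left-to-right language models.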
74
+ ## Updates
75
+ - 2023-03-30: Upload
76
+
77
+
78
+ ## Citation
79
+ Please cite the following entry when using this model.
80
+
81
+ ```
82
+ @misc{xlm-roberta-base-mhg-charter-mlm,
83
+ title={xlm-roberta-base-mhg-charter-mlm},
84
+ author={Atzenhofer-Baumgartner, Florian},
85
+ year = { 2023 },
86
+ url = { https://huggingface.co/atzenhofer/xlm-roberta-base-mhg-charter-mlm },
87
+ publisher = { Hugging Face }
88
+ }
89
+ ```