atzenhofer
committed on
Commit
•
83aac35
1
Parent(s):
eee9013
Update README.md
README.md
CHANGED
@@ -1,5 +1,89 @@
|
|
1 |
---
|
2 |
license: gpl-3.0
|
3 |
language:
|
|
|
4 |
- de
|
5 |
-
1 |
---
|
2 |
license: gpl-3.0
|
3 |
language:
|
4 |
+
- gmh
|
5 |
- de
|
6 |
+
widget:
|
7 |
+
- text: >-
|
8 |
+
Wir Graf Hainreich von Schavnberg veriehen Offenlich an disem brief vnd , di
|
9 |
+
in sehent, Hornt oder lesent, Das fur vns vnd fur vnsern brueder chomen ist
|
10 |
+
der Erwirdig abpt datz wilhering vnd Her wernhart chelner da selben vnd der
|
11 |
+
Erwerig wolbeschaiden Peter vnd hat der selb Peter pope nach vnserm rat vnd
|
12 |
+
nach seiner besten vreunt rat vnd willen vnd wort aller seiner Erben dem vor
|
13 |
+
geschriben erwirdigen Herren abpt dvrch sein recht notdurft vnd durch sein
|
14 |
+
leibnar sein Hueb ze Strashaim , di Sein rechtz Erbaigen gewesen ist vnd di
|
15 |
+
in von seinen vodern also Herman vnd das Gotzhaus ze wilhering di
|
16 |
+
vorgenanten Hueb mit allen den rechten vnd gelegen sind, versuecht vnd
|
17 |
+
vnuersuecht, durch di lieb vnd durch ze wilhering den vorgeschriben peter
|
18 |
+
poppen begnat haben mit vntz an seinen tod in dem chloster vnd auch
|
19 |
+
Hausfrawen frawen Ofmein vnd Jansen seins Suns. wer auch, das iemand , So
|
20 |
+
geit der oft genant Peter Poppe Gotzhaus ze wilhering Sechtzich Phunt
|
21 |
+
wienner Phenning auf der oft des selben an aller der stat, ob der Ens, vnd
|
22 |
+
wort geschehen vnd vertaidingt an disem brief verschriben stet vnd lob
|
23 |
+
laisten, stat haben vnd ze volfuren geuarde vnd dar vber zu gib ich disen
|
24 |
+
brief dem oft geschriben Abbt Herman vnd dem Gotzhaus ze gnadigen Herren
|
25 |
+
Graf gundem Insigel vnd mit wernhart des Hager vnd der Alhartinger.
|
26 |
+
---
|
27 |
+
|
28 |
+
# XLM-RoBERTa (base) Middle High German Charter Masked Language Model
|
29 |
+
This model is a fine-tuned version of xlm-roberta-base on Middle High German (gmh; ISO 639-2; c. 1050–1500) charters of the [monasterium.net](https://www.icar-us.eu/en/cooperation/online-portals/monasterium-net/) data set.
|
30 |
+
|
31 |
+
## Model description
|
32 |
+
For additional information, please refer to the [XLM-RoBERTa (base-sized model)](https://huggingface.co/xlm-roberta-base) card or the paper [Unsupervised Cross-lingual Representation Learning at Scale by Conneau et al.](https://arxiv.org/abs/1911.02116).
|
33 |
+
|
34 |
+
## Intended uses & limitations
|
35 |
+
This model can be used for masked-token prediction, i.e., the fill-mask task.
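
A minimal sketch of querying the model, assuming the `transformers` library (with a backend such as PyTorch) is installed. The model ID is taken from the citation URL in this card, and `<mask>` is the mask token of XLM-RoBERTa tokenizers. The helper names `mask_word` and `predict_masked` are hypothetical and exist only for this illustration.

```python
MODEL_ID = "atzenhofer/xlm-roberta-base-mhg-charter-mlm"  # from the citation URL
MASK_TOKEN = "<mask>"  # mask token used by XLM-RoBERTa tokenizers


def mask_word(text, index, mask_token=MASK_TOKEN):
    """Replace the whitespace-separated word at `index` with the mask token."""
    words = text.split()
    words[index] = mask_token
    return " ".join(words)


def predict_masked(text, top_k=5):
    """Return (token, score) pairs for the masked position.

    Downloads the model on first use, so it needs network access.
    """
    from transformers import pipeline  # imported lazily; optional dependency

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    return [(r["token_str"], r["score"]) for r in fill_mask(text, top_k=top_k)]


# Mask the last word of the opening clause of the widget example.
example = mask_word(
    "Wir Graf Hainreich von Schavnberg veriehen Offenlich an disem brief", 9
)
```

Calling `predict_masked(example)` would then return the model's top candidates for the masked word; the actual predictions depend on the downloaded weights and are not shown here.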
|
36 |
+
|
37 |
+
## Training and evaluation data
|
38 |
+
The model was fine-tuned using the Middle High German Monasterium charters.
|
39 |
+
It was trained on a Tesla V100-SXM2-16GB GPU.
|
40 |
+
|
41 |
+
## Training hyperparameters
|
42 |
+
The following hyperparameters were used during training:
|
43 |
+
- num_train_epochs: 15
|
44 |
+
- learning_rate: 2e-5
|
45 |
+
- weight_decay: 0.01
|
46 |
+
- train_batch_size: 16
|
47 |
+
- eval_batch_size: 16
|
48 |
+
- num_proc: 4
|
49 |
+
- block_size: 256
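
The card does not say how `block_size` was applied. One common reading, assumed here, is the chunking step from the Hugging Face language-modeling examples: tokenized texts are concatenated and cut into fixed-length blocks, and the incomplete tail is dropped. The function below is a plain-Python sketch of that step, not the author's confirmed preprocessing.

```python
from itertools import chain

BLOCK_SIZE = 256  # value from the hyperparameter list above


def group_texts(examples, block_size=BLOCK_SIZE):
    """Concatenate tokenized sequences and split them into fixed-size blocks.

    `examples` maps a column name (e.g. "input_ids") to a list of token lists.
    Tokens that do not fill a complete final block are discarded.
    """
    concatenated = {k: list(chain.from_iterable(v)) for k, v in examples.items()}
    total = len(next(iter(concatenated.values())))
    total = (total // block_size) * block_size  # drop the incomplete tail
    return {
        k: [seq[i:i + block_size] for i in range(0, total, block_size)]
        for k, seq in concatenated.items()
    }


# Two toy "documents" of 300 tokens each yield two blocks of 256 tokens.
toy = {"input_ids": [list(range(300)), list(range(300))]}
blocks = group_texts(toy)
```

In a `datasets` workflow, a function like this is typically passed to `Dataset.map(..., batched=True, num_proc=4)`, which may be where the `num_proc` value above comes from; that link is also an assumption.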
|
50 |
+
|
51 |
+
|
52 |
+
## Training results
|
53 |
+
|
54 |
+
| Epoch | Training Loss | Validation Loss |
|
55 |
+
|-------|---------------|------------------|
|
56 |
+
| 1 | 2.423800 | 2.025645 |
|
57 |
+
| 2 | 1.876500 | 1.700380 |
|
58 |
+
| 3 | 1.702100 | 1.565900 |
|
59 |
+
| 4 | 1.582400 | 1.461868 |
|
60 |
+
| 5 | 1.506000 | 1.393849 |
|
61 |
+
| 6 | 1.407300 | 1.359359 |
|
62 |
+
| 7 | 1.385400 | 1.317869 |
|
63 |
+
| 8 | 1.336700 | 1.285630 |
|
64 |
+
| 9 | 1.301300 | 1.246812 |
|
65 |
+
| 10 | 1.273500 | 1.219290 |
|
66 |
+
| 11 | 1.245600 | 1.198312 |
|
67 |
+
| 12 | 1.225800 | 1.198695 |
|
68 |
+
| 13 | 1.214100 | 1.194895 |
|
69 |
+
| 14 | 1.209500 | 1.177452 |
|
70 |
+
| 15 | 1.200300 | 1.177396 |
|
71 |
+
|
72 |
+
Perplexity: 3.25
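
The reported perplexity matches the usual convention of taking the exponential of the mean cross-entropy loss. A short check, assuming the figure was derived from the final validation loss in the table above:

```python
import math

# Validation loss after epoch 15, taken from the training results table.
final_validation_loss = 1.177396

# Perplexity is the exponential of the mean cross-entropy loss.
perplexity = math.exp(final_validation_loss)

print(f"Perplexity: {perplexity:.2f}")
```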
|
73 |
+
|
74 |
+
## Updates
|
75 |
+
- 2023-03-30: Upload
|
76 |
+
|
77 |
+
|
78 |
+
## Citation
|
79 |
+
Please cite the following when using this model.
|
80 |
+
|
81 |
+
```
|
82 |
+
@misc{xlm-roberta-base-mhg-charter-mlm,
|
83 |
+
title={xlm-roberta-base-mhg-charter-mlm},
|
84 |
+
author={Atzenhofer-Baumgartner, Florian},
|
85 |
+
year = { 2023 },
|
86 |
+
url = { https://huggingface.co/atzenhofer/xlm-roberta-base-mhg-charter-mlm },
|
87 |
+
publisher = { Hugging Face }
|
88 |
+
}
|
89 |
+
```
|