Historisches Grundbuch der Stadt Basel Nested NER
A model for historical German developed by Ismail Prada Ziegler as part of the project Economies of Space. Practices, Discourses and Actors on the Basel Real Estate Market (1400-1700) at the University of Basel, in cooperation with Digital Humanities Bern. This model was created to annotate nested document structures. It can also be used to annotate flat text (as in the example), but may perform slightly worse than models trained only for that task. You can annotate nested tags by using this script. You can find more info on this model here.
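To illustrate what annotating nested tags recursively can look like, here is a minimal sketch. It is not the script referred to above, and it does not load this model. It assumes a hypothetical flat tagger `tag_flat` that returns `(start, end, label)` spans for a string, and it re-runs that tagger on the text of each span it finds. `Entity`, `annotate_nested`, `toy_tagger`, and the lexicon entries are invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# (start, end, label); offsets are relative to the string passed to the tagger.
FlatSpan = Tuple[int, int, str]


@dataclass
class Entity:
    start: int  # character offset in the original document
    end: int
    label: str
    text: str
    children: List["Entity"] = field(default_factory=list)


def annotate_nested(
    text: str,
    tag_flat: Callable[[str], List[FlatSpan]],
    offset: int = 0,
    max_depth: int = 5,
    is_root: bool = True,
) -> List[Entity]:
    """Tag `text` flatly, then re-tag each span's own text to find inner entities."""
    if max_depth == 0:
        return []
    entities = []
    for start, end, label in tag_flat(text):
        if not is_root and start == 0 and end == len(text):
            continue  # this is the parent span itself, already recorded one level up
        inner = text[start:end]
        children = annotate_nested(
            inner, tag_flat, offset + start, max_depth - 1, is_root=False
        )
        entities.append(Entity(offset + start, offset + end, label, inner, children))
    return entities


# Hypothetical stand-in for a real tagger: a tiny invented lexicon, not dataset content.
TOY_LEXICON = {"Hans von Basel": "PER", "Basel": "LOC"}


def toy_tagger(text: str) -> List[FlatSpan]:
    """Return the longest non-overlapping lexicon matches, ignoring a whole-input match."""
    spans: List[FlatSpan] = []
    for surface, label in sorted(TOY_LEXICON.items(), key=lambda kv: -len(kv[0])):
        start = text.find(surface)
        if start == -1 or surface == text:
            continue
        end = start + len(surface)
        if all(end <= s or start >= e for s, e, _ in spans):
            spans.append((start, end, label))
    return sorted(spans)


result = annotate_nested("Hans von Basel kauft ein Haus", toy_tagger)
for entity in result:
    print(entity.label, entity.text, [(c.label, c.text) for c in entity.children])
```

In this sketch, `max_depth` bounds the recursion in case a tagger keeps returning spans, and child offsets are shifted by the parent's start so that every entity refers to positions in the original document.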
Performance
When annotating recursively:
Metric | PER | ORG | LOC |
---|---|---|---|
Precision | 86.30% | 82.69% | 82.79% |
Recall | 85.82% | 74.14% | 78.46% |
F1-Score | 86.06% | 78.18% | 80.57% |
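As a consistency check, the F1-scores in the table can be recomputed from the reported precision and recall. The snippet below assumes the standard definition of F1 as the harmonic mean of precision and recall. The model card does not state the formula explicitly, and the names `scores`, `f1`, and `f1_scores` are chosen for this example.

```python
# Precision and recall in percent, copied from the table above.
scores = {
    "PER": (86.30, 85.82),
    "ORG": (82.69, 74.14),
    "LOC": (82.79, 78.46),
}


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Round to two decimals, matching the precision used in the table.
f1_scores = {label: round(f1(p, r), 2) for label, (p, r) in scores.items()}

for label, value in f1_scores.items():
    print(f"{label}: {value:.2f}%")
```

Under that assumption the recomputed values agree with the F1-Score row for all three labels.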
Dataset
The dataset has not yet been published. It was created from the Historical Land Registry of the city of Basel (HLRB). Timeframe: 1400-1700. Language: Early New High German. Split: 661 documents in train, 83 in dev. The language model is based on the full HLRB corpus until 1800, approx. 120k documents.
The documents were annotated according to the BeNASch annotation guidelines. For this model, a simplified tagset was used.
The training data was prepared in a special way to accommodate nested annotation. See the linked paper for more information.
Citation
If you publish works using this model, please cite:
Prada Ziegler, I. (2024, May 30). What's in an entity? Exploring Nested Named Entity Recognition in the Historical Land Register of Basel (1400-1700). DH Benelux 2024, Leuven, Belgium. Zenodo. https://doi.org/10.5281/zenodo.11394453