bert-german-ler
Model description
This model is a fine-tuned version of bert-base-german-cased on the German LER Dataset.
Distribution of classes in the dataset:
No. | Tag | Fine-grained class | Count | % |
---|---|---|---|---|
1 | PER | Person | 1,747 | 3.26 |
2 | RR | Judge | 1,519 | 2.83 |
3 | AN | Lawyer | 111 | 0.21 |
4 | LD | Country | 1,429 | 2.66 |
5 | ST | City | 705 | 1.31 |
6 | STR | Street | 136 | 0.25 |
7 | LDS | Landscape | 198 | 0.37 |
8 | ORG | Organization | 1,166 | 2.17 |
9 | UN | Company | 1,058 | 1.97 |
10 | INN | Institution | 2,196 | 4.09 |
11 | GRT | Court | 3,212 | 5.99 |
12 | MRK | Brand | 283 | 0.53 |
13 | GS | Law | 18,520 | 34.53 |
14 | VO | Ordinance | 797 | 1.49 |
15 | EUN | European legal norm | 1,499 | 2.79 |
16 | VS | Regulation | 607 | 1.13 |
17 | VT | Contract | 2,863 | 5.34 |
18 | RS | Court decision | 12,580 | 23.46 |
19 | LIT | Legal literature | 3,006 | 5.60 |
Total | | | 53,632 | 100 |
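The percentage column is each class count divided by the total number of annotated entities. A minimal sketch that recomputes the shares from the counts follows; the GS and RS counts are taken as 18,520 and 12,580, the values consistent with the listed percentages and the total of 53,632.

```python
# Entity counts per fine-grained class, copied from the class distribution table.
# GS and RS are taken as 18,520 and 12,580 (consistent with the listed shares).
counts = {
    "PER": 1747, "RR": 1519, "AN": 111, "LD": 1429, "ST": 705,
    "STR": 136, "LDS": 198, "ORG": 1166, "UN": 1058, "INN": 2196,
    "GRT": 3212, "MRK": 283, "GS": 18520, "VO": 797, "EUN": 1499,
    "VS": 607, "VT": 2863, "RS": 12580, "LIT": 3006,
}

total = sum(counts.values())

# Share of each class in percent of all annotated entities.
shares = {label: 100 * n / total for label, n in counts.items()}

# Print classes from most to least frequent.
for label, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<4}{counts[label]:>7,}{share:>7.2f} %")
```

The two largest classes, GS and RS, together account for more than half of all entities, which is worth keeping in mind when reading the per-class scores for rare classes such as AN or STR.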
For instructions on fine-tuning another model on the German LER Dataset, see GitHub.
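For inference, the model can presumably be loaded like any other token-classification checkpoint on the Hugging Face Hub. The sketch below assumes the `transformers` library is installed and uses the model ID `elenanereiss/bert-german-ler`; the helper names `tag_text` and `format_entities` are hypothetical and only serve the illustration.

```python
from typing import Dict, List


def format_entities(entities: List[Dict]) -> List[str]:
    """Turn aggregated pipeline output into 'LABEL: text (score)' strings."""
    return [
        f"{e['entity_group']}: {e['word']} ({float(e['score']):.2f})"
        for e in entities
    ]


def tag_text(text: str, model_id: str = "elenanereiss/bert-german-ler") -> List[Dict]:
    """Run named entity recognition on one text (downloads the model on first use)."""
    # Imported lazily so format_entities stays usable without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    ner = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )
    return ner(text)
```

A call such as `format_entities(tag_text("Das Bundesverfassungsgericht hat nach § 93c BVerfGG entschieden."))` would return one string per detected span; the sentence is an illustrative example, not taken from the dataset. Building the pipeline once and reusing it is preferable when tagging many texts.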
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 12
- eval_batch_size: 16
- max_seq_length: 512
- num_epochs: 3
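As a rough sketch, these values could be collected in a small config object and mapped onto the keyword names used by `transformers.TrainingArguments`. This assumes single-device training, so that the per-device batch size equals the listed batch size; `FineTuneConfig` and `to_trainer_kwargs` are hypothetical names, and the training script on GitHub may be organized differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed on the model card."""
    learning_rate: float = 1e-5
    train_batch_size: int = 12
    eval_batch_size: int = 16
    max_seq_length: int = 512
    num_epochs: int = 3


def to_trainer_kwargs(cfg: FineTuneConfig) -> dict:
    """Map the card's names onto transformers.TrainingArguments keyword names.

    max_seq_length is left out on purpose: it is applied at tokenization
    time, e.g. tokenizer(..., truncation=True, max_length=cfg.max_seq_length).
    """
    return {
        "learning_rate": cfg.learning_rate,
        "per_device_train_batch_size": cfg.train_batch_size,
        "per_device_eval_batch_size": cfg.eval_batch_size,
        "num_train_epochs": cfg.num_epochs,
    }


cfg = FineTuneConfig()
kwargs = to_trainer_kwargs(cfg)
print(kwargs)
```

The resulting dictionary can be passed as `TrainingArguments(output_dir="...", **kwargs)`, with the output directory chosen by the user.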
Results
Results on the dev set:
Class | Precision | Recall | F1-score | Support |
---|---|---|---|---|
AN | 0.75 | 0.50 | 0.60 | 12 |
EUN | 0.92 | 0.93 | 0.92 | 116 |
GRT | 0.95 | 0.99 | 0.97 | 331 |
GS | 0.98 | 0.98 | 0.98 | 1720 |
INN | 0.84 | 0.91 | 0.88 | 199 |
LD | 0.95 | 0.95 | 0.95 | 109 |
LDS | 0.82 | 0.43 | 0.56 | 21 |
LIT | 0.88 | 0.92 | 0.90 | 231 |
MRK | 0.50 | 0.70 | 0.58 | 23 |
ORG | 0.64 | 0.71 | 0.67 | 103 |
PER | 0.86 | 0.93 | 0.90 | 186 |
RR | 0.97 | 0.98 | 0.97 | 144 |
RS | 0.94 | 0.95 | 0.94 | 1126 |
ST | 0.91 | 0.88 | 0.89 | 58 |
STR | 0.29 | 0.29 | 0.29 | 7 |
UN | 0.81 | 0.85 | 0.83 | 143 |
VO | 0.76 | 0.95 | 0.84 | 37 |
VS | 0.62 | 0.80 | 0.70 | 56 |
VT | 0.87 | 0.92 | 0.90 | 275 |
micro avg | 0.92 | 0.94 | 0.93 | 4897 |
macro avg | 0.80 | 0.82 | 0.80 | 4897 |
weighted avg | 0.92 | 0.94 | 0.93 | 4897 |
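The macro average is the unweighted mean of the per-class scores, while the weighted average weights each class by its support. A minimal sketch that recomputes both F1 averages from the dev-set rows follows; because the inputs are already rounded to two decimals, the recomputed values can differ from a report in the last digit.

```python
# (label, f1, support) rows copied from the dev-set results.
dev_rows = [
    ("AN", 0.60, 12), ("EUN", 0.92, 116), ("GRT", 0.97, 331),
    ("GS", 0.98, 1720), ("INN", 0.88, 199), ("LD", 0.95, 109),
    ("LDS", 0.56, 21), ("LIT", 0.90, 231), ("MRK", 0.58, 23),
    ("ORG", 0.67, 103), ("PER", 0.90, 186), ("RR", 0.97, 144),
    ("RS", 0.94, 1126), ("ST", 0.89, 58), ("STR", 0.29, 7),
    ("UN", 0.83, 143), ("VO", 0.84, 37), ("VS", 0.70, 56),
    ("VT", 0.90, 275),
]

# Macro average: every class counts equally, regardless of frequency.
macro_f1 = sum(f1 for _, f1, _ in dev_rows) / len(dev_rows)

# Weighted average: each class contributes in proportion to its support.
total_support = sum(support for _, _, support in dev_rows)
weighted_f1 = sum(f1 * support for _, f1, support in dev_rows) / total_support

print(f"macro F1:    {macro_f1:.2f}")
print(f"weighted F1: {weighted_f1:.2f}")
```

The gap between the two averages reflects the class imbalance: rare classes with low scores, such as STR and LDS, pull the macro average down but barely affect the weighted one.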
Results on the test set:
Class | Precision | Recall | F1-score | Support |
---|---|---|---|---|
AN | 1.00 | 0.89 | 0.94 | 9 |
EUN | 0.90 | 0.97 | 0.93 | 150 |
GRT | 0.98 | 0.98 | 0.98 | 321 |
GS | 0.98 | 0.99 | 0.98 | 1818 |
INN | 0.90 | 0.95 | 0.92 | 222 |
LD | 0.97 | 0.92 | 0.94 | 149 |
LDS | 0.91 | 0.45 | 0.61 | 22 |
LIT | 0.92 | 0.96 | 0.94 | 314 |
MRK | 0.78 | 0.88 | 0.82 | 32 |
ORG | 0.82 | 0.88 | 0.85 | 113 |
PER | 0.92 | 0.88 | 0.90 | 173 |
RR | 0.95 | 0.99 | 0.97 | 142 |
RS | 0.97 | 0.98 | 0.97 | 1245 |
ST | 0.79 | 0.86 | 0.82 | 64 |
STR | 0.75 | 0.80 | 0.77 | 15 |
UN | 0.90 | 0.95 | 0.93 | 108 |
VO | 0.80 | 0.83 | 0.81 | 71 |
VS | 0.73 | 0.84 | 0.78 | 64 |
VT | 0.93 | 0.97 | 0.95 | 290 |
micro avg | 0.94 | 0.96 | 0.95 | 5322 |
macro avg | 0.89 | 0.89 | 0.89 | 5322 |
weighted avg | 0.95 | 0.96 | 0.95 | 5322 |
Reference
@misc{https://doi.org/10.48550/arxiv.2003.13016,
doi = {10.48550/ARXIV.2003.13016},
url = {https://arxiv.org/abs/2003.13016},
author = {Leitner, Elena and Rehm, Georg and Moreno-Schneider, Julián},
keywords = {Computation and Language (cs.CL), Information Retrieval (cs.IR), FOS: Computer and information sciences},
title = {A Dataset of German Legal Documents for Named Entity Recognition},
publisher = {arXiv},
year = {2020},
copyright = {arXiv.org perpetual, non-exclusive license}
}
Evaluation results
- F1 on elenanereiss/german-ler (self-reported): 0.955
- Precision on elenanereiss/german-ler (self-reported): 0.945
- Recall on elenanereiss/german-ler (self-reported): 0.964