bert-mapa-german
This model is a fine-tuned version of google-bert/bert-base-german-cased on the MAPA German dataset. Its purpose is to detect private information in German texts.
It achieves the following results on the test set:
Category | Precision | Recall | F1 | Number |
---|---|---|---|---|
Address | 0.5882 | 0.6667 | 0.625 | 15 |
Age | 0.0 | 0.0 | 0.0 | 3 |
Amount | 1.0 | 1.0 | 1.0 | 1 |
Date | 0.9455 | 0.9455 | 0.9455 | 55 |
Name | 0.7 | 0.9545 | 0.8077 | 22 |
Organisation | 0.5405 | 0.6452 | 0.5882 | 31 |
Person | 0.5385 | 0.5 | 0.5185 | 14 |
Role | 0.0 | 0.0 | 0.0 | 1 |
Overall | 0.7255 | 0.7817 | 0.7525 |
- Loss: 0.0325
- Overall Accuracy: 0.9912
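As a quick sanity check on the table above, the F1 column can be recomputed from the precision and recall columns as their harmonic mean. The sketch below is illustrative only: `f1_score` is a hypothetical helper, not part of the evaluation code used for this model, and the tolerance accounts for the table values being rounded to four decimals.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above
reported = {
    "Address": (0.5882, 0.6667, 0.625),
    "Age": (0.0, 0.0, 0.0),
    "Date": (0.9455, 0.9455, 0.9455),
    "Name": (0.7, 0.9545, 0.8077),
    "Organisation": (0.5405, 0.6452, 0.5882),
    "Person": (0.5385, 0.5, 0.5185),
    "Overall": (0.7255, 0.7817, 0.7525),
}

for category, (p, r, f1) in reported.items():
    # Inputs are rounded, so compare with a small tolerance.
    assert abs(f1_score(p, r) - f1) < 1e-3, category
```

Categories with very few test examples (Age, Amount, Role) swing between 0.0 and 1.0 on a single prediction, so their scores say little about real-world behaviour.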
Intended uses & limitations
This model is intended for detecting private information in German texts. It was trained on only 1744 example sentences, so its predictions are error-prone, particularly for rare categories such as Age and Role, which both score an F1 of 0.0 on the test set.
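A minimal sketch of how the model might be used to mask private information, assuming it loads through the standard Transformers token-classification pipeline under the repository id shown on this page (`jbroermann/bert-mapa-german`). `load_tagger` and `redact` are hypothetical helpers, and the label strings in the hand-written sample are placeholders; the model's actual label names may differ.

```python
from typing import Any


def load_tagger(model_id: str = "jbroermann/bert-mapa-german"):
    """Build a token-classification pipeline (downloads the model on first use)."""
    # Imported lazily; requires the `transformers` package and PyTorch.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into whole entity spans.
    return pipeline("token-classification", model=model_id, aggregation_strategy="simple")


def redact(text: str, entities: list[dict[str, Any]]) -> str:
    """Replace each predicted span with its label, e.g. "[PERSON]"."""
    # Work from right to left so earlier character offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        text = text[: ent["start"]] + f"[{ent['entity_group']}]" + text[ent["end"] :]
    return text


# Hand-written entities in the pipeline's output format; the labels are hypothetical.
sample = "Herr Max Müller wohnt in Berlin."
fake_entities = [
    {"entity_group": "PERSON", "start": 5, "end": 15},
    {"entity_group": "ADDRESS", "start": 25, "end": 31},
]
print(redact(sample, fake_entities))
```

With the real model, the call would look like `tagger = load_tagger()` followed by `redact(text, tagger(text))`. Given the error rates reported above, the output should be reviewed by a person before it is relied on for anonymisation.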
Training and evaluation data
The MAPA German dataset was randomly split into 80% training, 10% validation, and 10% test data.
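The exact splitting code is not documented here. One way such an 80/10/10 random split could be done is sketched below in plain Python. The function name, the seed, and the total of 2180 sentences are assumptions; the total is inferred from 1744 training sentences being 80% of the data.

```python
import random


def split_80_10_10(items, seed=42):
    """Shuffle a copy of `items` and cut it into train/validation/test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    validation = shuffled[n_train : n_train + n_val]
    test = shuffled[n_train + n_val :]
    return train, validation, test


# 2180 is an inferred total, not a figure reported for the dataset.
train, validation, test = split_80_10_10(range(2180))
print(len(train), len(validation), len(test))
```

A fixed seed keeps the split reproducible, which matters when test scores from different training runs are compared.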
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
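To make the interaction of these settings concrete, the sketch below derives the number of optimizer steps and the learning rate under a linear schedule. It assumes a single device, no gradient accumulation, and zero warmup steps, none of which are stated in the list above; `linear_lr` is a hypothetical helper that mirrors the usual linear warmup and decay rule.

```python
import math

TRAIN_EXAMPLES = 1744  # training sentences, from the limitations section
BATCH_SIZE = 8         # train_batch_size
NUM_EPOCHS = 4         # num_epochs
BASE_LR = 2e-05        # learning_rate

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def linear_lr(step, base_lr=BASE_LR, total=total_steps, warmup_steps=0):
    """Linear warmup to base_lr, then linear decay to zero at `total` steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup_steps))


print(steps_per_epoch, total_steps)
for step in (0, 436, 872):
    print(step, linear_lr(step))
```

Under these assumptions the step counts agree with the Step column of the training results table (218 per epoch, 872 in total), and the learning rate has fallen to half its initial value by the end of the second epoch.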
Training results
Training Loss | Epoch | Step | Validation Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
---|---|---|---|---|---|---|---|
No log | 1.0 | 218 | 0.0607 | 0.6527 | 0.7786 | 0.7101 | 0.9859 |
No log | 2.0 | 436 | 0.0479 | 0.7355 | 0.8143 | 0.7729 | 0.9896 |
0.116 | 3.0 | 654 | 0.0414 | 0.7712 | 0.8429 | 0.8055 | 0.9908 |
0.116 | 4.0 | 872 | 0.0421 | 0.7857 | 0.8643 | 0.8231 | 0.9917 |
Framework versions
- Transformers 4.40.0
- Pytorch 2.1.0+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1
Model tree for jbroermann/bert-mapa-german
Base model
google-bert/bert-base-german-cased