---
license: cc-by-nc-4.0
base_model: distilbert-base-german-cased
tags:
- generated_from_trainer
model-index:
- name: distilbert-base-german-cased_finetuned_ai4privacy_v2
  results: []
datasets:
- ai4privacy/pii-masking-200k
- Isotonic/pii-masking-200k
language:
- de
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: token-classification
---


🌟 Buying me a coffee is a direct way to support this project.
<a href="https://www.buymeacoffee.com/isotonic"><img src="https://www.buymeacoffee.com/assets/img/guidelines/download-assets-sm-1.svg" alt=""></a>


# distilbert-base-german-cased_finetuned_ai4privacy_v2

This model is a fine-tuned version of [distilbert-base-german-cased](https://huggingface.co/distilbert-base-german-cased) on the German subset of the [pii-masking-200k](https://huggingface.co/ai4privacy/pii-masking-200k) dataset.
It achieves the following results on the evaluation set (these figures match the epoch 3 row of the training results table, the checkpoint with the lowest validation loss):
- Loss: 0.0821
- Overall Precision: 0.9086
- Overall Recall: 0.9379
- Overall F1: 0.9230
- Overall Accuracy: 0.9679
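
A minimal usage sketch for masking PII in German text with this model. The repository id in `MODEL_ID` is an assumption inferred from the model name and is not stated in this card, and `mask_entities` and `redact` are hypothetical helpers written for illustration. The sketch relies on the `transformers` token-classification pipeline with `aggregation_strategy="simple"`, which returns one dict per detected span with `entity_group`, `start`, and `end` keys.

```python
# Assumption: the Hub repository id below is inferred from the model name,
# it is not given in this card. Replace it with the actual id if it differs.
MODEL_ID = "Isotonic/distilbert-base-german-cased_finetuned_ai4privacy_v2"


def mask_entities(text, entities):
    """Replace each detected span with its label, e.g. 'Anna' -> '[FIRSTNAME]'.

    `entities` is a list of dicts with 'entity_group', 'start' and 'end' keys,
    the shape produced by the pipeline with aggregation_strategy="simple".
    """
    masked = text
    # Work from the end of the string so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        label = f"[{ent['entity_group']}]"
        masked = masked[: ent["start"]] + label + masked[ent["end"]:]
    return masked


def redact(text, model_id=MODEL_ID):
    """Download the model (network access needed) and mask PII in `text`."""
    # Imported lazily so mask_entities can be used without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge word pieces into whole spans
    )
    return mask_entities(text, tagger(text))
```

Calling `redact("Hallo Anna, ruf mich unter 030 1234567 an.")` should return the sentence with detected spans replaced by bracketed labels. Which spans are found depends on the model.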

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 5
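
As a rough sketch, the hyperparameters above map onto `transformers.TrainingArguments` keyword names as shown below, and the linear schedule with warmup can be reproduced in plain Python. The mapping of `train_batch_size` to `per_device_train_batch_size` assumes a single device, and `linear_schedule_lr` is a hypothetical helper that mirrors the usual linear warmup and decay shape; the Trainer's own step rounding may differ slightly. The total of 26410 steps is taken from the last row of the training results table.

```python
# Keyword names follow transformers.TrainingArguments; values come from the list above.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,  # assumption: training ran on a single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.2,
    "num_train_epochs": 5,
}

TOTAL_STEPS = 26410  # final step count reported in the training results table


def linear_schedule_lr(step, base_lr=5e-5, total_steps=TOTAL_STEPS, warmup_ratio=0.2):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at the final step.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 5282, 15846, 26410):
    print(step, linear_schedule_lr(step))
```

With a warmup ratio of 0.2 over 5 epochs, the warmup spans the first 5282 steps, which is the whole first epoch, so the peak learning rate is only reached at the start of epoch 2. To train with these settings, the dict could be passed as `TrainingArguments(output_dir="out", **TRAINING_KWARGS)`, where `output_dir` is a placeholder.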

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy | Accountname F1 | Accountnumber F1 | Age F1 | Amount F1 | Bic F1 | Bitcoinaddress F1 | Buildingnumber F1 | City F1 | Companyname F1 | County F1 | Creditcardcvv F1 | Creditcardissuer F1 | Creditcardnumber F1 | Currency F1 | Currencycode F1 | Currencyname F1 | Currencysymbol F1 | Date F1 | Dob F1 | Email F1 | Ethereumaddress F1 | Eyecolor F1 | Firstname F1 | Gender F1 | Height F1 | Iban F1 | Ip F1  | Ipv4 F1 | Ipv6 F1 | Jobarea F1 | Jobtitle F1 | Jobtype F1 | Lastname F1 | Litecoinaddress F1 | Mac F1 | Maskednumber F1 | Middlename F1 | Nearbygpscoordinate F1 | Ordinaldirection F1 | Password F1 | Phoneimei F1 | Phonenumber F1 | Pin F1 | Prefix F1 | Secondaryaddress F1 | Sex F1 | Ssn F1 | State F1 | Street F1 | Time F1 | Url F1 | Useragent F1 | Username F1 | Vehiclevin F1 | Vehiclevrm F1 | Zipcode F1 |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|:--------------:|:----------:|:----------------:|:--------------:|:----------------:|:------:|:---------:|:------:|:-----------------:|:-----------------:|:-------:|:--------------:|:---------:|:----------------:|:-------------------:|:-------------------:|:-----------:|:---------------:|:---------------:|:-----------------:|:-------:|:------:|:--------:|:------------------:|:-----------:|:------------:|:---------:|:---------:|:-------:|:------:|:-------:|:-------:|:----------:|:-----------:|:----------:|:-----------:|:------------------:|:------:|:---------------:|:-------------:|:----------------------:|:-------------------:|:-----------:|:------------:|:--------------:|:------:|:---------:|:-------------------:|:------:|:------:|:--------:|:---------:|:-------:|:------:|:------------:|:-----------:|:-------------:|:-------------:|:----------:|
| 0.1449        | 1.0   | 5282  | 0.1365          | 0.8213            | 0.8741         | 0.8469     | 0.9504           | 0.9954         | 0.9180           | 0.9509 | 0.7478    | 0.8315 | 0.8265            | 0.7908            | 0.8030  | 0.9011         | 0.9118    | 0.8669           | 0.9831              | 0.8053              | 0.4935      | 0.6482          | 0.0             | 0.8430            | 0.7672  | 0.4751 | 0.9870   | 0.9103             | 0.9501      | 0.8810       | 0.9552    | 0.9507    | 0.9086  | 0.0    | 0.8124  | 0.7776  | 0.8698     | 0.9758      | 0.9445     | 0.8140      | 0.5210             | 0.9819 | 0.6555          | 0.4114        | 1.0                    | 0.9837              | 0.8093      | 0.9761       | 0.9254         | 0.7705 | 0.8613    | 0.9676              | 0.9978 | 0.9570 | 0.8585   | 0.8164    | 0.9643  | 0.9879 | 0.9534       | 0.9415      | 0.8778        | 0.9716        | 0.7313     |
| 0.1039        | 2.0   | 10564 | 0.0841          | 0.8875            | 0.9213         | 0.9041     | 0.9649           | 0.9923         | 0.9598           | 0.9721 | 0.8979    | 0.9240 | 0.9218            | 0.8937            | 0.8803  | 0.9648         | 0.9595    | 0.9563           | 0.9848              | 0.8427              | 0.5724      | 0.7677          | 0.2210          | 0.9244            | 0.8003  | 0.5866 | 0.9932   | 0.9636             | 0.9835      | 0.9473       | 0.9794    | 0.9753    | 0.9644  | 0.0173 | 0.7042  | 0.7564  | 0.9439     | 0.9911      | 0.9710     | 0.8988      | 0.7288             | 0.9801 | 0.7913          | 0.8977        | 0.9978                 | 0.9853              | 0.9581      | 0.9937       | 0.9761         | 0.9146 | 0.9166    | 0.9741              | 0.9978 | 0.9787 | 0.9448   | 0.9031    | 0.9591  | 0.9968 | 0.9638       | 0.9719      | 0.9455        | 0.9829        | 0.8863     |
| 0.0804        | 3.0   | 15846 | 0.0821          | 0.9086            | 0.9379         | 0.9230     | 0.9679           | 0.9985         | 0.9849           | 0.9792 | 0.9387    | 0.9641 | 0.9637            | 0.9011            | 0.9260  | 0.9782         | 0.9778    | 0.9543           | 1.0                 | 0.8796              | 0.7027      | 0.8328          | 0.3466          | 0.9420            | 0.8156  | 0.6575 | 0.9971   | 0.9947             | 0.9833      | 0.9614       | 0.9881    | 0.9842    | 0.9819  | 0.2023 | 0.6631  | 0.7243  | 0.9722     | 0.9904      | 0.9725     | 0.9185      | 0.8545             | 0.9780 | 0.8365          | 0.9156        | 1.0                    | 0.9853              | 0.9782      | 0.9947       | 0.9883         | 0.9189 | 0.9594    | 0.9831              | 0.9993 | 0.9898 | 0.9739   | 0.9355    | 0.9764  | 0.9984 | 0.9885       | 0.9798      | 0.9614        | 1.0           | 0.9100     |
| 0.0622        | 4.0   | 21128 | 0.0848          | 0.9095            | 0.9420         | 0.9255     | 0.9713           | 0.9977         | 0.9932           | 0.9815 | 0.9566    | 0.9550 | 0.9704            | 0.9187            | 0.9277  | 0.9735         | 0.9756    | 0.9679           | 0.9966              | 0.8885              | 0.6985      | 0.8598          | 0.4217          | 0.9602            | 0.8262  | 0.6809 | 0.9960   | 0.9947             | 0.9852      | 0.9641       | 0.9952    | 0.9955    | 0.9909  | 0.3053 | 0.7067  | 0.6156  | 0.9784     | 0.9948      | 0.9773     | 0.9176      | 0.8856             | 0.9880 | 0.8598          | 0.9186        | 1.0                    | 0.9886              | 0.9871      | 0.9968       | 0.9916         | 0.9419 | 0.9621    | 0.9887              | 1.0    | 0.9926 | 0.9717   | 0.9441    | 0.9835  | 0.9992 | 0.9858       | 0.9838      | 0.9818        | 0.9856        | 0.8972     |
| 0.032         | 5.0   | 26410 | 0.0998          | 0.9210            | 0.9497         | 0.9351     | 0.9741           | 0.9985         | 0.9962           | 0.9847 | 0.9622    | 0.9614 | 0.9738            | 0.9269            | 0.9431  | 0.9782         | 0.9749    | 0.9708           | 0.9949              | 0.8990              | 0.7116      | 0.8447          | 0.4615          | 0.9646            | 0.8296  | 0.7235 | 0.9966   | 0.9947             | 0.9853      | 0.9672       | 0.9929    | 0.9932    | 0.9919  | 0.3706 | 0.7690  | 0.6836  | 0.9838     | 0.9941      | 0.9789     | 0.9252      | 0.8876             | 0.9960 | 0.8849          | 0.9172        | 1.0                    | 0.9886              | 0.9847      | 0.9958       | 0.9925         | 0.9483 | 0.9700    | 0.9912              | 1.0    | 0.9944 | 0.9756   | 0.9468    | 0.99    | 0.9984 | 0.9947       | 0.9806      | 0.9939        | 1.0           | 0.9108     |
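
The overall columns of the table can be cross-checked with a few lines of Python. The sketch below copies the per-epoch validation loss, precision, recall, and F1 from the table, picks the best epoch by each criterion, and confirms that each reported F1 agrees with the harmonic mean of precision and recall up to rounding. The variable and function names are illustrative only.

```python
# (epoch, validation loss, overall precision, overall recall, overall F1), from the table above
ROWS = [
    (1, 0.1365, 0.8213, 0.8741, 0.8469),
    (2, 0.0841, 0.8875, 0.9213, 0.9041),
    (3, 0.0821, 0.9086, 0.9379, 0.9230),
    (4, 0.0848, 0.9095, 0.9420, 0.9255),
    (5, 0.0998, 0.9210, 0.9497, 0.9351),
]


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_f1 = max(ROWS, key=lambda row: row[4])

print("lowest validation loss: epoch", best_by_loss[0])  # epoch 3
print("highest overall F1:     epoch", best_by_f1[0])    # epoch 5

for epoch, _, precision, recall, reported_f1 in ROWS:
    # Inputs are rounded to 4 decimals, so allow a small tolerance.
    assert abs(f1_from(precision, recall) - reported_f1) < 1e-4, epoch
```

Validation loss is lowest at epoch 3 and rises afterwards, while overall F1 keeps improving through epoch 5, so the choice of checkpoint depends on which criterion matters more for the application.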


### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0
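
To compare a local environment against the versions listed above, something along these lines may help. It uses the standard-library `importlib.metadata`. The distribution names (`torch` for PyTorch, for example) are the usual PyPI names, and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
CARD_VERSIONS = {
    "transformers": "4.35.2",
    "torch": "2.1.0+cu121",
    "datasets": "2.16.1",
    "tokenizers": "0.15.0",
}


def compare_versions(expected=CARD_VERSIONS):
    """Return {package: (expected, installed or None, exact match?)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    status = "ok" if matches else "differs"
    print(f"{package}: card={wanted} installed={installed} ({status})")
```

An exact match is not necessarily required to load the model; the check only shows where an environment differs from the one used for training.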