metadata
license: mit
base_model: microsoft/deberta-v3-base
tags:
- generated_from_trainer
model-index:
- name: legal_deberta
  results: []
legal_deberta
This model is a fine-tuned version of microsoft/deberta-v3-base on an unknown dataset. It achieves the following results on the evaluation set; these are the final-epoch figures (epoch 10, step 450) from the training results table:
- Loss: 0.2674
- Law Precision: 0.6932
- Law Recall: 0.8133
- Law F1: 0.7485
- Law Number: 75
- Violated by Precision: 0.8684
- Violated by Recall: 0.88
- Violated by F1: 0.8742
- Violated by Number: 75
- Violated on Precision: 0.5882
- Violated on Recall: 0.6667
- Violated on F1: 0.625
- Violated on Number: 75
- Violation Precision: 0.5287
- Violation Recall: 0.6429
- Violation F1: 0.5802
- Violation Number: 616
- Overall Precision: 0.5741
- Overall Recall: 0.6813
- Overall F1: 0.6232
- Overall Accuracy: 0.9461
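The per-entity F1 scores above follow from the reported precision and recall, and the overall figures look consistent with micro-averaging over the four entity types. The sketch below checks this in plain Python. The true-positive and predicted-entity counts are reconstructed from the rounded values in the list, so they are an inference from the reported numbers, not something the card states.

```python
# Reported final-epoch metrics: (precision, recall, support) per entity type.
REPORTED = {
    "Law": (0.6932, 0.8133, 75),
    "Violated by": (0.8684, 0.88, 75),
    "Violated on": (0.5882, 0.6667, 75),
    "Violation": (0.5287, 0.6429, 616),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def reconstruct_counts(precision, recall, support):
    """Estimate (true positives, predicted entities) from rounded metrics.

    Assumption: recall = TP / support and precision = TP / predicted,
    which is the usual entity-level definition.
    """
    true_pos = round(recall * support)
    predicted = round(true_pos / precision)
    return true_pos, predicted


def micro_average(reported):
    """Pool counts across entity types, then compute precision, recall, F1."""
    total_tp = total_pred = total_support = 0
    for precision, recall, support in reported.values():
        true_pos, predicted = reconstruct_counts(precision, recall, support)
        total_tp += true_pos
        total_pred += predicted
        total_support += support
    micro_p = total_tp / total_pred
    micro_r = total_tp / total_support
    return micro_p, micro_r, f1(micro_p, micro_r)


if __name__ == "__main__":
    for name, (p, r, n) in REPORTED.items():
        print(f"{name}: F1 = {f1(p, r):.4f} (support {n})")
    micro_p, micro_r, micro_f1 = micro_average(REPORTED)
    print(f"Micro-averaged: P = {micro_p:.4f}, R = {micro_r:.4f}, F1 = {micro_f1:.4f}")
```

Because Violation accounts for 616 of the 841 evaluation entities, the overall scores sit close to the Violation scores even though Law and Violated by score considerably higher.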
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
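To see how the warmup setting interacts with the length of this run, the following minimal sketch evaluates a linear schedule at the end of each epoch. It assumes the usual shape of a linear scheduler (linear warmup from zero, then linear decay to zero) and 45 optimizer steps per epoch, a value read off the Step column of the training results table. The function name `linear_schedule_lr` is illustrative and not part of any library.

```python
# Hyperparameters copied from the list above.
HYPERPARAMS = {
    "learning_rate": 5e-05,
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "lr_scheduler_warmup_steps": 500,
    "num_epochs": 10,
}

# Assumption: 45 optimizer steps per epoch (450 steps over 10 epochs),
# taken from the Step column of the training results table.
STEPS_PER_EPOCH = 45


def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    total_steps = STEPS_PER_EPOCH * HYPERPARAMS["num_epochs"]
    for epoch in range(1, HYPERPARAMS["num_epochs"] + 1):
        step = epoch * STEPS_PER_EPOCH
        lr = linear_schedule_lr(
            step,
            HYPERPARAMS["learning_rate"],
            HYPERPARAMS["lr_scheduler_warmup_steps"],
            total_steps,
        )
        print(f"epoch {epoch:2d}  step {step:3d}  lr {lr:.2e}")
```

Under these assumptions the 500 warmup steps exceed the 450 total training steps, so the learning rate is still ramping up when training stops and peaks near 4.5e-05, just short of the nominal 5e-05.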
Training results
Training Loss | Epoch | Step | Validation Loss | Law Precision | Law Recall | Law F1 | Law Number | Violated by Precision | Violated by Recall | Violated by F1 | Violated by Number | Violated on Precision | Violated on Recall | Violated on F1 | Violated on Number | Violation Precision | Violation Recall | Violation F1 | Violation Number | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1.9748 | 1.0 | 45 | 1.1555 | 0.0 | 0.0 | 0.0 | 75 | 0.0 | 0.0 | 0.0 | 75 | 0.0 | 0.0 | 0.0 | 75 | 0.0 | 0.0 | 0.0 | 616 | 0.0 | 0.0 | 0.0 | 0.7437 |
0.4536 | 2.0 | 90 | 0.3670 | 0.0 | 0.0 | 0.0 | 75 | 0.0 | 0.0 | 0.0 | 75 | 0.0 | 0.0 | 0.0 | 75 | 0.1704 | 0.2955 | 0.2162 | 616 | 0.1704 | 0.2164 | 0.1907 | 0.8901 |
0.2704 | 3.0 | 135 | 0.2199 | 0.7059 | 0.64 | 0.6713 | 75 | 0.3095 | 0.1733 | 0.2222 | 75 | 0.0909 | 0.0133 | 0.0233 | 75 | 0.3291 | 0.5097 | 0.4000 | 616 | 0.3498 | 0.4471 | 0.3925 | 0.9277 |
0.1475 | 4.0 | 180 | 0.1959 | 0.6263 | 0.8267 | 0.7126 | 75 | 0.9153 | 0.72 | 0.8060 | 75 | 0.3182 | 0.3733 | 0.3436 | 75 | 0.4641 | 0.5974 | 0.5224 | 616 | 0.4928 | 0.6088 | 0.5447 | 0.9407 |
0.0879 | 5.0 | 225 | 0.2038 | 0.5909 | 0.8667 | 0.7027 | 75 | 0.7590 | 0.84 | 0.7975 | 75 | 0.3982 | 0.6 | 0.4787 | 75 | 0.4692 | 0.6055 | 0.5287 | 616 | 0.4959 | 0.6492 | 0.5623 | 0.9434 |
0.0499 | 6.0 | 270 | 0.2466 | 0.5913 | 0.9067 | 0.7158 | 75 | 0.7674 | 0.88 | 0.8199 | 75 | 0.4412 | 0.6 | 0.5085 | 75 | 0.4832 | 0.6071 | 0.5381 | 616 | 0.5135 | 0.6576 | 0.5766 | 0.9425 |
0.0291 | 7.0 | 315 | 0.2980 | 0.5755 | 0.8133 | 0.6740 | 75 | 0.7976 | 0.8933 | 0.8428 | 75 | 0.3802 | 0.6133 | 0.4694 | 75 | 0.4929 | 0.5617 | 0.5250 | 616 | 0.5133 | 0.6183 | 0.5609 | 0.9389 |
0.0341 | 8.0 | 360 | 0.2660 | 0.5739 | 0.88 | 0.6947 | 75 | 0.8193 | 0.9067 | 0.8608 | 75 | 0.48 | 0.64 | 0.5486 | 75 | 0.4800 | 0.6445 | 0.5502 | 616 | 0.5147 | 0.6885 | 0.5890 | 0.9366 |
0.0228 | 9.0 | 405 | 0.3186 | 0.3505 | 0.9067 | 0.5056 | 75 | 0.6126 | 0.9067 | 0.7312 | 75 | 0.3216 | 0.7333 | 0.4472 | 75 | 0.4365 | 0.5519 | 0.4875 | 616 | 0.4231 | 0.6314 | 0.5067 | 0.9301 |
0.0173 | 10.0 | 450 | 0.2674 | 0.6932 | 0.8133 | 0.7485 | 75 | 0.8684 | 0.88 | 0.8742 | 75 | 0.5882 | 0.6667 | 0.625 | 75 | 0.5287 | 0.6429 | 0.5802 | 616 | 0.5741 | 0.6813 | 0.6232 | 0.9461 |
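The table shows validation loss and overall F1 moving in different directions in later epochs, so the choice of checkpoint depends on the selection criterion. The sketch below compares the two criteria using a subset of columns copied from the table. The card does not say which criterion, if any, was used to select the published weights; the `EpochResult` type and helper names are illustrative.

```python
from typing import List, NamedTuple


class EpochResult(NamedTuple):
    epoch: int
    train_loss: float
    val_loss: float
    overall_f1: float
    overall_accuracy: float


# Values copied from the training results table (per-entity columns omitted).
HISTORY: List[EpochResult] = [
    EpochResult(1, 1.9748, 1.1555, 0.0, 0.7437),
    EpochResult(2, 0.4536, 0.3670, 0.1907, 0.8901),
    EpochResult(3, 0.2704, 0.2199, 0.3925, 0.9277),
    EpochResult(4, 0.1475, 0.1959, 0.5447, 0.9407),
    EpochResult(5, 0.0879, 0.2038, 0.5623, 0.9434),
    EpochResult(6, 0.0499, 0.2466, 0.5766, 0.9425),
    EpochResult(7, 0.0291, 0.2980, 0.5609, 0.9389),
    EpochResult(8, 0.0341, 0.2660, 0.5890, 0.9366),
    EpochResult(9, 0.0228, 0.3186, 0.5067, 0.9301),
    EpochResult(10, 0.0173, 0.2674, 0.6232, 0.9461),
]


def best_by_val_loss(history: List[EpochResult]) -> EpochResult:
    """Return the epoch with the lowest validation loss."""
    return min(history, key=lambda r: r.val_loss)


def best_by_overall_f1(history: List[EpochResult]) -> EpochResult:
    """Return the epoch with the highest overall F1."""
    return max(history, key=lambda r: r.overall_f1)


if __name__ == "__main__":
    by_loss = best_by_val_loss(HISTORY)
    by_f1 = best_by_overall_f1(HISTORY)
    print(f"Lowest validation loss: epoch {by_loss.epoch} ({by_loss.val_loss})")
    print(f"Highest overall F1:     epoch {by_f1.epoch} ({by_f1.overall_f1})")
```

Validation loss is lowest at epoch 4 (0.1959) while overall F1 is highest at epoch 10 (0.6232). Training loss keeps falling throughout, which suggests the model grows more confident on the training set even as entity-level scores on the evaluation set continue to fluctuate.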
Framework versions
- Transformers 4.31.0
- PyTorch 2.4.0+cu121
- Datasets 2.15.0
- Tokenizers 0.13.3