---
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-sep_tok
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: sep_tok
      split: train[80%:100%]
      args: sep_tok
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8854738259552264
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# longformer-sep_tok

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2621
- Accuracy: 0.8855

Per-class metrics, rounded to four decimal places:

| Class        | Precision | Recall | F1-score | Support |
|:-------------|:---------:|:------:|:--------:|:-------:|
| Claim        | 0.5981    | 0.5660 | 0.5816   | 4168    |
| Majorclaim   | 0.8416    | 0.8146 | 0.8279   | 2152    |
| O            | 0.9999    | 0.9998 | 0.9999   | 11312   |
| Premise      | 0.8798    | 0.9013 | 0.8904   | 12073   |
| Macro avg    | 0.8299    | 0.8204 | 0.8249   | 29705   |
| Weighted avg | 0.8833    | 0.8855 | 0.8842   | 29705   |
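
The macro and weighted averages can be reproduced from the per-class figures. The following is a minimal sketch, assuming the usual definitions (macro: unweighted mean over classes; weighted: mean weighted by each class's support). The helper names `macro_average` and `weighted_average` are illustrative and are not part of the training code.

```python
# Final-epoch per-class F1 scores and supports, copied from this card.
per_class = {
    "Claim": {"f1": 0.5816074950690335, "support": 4168},
    "Majorclaim": {"f1": 0.8278630460448643, "support": 2152},
    "O": {"f1": 0.9998673915926268, "support": 11312},
    "Premise": {"f1": 0.8904255319148937, "support": 12073},
}


def macro_average(values):
    """Unweighted mean: every class counts equally."""
    return sum(values) / len(values)


def weighted_average(values, weights):
    """Mean weighted by class support: frequent classes count more."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


f1_scores = [m["f1"] for m in per_class.values()]
supports = [m["support"] for m in per_class.values()]

macro_f1 = macro_average(f1_scores)
weighted_f1 = weighted_average(f1_scores, supports)

print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
```

The gap between the two averages comes mostly from the Claim class: it has the lowest F1 but only 4168 of the 29705 scored tokens, so it pulls the macro average down more than the weighted one.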

## Model description

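A token-classification model built on Longformer (base, 4096-token context). It assigns each token one of the four labels reported in the evaluation results: Claim, Majorclaim, Premise, or O. No further details are recorded in this card.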

## Intended uses & limitations

More information needed

## Training and evaluation data

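Evaluation used the `train[80%:100%]` split of the `sep_tok` configuration of essays_su_g; the reported metrics cover 29,705 scored tokens (the total support). The split used for training is not recorded in this card.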

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
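
To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would decay over the run. It assumes zero warmup steps (no warmup value is listed above) and takes the total of 205 optimizer steps from the final row of the training results. The function `linear_lr` is a hypothetical helper written for this illustration, not a call from the Transformers library.

```python
def linear_lr(step, base_lr=2e-5, total_steps=205, warmup_steps=0):
    """Learning rate after `step` optimizer updates with linear warmup and decay.

    Assumption: warmup_steps=0, since the card does not list a warmup setting.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch (41 steps per epoch).
for epoch in range(6):
    step = epoch * 41
    print(f"epoch {epoch}: step {step:3d}, lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 2e-05 and reaches zero at step 205, so the last epoch trains with a much smaller rate than the first.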

### Training results

| Training Loss | Epoch | Step | Validation Loss | Claim                                                                                                                | Majorclaim                                                                                                         | O                                                                                                                   | Premise                                                                                                             | Accuracy | Macro avg                                                                                                           | Weighted avg                                                                                                        |
|:-------------:|:-----:|:----:|:---------------:|:--------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log        | 1.0   | 41   | 0.3617          | {'precision': 0.4788679245283019, 'recall': 0.30446257197696736, 'f1-score': 0.37224992666471113, 'support': 4168.0} | {'precision': 0.6963803349540789, 'recall': 0.5989776951672863, 'f1-score': 0.6440169872595554, 'support': 2152.0} | {'precision': 0.9991923180472045, 'recall': 0.9842644978783592, 'f1-score': 0.9916722333556, 'support': 11312.0}    | {'precision': 0.8126733518241945, 'recall': 0.9464921726165825, 'f1-score': 0.8744929976276117, 'support': 12073.0} | 0.8456   | {'precision': 0.7467784823384449, 'recall': 0.7085492344097988, 'f1-score': 0.7206080362268695, 'support': 29705.0} | {'precision': 0.8284396858636128, 'recall': 0.8456152162935533, 'f1-score': 0.8319479048980907, 'support': 29705.0} |
| No log        | 2.0   | 82   | 0.2796          | {'precision': 0.5955649419218585, 'recall': 0.4059500959692898, 'f1-score': 0.4828078185190469, 'support': 4168.0}   | {'precision': 0.760759493670886, 'recall': 0.8378252788104089, 'f1-score': 0.7974347633790357, 'support': 2152.0}  | {'precision': 0.9999115357395613, 'recall': 0.9992043847241867, 'f1-score': 0.999557835160948, 'support': 11312.0}  | {'precision': 0.8505686125852919, 'recall': 0.9292636461525718, 'f1-score': 0.8881763844357361, 'support': 12073.0} | 0.8758   | {'precision': 0.8017011459793993, 'recall': 0.7930608514141144, 'f1-score': 0.7919942003736917, 'support': 29705.0} | {'precision': 0.8651534509455714, 'recall': 0.8758458172024912, 'f1-score': 0.8671393475513334, 'support': 29705.0} |
| No log        | 3.0   | 123  | 0.2584          | {'precision': 0.6091815161582603, 'recall': 0.48392514395393477, 'f1-score': 0.5393769220484022, 'support': 4168.0}  | {'precision': 0.7808161548169962, 'recall': 0.862453531598513, 'f1-score': 0.8196069772576728, 'support': 2152.0}  | {'precision': 0.9999115748518879, 'recall': 0.9996463932107497, 'f1-score': 0.9997789664471067, 'support': 11312.0} | {'precision': 0.8697670758577274, 'recall': 0.9155139567630249, 'f1-score': 0.892054396513458, 'support': 12073.0}  | 0.8832   | {'precision': 0.8149190804212179, 'recall': 0.8153847563815556, 'f1-score': 0.8127043155666599, 'support': 29705.0} | {'precision': 0.8763198978646256, 'recall': 0.8831509846827134, 'f1-score': 0.8783433638684701, 'support': 29705.0} |
| No log        | 4.0   | 164  | 0.2543          | {'precision': 0.5829736211031175, 'recall': 0.583253358925144, 'f1-score': 0.58311345646438, 'support': 4168.0}      | {'precision': 0.8634197988353626, 'recall': 0.7578996282527881, 'f1-score': 0.8072259341747091, 'support': 2152.0} | {'precision': 0.9999115904871364, 'recall': 0.9998231966053748, 'f1-score': 0.9998673915926268, 'support': 11312.0} | {'precision': 0.8795297932711795, 'recall': 0.89861674811563, 'f1-score': 0.8889708292363159, 'support': 12073.0}   | 0.8827   | {'precision': 0.831458700924199, 'recall': 0.8098982329747342, 'f1-score': 0.8197944028670079, 'support': 29705.0}  | {'precision': 0.8825947337352275, 'recall': 0.8827133479212254, 'f1-score': 0.8823636375005335, 'support': 29705.0} |
| No log        | 5.0   | 205  | 0.2621          | {'precision': 0.5981237322515213, 'recall': 0.565978886756238, 'f1-score': 0.5816074950690335, 'support': 4168.0}    | {'precision': 0.8415746519443111, 'recall': 0.8145910780669146, 'f1-score': 0.8278630460448643, 'support': 2152.0} | {'precision': 0.9999115904871364, 'recall': 0.9998231966053748, 'f1-score': 0.9998673915926268, 'support': 11312.0} | {'precision': 0.8798415137058301, 'recall': 0.9012672906485546, 'f1-score': 0.8904255319148937, 'support': 12073.0} | 0.8855   | {'precision': 0.8298628720971998, 'recall': 0.8204151130192705, 'f1-score': 0.8249408661553546, 'support': 29705.0} | {'precision': 0.8832645976626653, 'recall': 0.8854738259552264, 'f1-score': 0.8842386364262106, 'support': 29705.0} |
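
The Step column also gives a rough bound on the training set size, which the card does not state. The sketch below is an estimate only: it assumes a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped, none of which is confirmed by the card.

```python
import math

train_batch_size = 8   # from the hyperparameters
steps_per_epoch = 41   # Step value at epoch 1.0

# Every example count n for which ceil(n / batch_size) equals the observed steps.
candidates = [
    n
    for n in range(1, 1000)
    if math.ceil(n / train_batch_size) == steps_per_epoch
]

smallest, largest = candidates[0], candidates[-1]
print(f"training examples per epoch: between {smallest} and {largest}")
```

With 205 steps in total and roughly this many examples per epoch, no training loss was logged at any evaluation point, which is why the Training Loss column reads "No log".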


### Framework versions

- Transformers 4.37.2
- PyTorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2