---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-sep_tok
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: sep_tok
      split: train[80%:100%]
      args: sep_tok
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8943612186500589
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# longformer-sep_tok

This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3879
- Accuracy: 0.8944

| Label        | Precision          | Recall             | F1-score           | Support |
|:-------------|:-------------------|:-------------------|:-------------------|--------:|
| Claim        | 0.6307230422817113 | 0.6048464491362764 | 0.6175137783221066 |    4168 |
| Majorclaim   | 0.8988711194731891 | 0.8880111524163569 | 0.8934081346423562 |    2152 |
| O            | 0.9999116061168567 | 1.0                | 0.9999558011049724 |   11312 |
| Premise      | 0.8821419838617655 | 0.8964631823076286 | 0.8892449264645469 |   12073 |
| Macro avg    | 0.8529119379333807 | 0.8473301959650654 | 0.8500306601334956 |   29705 |
| Weighted avg | 0.892924576633343  | 0.8943612186500589 | 0.8935790524525439 |   29705 |
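As a sanity check, the macro and weighted averages can be recomputed from the per-class scores above; note that in token classification the support-weighted average recall equals the overall accuracy. A minimal sketch using the reported values:

```python
# Per-class scores copied from the evaluation results above.
classes = {
    "Claim":      {"recall": 0.6048464491362764, "f1": 0.6175137783221066, "support": 4168},
    "Majorclaim": {"recall": 0.8880111524163569, "f1": 0.8934081346423562, "support": 2152},
    "O":          {"recall": 1.0,                "f1": 0.9999558011049724, "support": 11312},
    "Premise":    {"recall": 0.8964631823076286, "f1": 0.8892449264645469, "support": 12073},
}

# Macro average: unweighted mean over the four classes.
macro_f1 = sum(c["f1"] for c in classes.values()) / len(classes)

# Weighted average: weighted by support (tokens per class).
total = sum(c["support"] for c in classes.values())
weighted_recall = sum(c["recall"] * c["support"] for c in classes.values()) / total

print(macro_f1)         # ≈ 0.8500, matching the reported macro-avg f1-score
print(weighted_recall)  # ≈ 0.8944, matching the reported accuracy
```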

## Model description

This model performs token-level classification of argumentative components in essays: each token is labeled as `Claim`, `Majorclaim`, `Premise`, or `O` (non-argumentative), the label set reported in the evaluation results above. It is based on [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096), whose sparse attention supports inputs of up to 4,096 tokens.

## Intended uses & limitations

More information needed
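The card ships no usage example; the sketch below shows one plausible way to run inference with the `transformers` token-classification pipeline. The model id `"longformer-sep_tok"` is a placeholder for wherever this checkpoint is hosted, and `merge_spans` is a hypothetical convenience helper (not part of the released code) for turning per-token tags into contiguous spans.

```python
def merge_spans(tags):
    """Collapse a per-token tag sequence into (label, start, end) spans,
    e.g. ["O", "O", "Claim"] -> [("O", 0, 2), ("Claim", 2, 3)]."""
    spans, start = [], 0
    for i in range(1, len(tags) + 1):
        if i == len(tags) or tags[i] != tags[start]:
            spans.append((tags[start], start, i))
            start = i
    return spans


def run_demo():
    # Not executed here: requires `transformers` and access to the checkpoint.
    from transformers import pipeline

    nlp = pipeline(
        "token-classification",
        model="longformer-sep_tok",  # placeholder model id
        aggregation_strategy="simple",
    )
    return nlp("School uniforms should be mandatory because they reduce bullying.")
```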

## Training and evaluation data

The model was fine-tuned on the `essays_su_g` dataset (config `sep_tok`). Per the model index above, evaluation used the `train[80%:100%]` split.

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 12
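With 41 optimization steps per epoch (see the training results table below) and 12 epochs, the linear scheduler decays the learning rate from 2e-05 to 0 over 492 steps. A minimal sketch of that schedule, assuming zero warmup steps (none are listed in the card):

```python
def linear_lr(step, base_lr=2e-05, total_steps=492):
    """Linearly decayed learning rate, assuming no warmup."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))    # 2e-05 at the start of training
print(linear_lr(246))  # 1e-05 halfway through (end of epoch 6)
print(linear_lr(492))  # 0.0 at the final step
```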

### Training results

| Training Loss | Epoch | Step | Validation Loss | Claim                                                                                                                | Majorclaim                                                                                                         | O                                                                                                                   | Premise                                                                                                             | Accuracy | Macro avg                                                                                                           | Weighted avg                                                                                                        |
|:-------------:|:-----:|:----:|:---------------:|:--------------------------------------------------------------------------------------------------------------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log        | 1.0   | 41   | 0.3346          | {'precision': 0.5123664980326026, 'recall': 0.43738003838771594, 'f1-score': 0.47191302096815946, 'support': 4168.0} | {'precision': 0.7102300538423887, 'recall': 0.6742565055762082, 'f1-score': 0.6917759237187128, 'support': 2152.0} | {'precision': 0.9996437160416852, 'recall': 0.9921322489391796, 'f1-score': 0.9958738187142286, 'support': 11312.0} | {'precision': 0.8501203696513163, 'recall': 0.9067340346227118, 'f1-score': 0.8775150300601202, 'support': 12073.0} | 0.8566   | {'precision': 0.7680901593919982, 'recall': 0.7526257068814539, 'f1-score': 0.7592694483653053, 'support': 29705.0} | {'precision': 0.8495348115917386, 'recall': 0.8565561353307524, 'f1-score': 0.8522201263911512, 'support': 29705.0} |
| No log        | 2.0   | 82   | 0.2693          | {'precision': 0.6444533120510774, 'recall': 0.3874760076775432, 'f1-score': 0.4839676356008391, 'support': 4168.0}   | {'precision': 0.7956173619890434, 'recall': 0.8773234200743495, 'f1-score': 0.834475138121547, 'support': 2152.0}  | {'precision': 0.9998227107525929, 'recall': 0.9970827439886846, 'f1-score': 0.9984508476076661, 'support': 11312.0} | {'precision': 0.8414913252122554, 'recall': 0.9440901184461195, 'f1-score': 0.8898430790850183, 'support': 12073.0} | 0.8813   | {'precision': 0.8203461775012423, 'recall': 0.8014930725466742, 'f1-score': 0.8016841751037678, 'support': 29705.0} | {'precision': 0.8708153253980879, 'recall': 0.8813331089042249, 'f1-score': 0.8702413426814749, 'support': 29705.0} |
| No log        | 3.0   | 123  | 0.2430          | {'precision': 0.6065792398310735, 'recall': 0.6547504798464492, 'f1-score': 0.6297450098073151, 'support': 4168.0}   | {'precision': 0.8615457562825984, 'recall': 0.8443308550185874, 'f1-score': 0.8528514433231636, 'support': 2152.0} | {'precision': 0.9999115435647944, 'recall': 0.9992927864214993, 'f1-score': 0.9996020692399523, 'support': 11312.0} | {'precision': 0.9014586160108549, 'recall': 0.8804770976559264, 'f1-score': 0.8908443327047978, 'support': 12073.0} | 0.8914   | {'precision': 0.8423737889223304, 'recall': 0.8447128047356155, 'f1-score': 0.8432607137688072, 'support': 29705.0} | {'precision': 0.8946836556485465, 'recall': 0.8914324187847164, 'f1-score': 0.8928724370609561, 'support': 29705.0} |
| No log        | 4.0   | 164  | 0.2497          | {'precision': 0.6372872745745491, 'recall': 0.6019673704414588, 'f1-score': 0.6191239975323873, 'support': 4168.0}   | {'precision': 0.8602197802197802, 'recall': 0.9093866171003717, 'f1-score': 0.8841201716738197, 'support': 2152.0} | {'precision': 0.9999116061168567, 'recall': 1.0, 'f1-score': 0.9999558011049724, 'support': 11312.0}                | {'precision': 0.8892446633825944, 'recall': 0.8971258179408598, 'f1-score': 0.8931678555230281, 'support': 12073.0} | 0.8958   | {'precision': 0.8466658310734451, 'recall': 0.8521199513706725, 'f1-score': 0.8490919564585518, 'support': 29705.0} | {'precision': 0.8939322416048354, 'recall': 0.8957751220333278, 'f1-score': 0.8947265097790277, 'support': 29705.0} |
| No log        | 5.0   | 205  | 0.2560          | {'precision': 0.6361584754262788, 'recall': 0.6086852207293666, 'f1-score': 0.6221186856302108, 'support': 4168.0}   | {'precision': 0.889348025711662, 'recall': 0.900092936802974, 'f1-score': 0.894688221709007, 'support': 2152.0}    | {'precision': 1.0, 'recall': 0.999557991513437, 'f1-score': 0.9997789469030461, 'support': 11312.0}                 | {'precision': 0.8856278613472858, 'recall': 0.8972914768491675, 'f1-score': 0.8914215182061304, 'support': 12073.0} | 0.8959   | {'precision': 0.8527835906213067, 'recall': 0.8514069064737362, 'f1-score': 0.8520018431120986, 'support': 29705.0} | {'precision': 0.8944477578506652, 'recall': 0.8959434438646693, 'f1-score': 0.895135201868183, 'support': 29705.0}  |
| No log        | 6.0   | 246  | 0.2836          | {'precision': 0.6055871212121212, 'recall': 0.6137236084452975, 'f1-score': 0.6096282173498571, 'support': 4168.0}   | {'precision': 0.9181011997913406, 'recall': 0.8178438661710037, 'f1-score': 0.8650774145981813, 'support': 2152.0} | {'precision': 0.9999115748518879, 'recall': 0.9996463932107497, 'f1-score': 0.9997789664471067, 'support': 11312.0} | {'precision': 0.8807833537331702, 'recall': 0.8940611281371655, 'f1-score': 0.8873725748109175, 'support': 12073.0} | 0.8894   | {'precision': 0.85109581239713, 'recall': 0.8313187489910541, 'f1-score': 0.8404642933015157, 'support': 29705.0}   | {'precision': 0.8902386153007307, 'recall': 0.8894125568086181, 'f1-score': 0.8895918454896943, 'support': 29705.0} |
| No log        | 7.0   | 287  | 0.3062          | {'precision': 0.5987467588591184, 'recall': 0.664827255278311, 'f1-score': 0.6300591177808095, 'support': 4168.0}    | {'precision': 0.9013605442176871, 'recall': 0.8619888475836431, 'f1-score': 0.8812351543942992, 'support': 2152.0} | {'precision': 1.0, 'recall': 1.0, 'f1-score': 1.0, 'support': 11312.0}                                              | {'precision': 0.8968992910224652, 'recall': 0.8697092686159198, 'f1-score': 0.8830950378469301, 'support': 12073.0} | 0.8900   | {'precision': 0.8492516485248176, 'recall': 0.8491313428694685, 'f1-score': 0.8485973275055096, 'support': 29705.0} | {'precision': 0.8946497061974581, 'recall': 0.8900185154014476, 'f1-score': 0.8919747802421455, 'support': 29705.0} |
| No log        | 8.0   | 328  | 0.3382          | {'precision': 0.6177335444469577, 'recall': 0.6552303262955854, 'f1-score': 0.6359296774944697, 'support': 4168.0}   | {'precision': 0.8736559139784946, 'recall': 0.9061338289962825, 'f1-score': 0.8895985401459854, 'support': 2152.0} | {'precision': 0.9999115983026874, 'recall': 0.9999115983026874, 'f1-score': 0.9999115983026874, 'support': 11312.0} | {'precision': 0.9003407155025553, 'recall': 0.8755073304066926, 'f1-score': 0.8877503884432872, 'support': 12073.0} | 0.8942   | {'precision': 0.8479104430576737, 'recall': 0.859195771000312, 'f1-score': 0.8532975510966074, 'support': 29705.0}  | {'precision': 0.8966717521763674, 'recall': 0.8941928968187174, 'f1-score': 0.8952627973023706, 'support': 29705.0} |
| No log        | 9.0   | 369  | 0.3559          | {'precision': 0.6090930396062808, 'recall': 0.6235604606525912, 'f1-score': 0.6162418494368701, 'support': 4168.0}   | {'precision': 0.9208074534161491, 'recall': 0.8266728624535316, 'f1-score': 0.8712047012732617, 'support': 2152.0} | {'precision': 0.9999115983026874, 'recall': 0.9999115983026874, 'f1-score': 0.9999115983026874, 'support': 11312.0} | {'precision': 0.883467278989667, 'recall': 0.8923217095999337, 'f1-score': 0.8878724193348992, 'support': 12073.0}  | 0.8908   | {'precision': 0.8533198425786961, 'recall': 0.835616657752186, 'f1-score': 0.8438076420869296, 'support': 29705.0}  | {'precision': 0.8920174343737682, 'recall': 0.8908264601918869, 'f1-score': 0.8912173797079, 'support': 29705.0}    |
| No log        | 10.0  | 410  | 0.3689          | {'precision': 0.6432050657574282, 'recall': 0.633637236084453, 'f1-score': 0.6383853033599227, 'support': 4168.0}    | {'precision': 0.8838137472283814, 'recall': 0.9261152416356877, 'f1-score': 0.9044701611073291, 'support': 2152.0} | {'precision': 0.9999116061168567, 'recall': 1.0, 'f1-score': 0.9999558011049724, 'support': 11312.0}                | {'precision': 0.8931094672097083, 'recall': 0.8900024848836247, 'f1-score': 0.8915532691669432, 'support': 12073.0} | 0.8985   | {'precision': 0.8550099715780937, 'recall': 0.8624387406509414, 'f1-score': 0.8585911336847918, 'support': 29705.0} | {'precision': 0.8980426387520326, 'recall': 0.8985356000673287, 'f1-score': 0.8982471762955423, 'support': 29705.0} |
| No log        | 11.0  | 451  | 0.3769          | {'precision': 0.6255558155862392, 'recall': 0.6413147792706334, 'f1-score': 0.6333372823125222, 'support': 4168.0}   | {'precision': 0.8893424036281179, 'recall': 0.9112453531598513, 'f1-score': 0.9001606610052789, 'support': 2152.0} | {'precision': 0.9999116061168567, 'recall': 1.0, 'f1-score': 0.9999558011049724, 'support': 11312.0}                | {'precision': 0.8938223938223938, 'recall': 0.8820508572848504, 'f1-score': 0.88789761120607, 'support': 12073.0}   | 0.8953   | {'precision': 0.8521580547884019, 'recall': 0.8586527474288339, 'f1-score': 0.8553378389072108, 'support': 29705.0} | {'precision': 0.896256500285568, 'recall': 0.8953038209055715, 'f1-score': 0.8957408994227329, 'support': 29705.0}  |
| No log        | 12.0  | 492  | 0.3879          | {'precision': 0.6307230422817113, 'recall': 0.6048464491362764, 'f1-score': 0.6175137783221066, 'support': 4168.0}   | {'precision': 0.8988711194731891, 'recall': 0.8880111524163569, 'f1-score': 0.8934081346423562, 'support': 2152.0} | {'precision': 0.9999116061168567, 'recall': 1.0, 'f1-score': 0.9999558011049724, 'support': 11312.0}                | {'precision': 0.8821419838617655, 'recall': 0.8964631823076286, 'f1-score': 0.8892449264645469, 'support': 12073.0} | 0.8944   | {'precision': 0.8529119379333807, 'recall': 0.8473301959650654, 'f1-score': 0.8500306601334956, 'support': 29705.0} | {'precision': 0.892924576633343, 'recall': 0.8943612186500589, 'f1-score': 0.8935790524525439, 'support': 29705.0}  |


### Framework versions

- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2