---
license: apache-2.0
base_model: allenai/longformer-base-4096
tags:
- generated_from_trainer
datasets:
- essays_su_g
metrics:
- accuracy
model-index:
- name: longformer-sep_tok
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: essays_su_g
      type: essays_su_g
      config: sep_tok
      split: train[80%:100%]
      args: sep_tok
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.8955394714694496
---
# longformer-sep_tok
This model is a fine-tuned version of [allenai/longformer-base-4096](https://huggingface.co/allenai/longformer-base-4096) on the essays_su_g dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3410
- Accuracy: 0.8955

Per-class metrics, rounded to four decimal places:

| Class        | Precision | Recall | F1-score | Support |
|:------------:|:---------:|:------:|:--------:|:-------:|
| Claim        | 0.6256    | 0.6404 | 0.6329   | 4168    |
| Majorclaim   | 0.9062    | 0.8666 | 0.8860   | 2152    |
| O            | 0.9999    | 0.9998 | 0.9999   | 11312   |
| Premise      | 0.8913    | 0.8911 | 0.8912   | 12073   |
| Macro avg    | 0.8558    | 0.8495 | 0.8525   | 29705   |
| Weighted avg | 0.8965    | 0.8955 | 0.8960   | 29705   |

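
The macro and weighted averages reported above can be recomputed from the four per-class entries. The following is a minimal sketch in plain Python, not the evaluation code used for this model; the helper names `macro_average`, `weighted_average`, and `f1` are introduced here only for illustration. It also shows that the support-weighted recall equals the reported accuracy.

```python
import math

# Per-class evaluation metrics copied from the final results above.
per_class = {
    "Claim": {"precision": 0.6256446319737459, "recall": 0.6403550863723608,
              "f1-score": 0.6329143941190419, "support": 4168},
    "Majorclaim": {"precision": 0.9062196307094267, "recall": 0.866635687732342,
                   "f1-score": 0.8859857482185273, "support": 2152},
    "O": {"precision": 0.9999115904871364, "recall": 0.9998231966053748,
          "f1-score": 0.9998673915926268, "support": 11312},
    "Premise": {"precision": 0.8913007456503729, "recall": 0.8910792677876253,
                "f1-score": 0.8911899929586214, "support": 12073},
}


def macro_average(metrics, key):
    """Unweighted mean of one metric over all classes."""
    return sum(m[key] for m in metrics.values()) / len(metrics)


def weighted_average(metrics, key):
    """Mean of one metric, weighted by each class's support."""
    total = sum(m["support"] for m in metrics.values())
    return sum(m[key] * m["support"] for m in metrics.values()) / total


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


print("macro F1:       ", macro_average(per_class, "f1-score"))
print("weighted F1:    ", weighted_average(per_class, "f1-score"))
print("weighted recall:", weighted_average(per_class, "recall"))
print("Claim F1:       ", f1(per_class["Claim"]["precision"],
                             per_class["Claim"]["recall"]))
```

Because Claim has the lowest scores but only 4168 of the 29705 supported items, the weighted averages sit noticeably above the macro averages.
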
## Model description
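A token-classification model fine-tuned from allenai/longformer-base-4096. It assigns each token one of four labels: Claim, Majorclaim, Premise, or O.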
## Intended uses & limitations
More information needed
## Training and evaluation data
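The model was fine-tuned on the `sep_tok` configuration of the essays_su_g dataset. The evaluation metrics were computed on the `train[80%:100%]` split, with a total support of 29705. The split used for training is not recorded in this card.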
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 11
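
To make the `linear` scheduler setting concrete, here is a rough sketch of how the learning rate would decay over the run. It assumes zero warmup steps (the card lists no warmup setting) and takes the total of 451 optimizer steps from the training results table; `linear_lr` is a hypothetical helper written for this illustration, mirroring the shape of the `get_linear_schedule_with_warmup` schedule in Transformers rather than calling it.

```python
BASE_LR = 2e-05      # learning_rate from the list above
TOTAL_STEPS = 451    # last step in the training results table (41 steps x 11 epochs)
WARMUP_STEPS = 0     # ASSUMPTION: no warmup, since the card does not list one


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch (every 41 steps).
for epoch in range(0, 12):
    step = epoch * 41
    print(f"epoch {epoch:2d}  step {step:3d}  lr {linear_lr(step):.3e}")
```

Under these assumptions the rate falls by one eleventh of its starting value per epoch and reaches zero at step 451.
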
### Training results
| Training Loss | Epoch | Step | Validation Loss | Claim | Majorclaim | O | Premise | Accuracy | Macro avg | Weighted avg |
|:-------------:|:-----:|:----:|:---------------:|:------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|:--------:|:-------------------------------------------------------------------------------------------------------------------:|:-------------------------------------------------------------------------------------------------------------------:|
| No log | 1.0 | 41 | 0.4047 | {'precision': 0.4120455380095483, 'recall': 0.26919385796545103, 'f1-score': 0.325642141924249, 'support': 4168.0} | {'precision': 0.7177489177489178, 'recall': 0.38522304832713755, 'f1-score': 0.5013607499244028, 'support': 2152.0} | {'precision': 0.9997273223050355, 'recall': 0.9723302687411598, 'f1-score': 0.9858384870484898, 'support': 11312.0} | {'precision': 0.7869139966273188, 'recall': 0.9662884121593639, 'f1-score': 0.8674250873670905, 'support': 12073.0} | 0.8287 | {'precision': 0.7291089436727051, 'recall': 0.6482588967982781, 'f1-score': 0.670066616566058, 'support': 29705.0} | {'precision': 0.8103460570481619, 'recall': 0.8286820400605959, 'f1-score': 0.8099792232503951, 'support': 29705.0} |
| No log | 2.0 | 82 | 0.2642 | {'precision': 0.636955636955637, 'recall': 0.3754798464491363, 'f1-score': 0.47245283018867923, 'support': 4168.0} | {'precision': 0.7393258426966293, 'recall': 0.9172862453531598, 'f1-score': 0.8187474077146413, 'support': 2152.0} | {'precision': 0.9999115200849407, 'recall': 0.9990275813295615, 'f1-score': 0.9994693552666489, 'support': 11312.0} | {'precision': 0.8513859596263935, 'recall': 0.9362213203014992, 'f1-score': 0.8917906031796126, 'support': 12073.0} | 0.8801 | {'precision': 0.8068947398409001, 'recall': 0.8070037483583392, 'f1-score': 0.7956150490873954, 'support': 29705.0} | {'precision': 0.8697405189053876, 'recall': 0.8800875273522976, 'f1-score': 0.8686656494392229, 'support': 29705.0} |
| No log | 3.0 | 123 | 0.2391 | {'precision': 0.6173835125448028, 'recall': 0.6612284069097889, 'f1-score': 0.6385542168674699, 'support': 4168.0} | {'precision': 0.8279618701770313, 'recall': 0.8475836431226765, 'f1-score': 0.8376578645235362, 'support': 2152.0} | {'precision': 0.9999115122555526, 'recall': 0.998939179632249, 'f1-score': 0.9994251094503163, 'support': 11312.0} | {'precision': 0.9087501065008095, 'recall': 0.8834589580054667, 'f1-score': 0.8959260814783705, 'support': 12073.0} | 0.8937 | {'precision': 0.8385017503695491, 'recall': 0.8478025469175453, 'f1-score': 0.8428908180799233, 'support': 29705.0} | {'precision': 0.8967300955168085, 'recall': 0.8936542669584245, 'f1-score': 0.8950057606513586, 'support': 29705.0} |
| No log | 4.0 | 164 | 0.2481 | {'precision': 0.6269047619047619, 'recall': 0.6317178502879078, 'f1-score': 0.6293021032504779, 'support': 4168.0} | {'precision': 0.8501953973078593, 'recall': 0.9098513011152416, 'f1-score': 0.8790123456790123, 'support': 2152.0} | {'precision': 0.9999116061168567, 'recall': 1.0, 'f1-score': 0.9999558011049724, 'support': 11312.0} | {'precision': 0.8994028093195391, 'recall': 0.8856953532676219, 'f1-score': 0.8924964527168016, 'support': 12073.0} | 0.8953 | {'precision': 0.8441036436622543, 'recall': 0.8568161261676929, 'f1-score': 0.8501916756878161, 'support': 29705.0} | {'precision': 0.8958777898648119, 'recall': 0.8953374852718398, 'f1-score': 0.8955117128429093, 'support': 29705.0} |
| No log | 5.0 | 205 | 0.2598 | {'precision': 0.6341782074732166, 'recall': 0.5822936660268714, 'f1-score': 0.6071294559099436, 'support': 4168.0} | {'precision': 0.8946587537091988, 'recall': 0.8406133828996283, 'f1-score': 0.8667944417824629, 'support': 2152.0} | {'precision': 1.0, 'recall': 0.9992927864214993, 'f1-score': 0.9996462681287585, 'support': 11312.0} | {'precision': 0.8773103887826641, 'recall': 0.9121179491427152, 'f1-score': 0.8943756345177665, 'support': 12073.0} | 0.8939 | {'precision': 0.85153683749127, 'recall': 0.8335794461226784, 'f1-score': 0.8419864500847328, 'support': 29705.0} | {'precision': 0.8911741703586489, 'recall': 0.8938562531560343, 'f1-score': 0.8921613476368966, 'support': 29705.0} |
| No log | 6.0 | 246 | 0.2705 | {'precision': 0.5968630775752437, 'recall': 0.6756238003838771, 'f1-score': 0.6338059869457574, 'support': 4168.0} | {'precision': 0.897196261682243, 'recall': 0.8475836431226765, 'f1-score': 0.87168458781362, 'support': 2152.0} | {'precision': 1.0, 'recall': 0.9994695898161244, 'f1-score': 0.9997347245556637, 'support': 11312.0} | {'precision': 0.9010130494505495, 'recall': 0.8692951213451503, 'f1-score': 0.8848699464609417, 'support': 12073.0} | 0.8901 | {'precision': 0.8487680971770091, 'recall': 0.8479930386669572, 'f1-score': 0.8475238114439957, 'support': 29705.0} | {'precision': 0.895755671048318, 'recall': 0.8901195085002525, 'f1-score': 0.892428973383654, 'support': 29705.0} |
| No log | 7.0 | 287 | 0.3331 | {'precision': 0.6148930258405112, 'recall': 0.5309500959692899, 'f1-score': 0.5698467876915154, 'support': 4168.0} | {'precision': 0.9280045351473923, 'recall': 0.7606877323420075, 'f1-score': 0.8360572012257406, 'support': 2152.0} | {'precision': 0.9999115904871364, 'recall': 0.9998231966053748, 'f1-score': 0.9998673915926268, 'support': 11312.0} | {'precision': 0.8592586908142122, 'recall': 0.9274413981611861, 'f1-score': 0.8920490758444869, 'support': 12073.0} | 0.8873 | {'precision': 0.8505169605723131, 'recall': 0.8047256057694646, 'f1-score': 0.8244551140885924, 'support': 29705.0} | {'precision': 0.8835135491375496, 'recall': 0.8872917017337149, 'f1-score': 0.8838419435954323, 'support': 29705.0} |
| No log | 8.0 | 328 | 0.3164 | {'precision': 0.601327525409666, 'recall': 0.6955374280230326, 'f1-score': 0.6450105684725775, 'support': 4168.0} | {'precision': 0.8873751135331517, 'recall': 0.9079925650557621, 'f1-score': 0.8975654570509877, 'support': 2152.0} | {'precision': 1.0, 'recall': 0.999557991513437, 'f1-score': 0.9997789469030461, 'support': 11312.0} | {'precision': 0.9086593406593406, 'recall': 0.8561252381346807, 'f1-score': 0.881610371886728, 'support': 12073.0} | 0.8920 | {'precision': 0.8493404949005396, 'recall': 0.8648033056817281, 'f1-score': 0.8559913360783348, 'support': 29705.0} | {'precision': 0.8987782726817388, 'recall': 0.8919710486450093, 'f1-score': 0.894568132641749, 'support': 29705.0} |
| No log | 9.0 | 369 | 0.3299 | {'precision': 0.6149536737772032, 'recall': 0.6847408829174664, 'f1-score': 0.6479736632988989, 'support': 4168.0} | {'precision': 0.9139676113360324, 'recall': 0.8392193308550185, 'f1-score': 0.8750000000000001, 'support': 2152.0} | {'precision': 0.9999115748518879, 'recall': 0.9996463932107497, 'f1-score': 0.9997789664471067, 'support': 11312.0} | {'precision': 0.9021988284234654, 'recall': 0.8802286092934648, 'f1-score': 0.8910783162837498, 'support': 12073.0} | 0.8953 | {'precision': 0.8577579220971472, 'recall': 0.8509588040691748, 'f1-score': 0.8534577365074388, 'support': 29705.0} | {'precision': 0.8999572934583262, 'recall': 0.8953038209055715, 'f1-score': 0.8971971859812554, 'support': 29705.0} |
| No log | 10.0 | 410 | 0.3392 | {'precision': 0.6190580985915493, 'recall': 0.6749040307101728, 'f1-score': 0.6457759412304866, 'support': 4168.0} | {'precision': 0.889295516925892, 'recall': 0.9033457249070632, 'f1-score': 0.8962655601659751, 'support': 2152.0} | {'precision': 0.9999115826702034, 'recall': 0.9997347949080623, 'f1-score': 0.9998231809742728, 'support': 11312.0} | {'precision': 0.9038148306900986, 'recall': 0.8732709351445374, 'f1-score': 0.8882803943044907, 'support': 12073.0} | 0.8958 | {'precision': 0.8530200072194358, 'recall': 0.8628138714174589, 'f1-score': 0.8575362691688063, 'support': 29705.0} | {'precision': 0.8994026049971722, 'recall': 0.8957751220333278, 'f1-score': 0.8973090938274679, 'support': 29705.0} |
| No log | 11.0 | 451 | 0.3410 | {'precision': 0.6256446319737459, 'recall': 0.6403550863723608, 'f1-score': 0.6329143941190419, 'support': 4168.0} | {'precision': 0.9062196307094267, 'recall': 0.866635687732342, 'f1-score': 0.8859857482185273, 'support': 2152.0} | {'precision': 0.9999115904871364, 'recall': 0.9998231966053748, 'f1-score': 0.9998673915926268, 'support': 11312.0} | {'precision': 0.8913007456503729, 'recall': 0.8910792677876253, 'f1-score': 0.8911899929586214, 'support': 12073.0} | 0.8955 | {'precision': 0.8557691497051705, 'recall': 0.8494733096244258, 'f1-score': 0.8524893817222043, 'support': 29705.0} | {'precision': 0.8964667660387375, 'recall': 0.8955394714694496, 'f1-score': 0.8959591059935927, 'support': 29705.0} |
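
The table is easier to compare across epochs when reduced to a few columns. The sketch below copies the validation loss and accuracy from each row and the macro-average F1 rounded to four decimals, then finds the best epoch under each criterion. The `history` list and the variable names are introduced here for illustration only.

```python
# (epoch, validation loss, accuracy, macro-average F1 rounded to 4 decimals),
# copied from the training results table above.
history = [
    (1, 0.4047, 0.8287, 0.6701),
    (2, 0.2642, 0.8801, 0.7956),
    (3, 0.2391, 0.8937, 0.8429),
    (4, 0.2481, 0.8953, 0.8502),
    (5, 0.2598, 0.8939, 0.8420),
    (6, 0.2705, 0.8901, 0.8475),
    (7, 0.3331, 0.8873, 0.8245),
    (8, 0.3164, 0.8920, 0.8560),
    (9, 0.3299, 0.8953, 0.8535),
    (10, 0.3392, 0.8958, 0.8575),
    (11, 0.3410, 0.8955, 0.8525),
]

best_by_loss = min(history, key=lambda row: row[1])      # lowest validation loss
best_by_accuracy = max(history, key=lambda row: row[2])  # highest accuracy
best_by_macro_f1 = max(history, key=lambda row: row[3])  # highest macro F1

print("lowest validation loss: epoch", best_by_loss[0], "loss", best_by_loss[1])
print("highest accuracy:       epoch", best_by_accuracy[0], "acc", best_by_accuracy[2])
print("highest macro F1:       epoch", best_by_macro_f1[0], "F1", best_by_macro_f1[3])
```

Validation loss is lowest at epoch 3 and rises afterwards, while accuracy and macro F1 peak at epoch 10 and change little from epoch 4 onward. The results reported at the top of this card are those of epoch 11.
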
### Framework versions
- Transformers 4.37.2
- Pytorch 2.2.0+cu121
- Datasets 2.17.0
- Tokenizers 0.15.2