---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- twitter_pos_vcb
metrics:
- accuracy
- poseval
- f1
- recall
- precision
model-index:
- name: bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2
results:
- task:
name: Token Classification
type: token-classification
dataset:
name: twitter_pos_vcb
type: twitter_pos_vcb
config: twitter-pos-vcb
split: train
args: twitter-pos-vcb
metrics:
- name: Accuracy
type: accuracy
value: 0.9853480683735223
language:
- en
pipeline_tag: token-classification
---
# bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2
This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the twitter_pos_vcb dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0502
| Tag | Precision | Recall | F1-Score | Support |
|:-----:|:-----:|:-----:|:-----:|:-----:|
| $ | 0.0 | 0.0 | 0.0 | 3 |
| '' | 0.9312320916905444 | 0.9530791788856305 | 0.9420289855072465 | 341 |
| ( | 0.9791666666666666 | 0.9591836734693877 | 0.9690721649484536 | 196 |
| ) | 0.960167714884696 | 0.9703389830508474 | 0.9652265542676501 | 472 |
| , | 0.9988979501873485 | 0.9993384785005512 | 0.9991181657848325 | 4535 |
| . | 0.9839189708141322 | 0.9894762249577601 | 0.9866897730281368 | 20715 |
| : | 0.9926405887528997 | 0.9971072719967858 | 0.9948689168604183 | 12445 |
| Cc | 0.9991067440821796 | 0.9986607142857142 | 0.9988836793927215 | 4480 |
| Cd | 0.9903884661593912 | 0.9899919935948759 | 0.9901901901901902 | 2498 |
| Dt | 0.9981148589510537 | 0.9976446837146703 | 0.9978797159492478 | 14860 |
| Ex | 0.9142857142857143 | 0.9846153846153847 | 0.9481481481481482 | 65 |
| Fw | 1.0 | 0.1 | 0.18181818181818182 | 10 |
| Ht | 0.999877541023757 | 0.9997551120362435 | 0.9998163227820978 | 8167 |
| In | 0.9960399353003514 | 0.9954846981437092 | 0.9957622393219583 | 17939 |
| Jj | 0.9812470698546648 | 0.9834756049808129 | 0.9823600735322877 | 12769 |
| Jjr | 0.9304511278195489 | 0.9686888454011742 | 0.9491850431447747 | 511 |
| Jjs | 0.9578414839797639 | 0.9726027397260274 | 0.9651656754460493 | 584 |
| Md | 0.9901398761751892 | 0.9908214777420835 | 0.990480559697213 | 4358 |
| Nn | 0.9810285563194078 | 0.9819697621331922 | 0.9814989335846437 | 30227 |
| Nnp | 0.9609722697706266 | 0.9467116357504216 | 0.9537886510363575 | 8895 |
| Nnps | 1.0 | 0.037037037037037035 | 0.07142857142857142 | 27 |
| Nns | 0.9697771061579146 | 0.9776564681985528 | 0.9737008471361739 | 7877 |
| Pos | 0.9977272727272727 | 0.984304932735426 | 0.9909706546275394 | 446 |
| Prp | 0.9983503349829983 | 0.9985184187487373 | 0.9984343697917544 | 29698 |
| Prp$ | 0.9974262182566919 | 0.9974262182566919 | 0.9974262182566919 | 5828 |
| Rb | 0.9939770374552983 | 0.9929802569727358 | 0.9934783971906942 | 15955 |
| Rbr | 0.9058823529411765 | 0.8191489361702128 | 0.8603351955307263 | 94 |
| Rbs | 0.92 | 1.0 | 0.9583333333333334 | 69 |
| Rp | 0.9802197802197802 | 0.9903774981495189 | 0.9852724594992636 | 1351 |
| Rt | 0.9995065383666419 | 0.9996298581122763 | 0.9995681944358769 | 8105 |
| Sym | 0.0 | 0.0 | 0.0 | 9 |
| To | 0.9984649496844619 | 0.9989761092150171 | 0.9987204640450398 | 5860 |
| Uh | 0.9614460148062687 | 0.9507510933637574 | 0.9560686457287633 | 10518 |
| Url | 1.0 | 0.9997242900468707 | 0.9998621260168207 | 3627 |
| Usr | 0.9999025388626285 | 1.0 | 0.9999512670565303 | 20519 |
| Vb | 0.9619302598929085 | 0.9570556133056133 | 0.9594867452615125 | 15392 |
| Vbd | 0.9592894152479645 | 0.9548719837907533 | 0.9570756023262255 | 5429 |
| Vbg | 0.9848831077518018 | 0.984191111891797 | 0.9845369882270251 | 5693 |
| Vbn | 0.9053408597481546 | 0.9164835164835164 | 0.910878112712975 | 2275 |
| Vbp | 0.963605718209626 | 0.9666228317364894 | 0.9651119169688633 | 15969 |
| Vbz | 0.9881780250347705 | 0.9861207494795281 | 0.9871483153872872 | 5764 |
| Wdt | 0.8666666666666667 | 0.9285714285714286 | 0.896551724137931 | 14 |
| Wp | 0.99125 | 0.993734335839599 | 0.9924906132665832 | 1596 |
| Wrb | 0.9963488843813387 | 0.9979683055668428 | 0.9971579374746244 | 2461 |
| `` | 0.9481865284974094 | 0.9786096256684492 | 0.963157894736842 | 187 |
Overall:
- Accuracy: 0.9853
- Macro avg:
  - Precision: 0.9296417163691048
  - Recall: 0.8931046018294694
  - F1-score: 0.8930917459781836
  - Support: 308833
- Weighted avg:
  - Precision: 0.985306457604231
  - Recall: 0.9853480683735223
  - F1-score: 0.9852689858931941
  - Support: 308833
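The per-tag table and the macro/weighted averages above follow the layout of the `poseval` metric from the `evaluate` library (a wrapper around scikit-learn's `classification_report` for sequence labels). A minimal sketch of producing such a report, assuming gold and predicted tag sequences are available as lists of label strings (the example tags below are made up for illustration):

```python
import evaluate

# poseval expects one list of tag strings per sentence,
# for both predictions and references.
poseval = evaluate.load("poseval")

predictions = [["Prp", "Vbp", "Jj"], ["Usr", "Vb", "Nn"]]
references = [["Prp", "Vbp", "Jj"], ["Usr", "Vbz", "Nn"]]

results = poseval.compute(predictions=predictions, references=references)

print(results["accuracy"])               # overall accuracy
print(results["macro avg"]["f1-score"])  # macro-averaged F1
print(results["Nn"])                     # per-tag precision/recall/F1/support
```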
## Model description
For more information on how this model was created, see the project notebook: https://github.com/DunnBC22/NLP_Projects/blob/main/Token%20Classification/Monolingual/StrombergNLP-Twitter_pos_vcb/NER%20Project%20Using%20StrombergNLP%20Twitter_pos_vcb%20Dataset%20with%20PosEval.ipynb
## Intended uses & limitations
This model was built as a portfolio project to demonstrate fine-tuning a transformer for part-of-speech tagging of noisy Twitter-style text.
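As a sketch of how the model can be loaded for inference (the Hub repo id below is an assumption inferred from this card's title; substitute the actual namespace and path):

```python
from transformers import pipeline

# Hypothetical repo id, inferred from this card's title; adjust as needed.
model_id = "DunnBC22/bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2"

tagger = pipeline("token-classification", model=model_id)

# Each result carries the (sub)word, its predicted PoS tag, and a confidence score.
for token in tagger("@user lol that game was amazing http://example.com"):
    print(token["word"], token["entity"], round(token["score"], 3))
```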
## Training and evaluation data
Dataset Source: https://huggingface.co/datasets/strombergnlp/twitter_pos_vcb
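A minimal sketch of loading the data with the `datasets` library; the single `train` split matches the metadata above, while the column names (`tokens`, `pos_tags`) are assumptions based on the dataset card:

```python
from datasets import load_dataset

# The dataset ships a single 'train' split (per the metadata above);
# the column names are assumptions from the dataset card.
dataset = load_dataset("strombergnlp/twitter_pos_vcb", split="train")

print(dataset)
print(dataset[0]["tokens"][:10])    # first few tokens of the first tweet
print(dataset[0]["pos_tags"][:10])  # and their PoS tag ids
```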
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
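A sketch of a `TrainingArguments` configuration mirroring the hyperparameters above; the `output_dir` is a placeholder, and everything not listed (including the Adam betas/epsilon and the linear scheduler) is already the library default:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=16,  # logged above as train_batch_size
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=2,
    lr_scheduler_type="linear",
)
```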
### Training results
The final evaluation loss (0.0502) and the per-tag precision/recall/F1 scores are reported in the table and summary above.
### Framework versions
- Transformers 4.28.1
- Pytorch 2.0.0
- Datasets 2.11.0
- Tokenizers 0.13.3