---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- twitter_pos_vcb
metrics:
- accuracy
- poseval
- f1
- recall
- precision
model-index:
- name: bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2
results:
- task:
name: Token Classification
type: token-classification
dataset:
name: twitter_pos_vcb
type: twitter_pos_vcb
config: twitter-pos-vcb
split: train
args: twitter-pos-vcb
metrics:
- name: Accuracy
type: accuracy
value: 0.9853480683735223
language:
- en
pipeline_tag: token-classification
---
# bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2
This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the twitter_pos_vcb dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0502
| Tag | Precision | Recall | F1-Score | Support |
|:-----:|:-----:|:-----:|:-----:|:-----:|
| $ | 0.0 | 0.0 | 0.0 | 3 |
| '' | 0.9312320916905444 | 0.9530791788856305 | 0.9420289855072465 | 341 |
| ( | 0.9791666666666666 | 0.9591836734693877 | 0.9690721649484536 | 196 |
| ) | 0.960167714884696 | 0.9703389830508474 | 0.9652265542676501 | 472 |
| , | 0.9988979501873485 | 0.9993384785005512 | 0.9991181657848325 | 4535 |
| . | 0.9839189708141322 | 0.9894762249577601 | 0.9866897730281368 | 20715 |
| : | 0.9926405887528997 | 0.9971072719967858 | 0.9948689168604183 | 12445 |
| Cc | 0.9991067440821796 | 0.9986607142857142 | 0.9988836793927215 | 4480 |
| Cd | 0.9903884661593912 | 0.9899919935948759 | 0.9901901901901902 | 2498 |
| Dt | 0.9981148589510537 | 0.9976446837146703 | 0.9978797159492478 | 14860 |
| Ex | 0.9142857142857143 | 0.9846153846153847 | 0.9481481481481482 | 65 |
| Fw | 1.0 | 0.1 | 0.18181818181818182 | 10 |
| Ht | 0.999877541023757 | 0.9997551120362435 | 0.9998163227820978 | 8167 |
| In | 0.9960399353003514 | 0.9954846981437092 | 0.9957622393219583 | 17939 |
| Jj | 0.9812470698546648 | 0.9834756049808129 | 0.9823600735322877 | 12769 |
| Jjr | 0.9304511278195489 | 0.9686888454011742 | 0.9491850431447747 | 511 |
| Jjs | 0.9578414839797639 | 0.9726027397260274 | 0.9651656754460493 | 584 |
| Md | 0.9901398761751892 | 0.9908214777420835 | 0.990480559697213 | 4358 |
| Nn | 0.9810285563194078 | 0.9819697621331922 | 0.9814989335846437 | 30227 |
| Nnp | 0.9609722697706266 | 0.9467116357504216 | 0.9537886510363575 | 8895 |
| Nnps | 1.0 | 0.037037037037037035 | 0.07142857142857142 | 27 |
| Nns | 0.9697771061579146 | 0.9776564681985528 | 0.9737008471361739 | 7877 |
| Pos | 0.9977272727272727 | 0.984304932735426 | 0.9909706546275394 | 446 |
| Prp | 0.9983503349829983 | 0.9985184187487373 | 0.9984343697917544 | 29698 |
| Prp$ | 0.9974262182566919 | 0.9974262182566919 | 0.9974262182566919 | 5828 |
| Rb | 0.9939770374552983 | 0.9929802569727358 | 0.9934783971906942 | 15955 |
| Rbr | 0.9058823529411765 | 0.8191489361702128 | 0.8603351955307263 | 94 |
| Rbs | 0.92 | 1.0 | 0.9583333333333334 | 69 |
| Rp | 0.9802197802197802 | 0.9903774981495189 | 0.9852724594992636 | 1351 |
| Rt | 0.9995065383666419 | 0.9996298581122763 | 0.9995681944358769 | 8105 |
| Sym | 0.0 | 0.0 | 0.0 | 9 |
| To | 0.9984649496844619 | 0.9989761092150171 | 0.9987204640450398 | 5860 |
| Uh | 0.9614460148062687 | 0.9507510933637574 | 0.9560686457287633 | 10518 |
| Url | 1.0 | 0.9997242900468707 | 0.9998621260168207 | 3627 |
| Usr | 0.9999025388626285 | 1.0 | 0.9999512670565303 | 20519 |
| Vb | 0.9619302598929085 | 0.9570556133056133 | 0.9594867452615125 | 15392 |
| Vbd | 0.9592894152479645 | 0.9548719837907533 | 0.9570756023262255 | 5429 |
| Vbg | 0.9848831077518018 | 0.984191111891797 | 0.9845369882270251 | 5693 |
| Vbn | 0.9053408597481546 | 0.9164835164835164 | 0.910878112712975 | 2275 |
| Vbp | 0.963605718209626 | 0.9666228317364894 | 0.9651119169688633 | 15969 |
| Vbz | 0.9881780250347705 | 0.9861207494795281 | 0.9871483153872872 | 5764 |
| Wdt | 0.8666666666666667 | 0.9285714285714286 | 0.896551724137931 | 14 |
| Wp | 0.99125 | 0.993734335839599 | 0.9924906132665832 | 1596 |
| Wrb | 0.9963488843813387 | 0.9979683055668428 | 0.9971579374746244 | 2461 |
| `` | 0.9481865284974094 | 0.9786096256684492 | 0.963157894736842 | 187 |
Overall:
- Accuracy: 0.9853
- Macro avg:
  - Precision: 0.9296417163691048
  - Recall: 0.8931046018294694
  - F1-score: 0.8930917459781836
  - Support: 308833
- Weighted avg:
  - Precision: 0.985306457604231
  - Recall: 0.9853480683735223
  - F1-score: 0.9852689858931941
  - Support: 308833
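The per-tag table and the macro/weighted averages above follow the layout of the `poseval` metric from the `evaluate` library (a wrapper around scikit-learn's `classification_report` for sequence labels). A minimal sketch of producing such a report, assuming gold and predicted tag sequences are available as lists of label strings (the example tags below are made up for illustration):

```python
import evaluate

# poseval expects one list of tag strings per sentence,
# for both predictions and references.
poseval = evaluate.load("poseval")

predictions = [["Prp", "Vbp", "Jj"], ["Usr", "Vb", "Nn"]]
references = [["Prp", "Vbp", "Jj"], ["Usr", "Vbz", "Nn"]]

results = poseval.compute(predictions=predictions, references=references)

print(results["accuracy"])               # overall accuracy
print(results["macro avg"]["f1-score"])  # macro-averaged F1
print(results["Nn"])                     # per-tag precision/recall/F1/support
```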
## Model description
For more information on how this model was created, see the project notebook: https://github.com/DunnBC22/NLP_Projects/blob/main/Token%20Classification/Monolingual/StrombergNLP-Twitter_pos_vcb/NER%20Project%20Using%20StrombergNLP%20Twitter_pos_vcb%20Dataset%20with%20PosEval.ipynb
## Intended uses & limitations
This model was built as a portfolio project to demonstrate fine-tuning a transformer for part-of-speech tagging of noisy Twitter-style text.
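As a sketch of how the model can be loaded for inference (the Hub repo id below is an assumption inferred from this card's title; substitute the actual namespace and path):

```python
from transformers import pipeline

# Hypothetical repo id, inferred from this card's title; adjust as needed.
model_id = "DunnBC22/bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2"

tagger = pipeline("token-classification", model=model_id)

# Each result carries the (sub)word, its predicted PoS tag, and a confidence score.
for token in tagger("@user lol that game was amazing http://example.com"):
    print(token["word"], token["entity"], round(token["score"], 3))
```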
## Training and evaluation data
Dataset Source: https://huggingface.co/datasets/strombergnlp/twitter_pos_vcb
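A minimal sketch of loading the data with the `datasets` library; the single `train` split matches the metadata above, while the column names (`tokens`, `pos_tags`) are assumptions based on the dataset card:

```python
from datasets import load_dataset

# The dataset ships a single 'train' split (per the metadata above);
# the column names are assumptions from the dataset card.
dataset = load_dataset("strombergnlp/twitter_pos_vcb", split="train")

print(dataset)
print(dataset[0]["tokens"][:10])    # first few tokens of the first tweet
print(dataset[0]["pos_tags"][:10])  # and their PoS tag ids
```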
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
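A sketch of a `TrainingArguments` configuration mirroring the hyperparameters above; the `output_dir` is a placeholder, and everything not listed (including the Adam betas/epsilon and the linear scheduler) is already the library default:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=16,  # logged above as train_batch_size
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=2,
    lr_scheduler_type="linear",
)
```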
### Training results
The final evaluation loss (0.0502) and the per-tag precision/recall/F1 scores are reported in the table and summary above.
### Framework versions
- Transformers 4.28.1
- Pytorch 2.0.0
- Datasets 2.11.0
- Tokenizers 0.13.3