---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- twitter_pos_vcb
metrics:
- accuracy
- poseval
- f1
- recall
- precision
model-index:
- name: bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2
  results:
  - task:
      name: Token Classification
      type: token-classification
    dataset:
      name: twitter_pos_vcb
      type: twitter_pos_vcb
      config: twitter-pos-vcb
      split: train
      args: twitter-pos-vcb
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9853480683735223
language:
- en
pipeline_tag: token-classification
---

# bert-base-cased-finetuned-Stromberg_NLP_Twitter-PoS_v2

This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the twitter_pos_vcb dataset.

It achieves the following results on the evaluation set:
- Loss: 0.0502

| Token | Precision | Recall | F1-Score | Support |
|:-----:|:-----:|:-----:|:-----:|:-----:|
| $ | 0.0 | 0.0 | 0.0 | 3 |
| '' | 0.9312320916905444 | 0.9530791788856305 | 0.9420289855072465 | 341 |
| ( | 0.9791666666666666 | 0.9591836734693877 | 0.9690721649484536 | 196 |
| ) | 0.960167714884696 | 0.9703389830508474 | 0.9652265542676501 | 472 |
| , | 0.9988979501873485 | 0.9993384785005512 | 0.9991181657848325 | 4535 |
| . | 0.9839189708141322 | 0.9894762249577601 | 0.9866897730281368 | 20715 |
| : | 0.9926405887528997 | 0.9971072719967858 | 0.9948689168604183 | 12445 |
| Cc | 0.9991067440821796 | 0.9986607142857142 | 0.9988836793927215 | 4480 |
| Cd | 0.9903884661593912 | 0.9899919935948759 | 0.9901901901901902 | 2498 |
| Dt | 0.9981148589510537 | 0.9976446837146703 | 0.9978797159492478 | 14860 |
| Ex | 0.9142857142857143 | 0.9846153846153847 | 0.9481481481481482 | 65 |
| Fw | 1.0 | 0.1 | 0.18181818181818182 | 10 |
| Ht | 0.999877541023757 | 0.9997551120362435 | 0.9998163227820978 | 8167 |
| In | 0.9960399353003514 | 0.9954846981437092 | 0.9957622393219583 | 17939 |
| Jj | 0.9812470698546648 | 0.9834756049808129 | 0.9823600735322877 | 12769 |
| Jjr | 0.9304511278195489 | 0.9686888454011742 | 0.9491850431447747 | 511 |
| Jjs | 0.9578414839797639 | 0.9726027397260274 | 0.9651656754460493 | 584 |
| Md | 0.9901398761751892 | 0.9908214777420835 | 0.990480559697213 | 4358 |
| Nn | 0.9810285563194078 | 0.9819697621331922 | 0.9814989335846437 | 30227 |
| Nnp | 0.9609722697706266 | 0.9467116357504216 | 0.9537886510363575 | 8895 |
| Nnps | 1.0 | 0.037037037037037035 | 0.07142857142857142 | 27 |
| Nns | 0.9697771061579146 | 0.9776564681985528 | 0.9737008471361739 | 7877 |
| Pos | 0.9977272727272727 | 0.984304932735426 | 0.9909706546275394 | 446 |
| Prp | 0.9983503349829983 | 0.9985184187487373 | 0.9984343697917544 | 29698 |
| Prp$ | 0.9974262182566919 | 0.9974262182566919 | 0.9974262182566919 | 5828 |
| Rb | 0.9939770374552983 | 0.9929802569727358 | 0.9934783971906942 | 15955 |
| Rbr | 0.9058823529411765 | 0.8191489361702128 | 0.8603351955307263 | 94 |
| Rbs | 0.92 | 1.0 | 0.9583333333333334 | 69 |
| Rp | 0.9802197802197802 | 0.9903774981495189 | 0.9852724594992636 | 1351 |
| Rt | 0.9995065383666419 | 0.9996298581122763 | 0.9995681944358769 | 8105 |
| Sym | 0.0 | 0.0 | 0.0 | 9 |
| To | 0.9984649496844619 | 0.9989761092150171 | 0.9987204640450398 | 5860 |
| Uh | 0.9614460148062687 | 0.9507510933637574 | 0.9560686457287633 | 10518 |
| Url | 1.0 | 0.9997242900468707 | 0.9998621260168207 | 3627 |
| Usr | 0.9999025388626285 | 1.0 | 0.9999512670565303 | 20519 |
| Vb | 0.9619302598929085 | 0.9570556133056133 | 0.9594867452615125 | 15392 |
| Vbd | 0.9592894152479645 | 0.9548719837907533 | 0.9570756023262255 | 5429 |
| Vbg | 0.9848831077518018 | 0.984191111891797 | 0.9845369882270251 | 5693 |
| Vbn | 0.9053408597481546 | 0.9164835164835164 | 0.910878112712975 | 2275 |
| Vbp | 0.963605718209626 | 0.9666228317364894 | 0.9651119169688633 | 15969 |
| Vbz | 0.9881780250347705 | 0.9861207494795281 | 0.9871483153872872 | 5764 |
| Wdt | 0.8666666666666667 | 0.9285714285714286 | 0.896551724137931 | 14 |
| Wp | 0.99125 | 0.993734335839599 | 0.9924906132665832 | 1596 |
| Wrb | 0.9963488843813387 | 0.9979683055668428 | 0.9971579374746244 | 2461 |
| `` | 0.9481865284974094 | 0.9786096256684492 | 0.963157894736842 | 187 |

Overall
- Accuracy: 0.9853
- Macro avg:
  - Precision: 0.9296417163691048
  - Recall: 0.8931046018294694
  - F1-Score: 0.8930917459781836
  - Support: 308833
- Weighted avg:
  - Precision: 0.985306457604231
  - Recall: 0.9853480683735223
  - F1-Score: 0.9852689858931941
  - Support: 308833

## Model description

For more information on how it was created, check out the following link:
https://github.com/DunnBC22/NLP_Projects/blob/main/Token%20Classification/Monolingual/StrombergNLP-Twitter_pos_vcb/NER%20Project%20Using%20StrombergNLP%20Twitter_pos_vcb%20Dataset%20with%20PosEval.ipynb

## Intended uses & limitations

This model is intended to demonstrate my ability to solve a complex problem using technology.

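
One point worth keeping in mind when reading the results: the macro-average F1 (about 0.893) is well below the weighted-average F1 (about 0.985). The sketch below illustrates why, using only four rows copied from the per-tag table (Nn, Prp, Fw, Nnps). Because it is a subset chosen for illustration, its averages will not match the overall figures reported in this card; the helper names `macro_average` and `weighted_average` are hypothetical and not part of any library.

```python
# Per-tag (F1, support) pairs copied from four rows of the results table.
# This is an illustrative subset, so the averages computed here will NOT
# reproduce the overall macro/weighted figures reported in the card.
rows = {
    "Nn":   (0.9814989335846437, 30227),
    "Prp":  (0.9984343697917544, 29698),
    "Fw":   (0.18181818181818182, 10),
    "Nnps": (0.07142857142857142, 27),
}


def macro_average(rows):
    """Unweighted mean: every tag counts equally, however rare it is."""
    return sum(f1 for f1, _ in rows.values()) / len(rows)


def weighted_average(rows):
    """Support-weighted mean: frequent tags dominate the result."""
    total_support = sum(support for _, support in rows.values())
    return sum(f1 * support for f1, support in rows.values()) / total_support


macro_f1 = macro_average(rows)
weighted_f1 = weighted_average(rows)

print(f"macro F1:    {macro_f1:.4f}")
print(f"weighted F1: {weighted_f1:.4f}")
```

Rare tags with low scores (for example Fw and Nnps, with supports of 10 and 27) pull the macro average down while barely moving the weighted one, which suggests the weighted figures mostly reflect performance on frequent tags.
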
## Training and evaluation data

Dataset Source: https://huggingface.co/datasets/strombergnlp/twitter_pos_vcb

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2

### Training results

### Framework versions

- Transformers 4.28.1
- Pytorch 2.0.0
- Datasets 2.11.0
- Tokenizers 0.13.3
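
As a rough illustration of the `lr_scheduler_type: linear` setting combined with `learning_rate: 2e-05` listed under the training hyperparameters, here is a minimal sketch of a linearly decaying learning rate. It is not the training code used for this model. The function name `linear_lr`, the `warmup_steps=0` default, and the total of 1000 steps are all assumptions made for the example; the card states neither a warmup setting nor the number of optimizer steps.

```python
def linear_lr(step, total_steps, base_lr=2e-05, warmup_steps=0):
    """Learning rate at `step`: optional linear warmup, then linear decay to 0.

    base_lr comes from the card; warmup_steps=0 is an assumption, since the
    card does not state a warmup setting.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * (step / max(1, warmup_steps))
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


# Hypothetical run length; the real step count depends on the dataset size,
# the batch size of 16, and the 2 training epochs.
TOTAL_STEPS = 1000

for step in (0, 250, 500, 750, 1000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at the configured 2e-05 and reaches zero at the final step, so the second of the two epochs trains at less than half the initial rate.
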