huseyincenik/conll_ner_with_bert
This model is a fine-tuned version of bert-base-uncased on the CoNLL-2003 dataset for Named Entity Recognition (NER).
Model description
This model has been trained to perform Named Entity Recognition (NER) and is based on the BERT architecture. It was fine-tuned on the CoNLL-2003 dataset, a standard dataset for NER tasks.
Intended uses & limitations
Intended Uses
- Named Entity Recognition: This model is designed to identify and classify named entities in text into categories such as location (LOC), organization (ORG), person (PER), and miscellaneous (MISC).
Limitations
- Domain Specificity: The model was fine-tuned on the CoNLL-2003 dataset, which consists of news articles. It may not generalize well to other domains or types of text not represented in the training data.
- Subword Tokens: The model may occasionally tag individual subword tokens as entities, so some post-processing (for example, grouping predictions back into whole words, as sketched below) may be needed to handle these cases.
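One way to handle the subword issue is the grouping built into the `transformers` pipeline. A minimal sketch, assuming the `aggregation_strategy` option of the token-classification pipeline (the example sentence is illustrative only):

```python
from transformers import pipeline

# aggregation_strategy="simple" merges subword pieces into whole-word entity spans,
# so partial tokens are not reported as separate entities.
ner = pipeline(
    "token-classification",
    model="huseyincenik/conll_ner_with_bert",
    aggregation_strategy="simple",
)
print(ner("Angela Merkel visited the United Nations headquarters in New York."))
```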
Training and evaluation data
Training Dataset: CoNLL-2003
Training Evaluation Metrics:
Label | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
B-PER | 0.98 | 0.98 | 0.98 | 11273 |
I-PER | 0.98 | 0.99 | 0.99 | 9323 |
B-ORG | 0.88 | 0.92 | 0.90 | 10447 |
I-ORG | 0.81 | 0.92 | 0.86 | 5137 |
B-LOC | 0.86 | 0.94 | 0.90 | 9621 |
I-LOC | 1.00 | 0.08 | 0.14 | 1267 |
B-MISC | 0.81 | 0.73 | 0.77 | 4793 |
I-MISC | 0.83 | 0.36 | 0.50 | 1329 |
Micro Avg | 0.90 | 0.90 | 0.90 | 53190 |
Macro Avg | 0.89 | 0.74 | 0.75 | 53190 |
Weighted Avg | 0.90 | 0.90 | 0.89 | 53190 |

Validation Evaluation Metrics:

Label | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
B-PER | 0.97 | 0.98 | 0.97 | 3018 |
I-PER | 0.98 | 0.98 | 0.98 | 2741 |
B-ORG | 0.86 | 0.91 | 0.88 | 2056 |
I-ORG | 0.77 | 0.81 | 0.79 | 900 |
B-LOC | 0.86 | 0.94 | 0.90 | 2618 |
I-LOC | 1.00 | 0.10 | 0.18 | 281 |
B-MISC | 0.77 | 0.74 | 0.76 | 1231 |
I-MISC | 0.77 | 0.34 | 0.48 | 390 |
Micro Avg | 0.90 | 0.89 | 0.89 | 13235 |
Macro Avg | 0.87 | 0.73 | 0.74 | 13235 |
Weighted Avg | 0.90 | 0.89 | 0.88 | 13235 |

Test Evaluation Metrics:

Label | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
B-PER | 0.96 | 0.95 | 0.96 | 2714 |
I-PER | 0.98 | 0.99 | 0.98 | 2487 |
B-ORG | 0.81 | 0.87 | 0.84 | 2588 |
I-ORG | 0.74 | 0.87 | 0.80 | 1050 |
B-LOC | 0.81 | 0.90 | 0.85 | 2121 |
I-LOC | 0.89 | 0.12 | 0.22 | 276 |
B-MISC | 0.75 | 0.67 | 0.71 | 996 |
I-MISC | 0.85 | 0.49 | 0.62 | 241 |
Micro Avg | 0.87 | 0.88 | 0.87 | 12473 |
Macro Avg | 0.85 | 0.73 | 0.75 | 12473 |
Weighted Avg | 0.87 | 0.88 | 0.86 | 12473 |
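These per-label figures are token-level classification metrics. For illustration only, a minimal sketch of how such a report could be produced with scikit-learn's classification_report (toy tag lists, not the actual evaluation code):

```python
from sklearn.metrics import classification_report

# Toy flattened lists of gold and predicted tags; in practice these would come
# from the model's predictions over the train/validation/test splits.
y_true = ["B-PER", "I-PER", "O", "B-LOC", "B-ORG"]
y_pred = ["B-PER", "I-PER", "O", "B-ORG", "B-LOC"]
print(classification_report(y_true, y_pred, digits=2))
```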
Training procedure
Training Hyperparameters
- Optimizer: AdamWeightDecay
  - Learning Rate: 2e-05
  - Decay Schedule: PolynomialDecay
  - Warmup Steps: 0.1
  - Weight Decay Rate: 0.01
- Training Precision: float32
Training results
Train Loss | Validation Loss | Epoch |
---|---|---|
0.1016 | 0.0254 | 0 |
0.0228 | 0.0180 | 1 |
Optimizer Details
```python
from transformers import create_optimizer

batch_size = 32
num_train_epochs = 2
# Total number of optimization steps across all epochs
# (tokenized_conll is the tokenized CoNLL-2003 dataset).
num_train_steps = (len(tokenized_conll["train"]) // batch_size) * num_train_epochs

optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,
    num_train_steps=num_train_steps,
    weight_decay_rate=0.01,
    num_warmup_steps=0.1,
)
```
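For context, this optimizer would typically be passed to the TensorFlow model's `compile` step before fine-tuning. A minimal sketch, assuming `TFAutoModelForTokenClassification` and the standard CoNLL-2003 tag set (the label list and loading details below are illustrative assumptions, not taken from the original training code):

```python
from transformers import TFAutoModelForTokenClassification

# Assumed CoNLL-2003 tag set (O plus B-/I- tags for PER, ORG, LOC, MISC).
label_list = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]

model = TFAutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(label_list),
    id2label=dict(enumerate(label_list)),
    label2id={label: i for i, label in enumerate(label_list)},
)

# Transformers TF models compute their loss internally when labels are provided,
# so only the optimizer needs to be passed here.
model.compile(optimizer=optimizer)
# model.fit(train_set, validation_data=val_set, epochs=num_train_epochs)
```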
How to Use
Using a Pipeline
```python
from transformers import pipeline

pipe = pipeline("token-classification", model="huseyincenik/conll_ner_with_bert")
```
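A short usage sketch (the example sentence is arbitrary):

```python
# Each prediction includes the token, predicted tag, and confidence score.
results = pipe("George Washington lived in Washington.")
for entity in results:
    print(entity["word"], entity["entity"], round(entity["score"], 3))
```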
Loading the Model Directly

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("huseyincenik/conll_ner_with_bert")
model = AutoModelForTokenClassification.from_pretrained("huseyincenik/conll_ner_with_bert")
```
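A minimal sketch of running inference with the directly loaded model (assuming PyTorch weights are available for this repository; the sentence and the use of `model.config.id2label` for decoding are illustrative):

```python
import torch

text = "Angela Merkel visited Paris last week."
inputs = tokenizer(text, return_tensors="pt")

# Forward pass without gradient tracking; take the highest-scoring tag per token.
with torch.no_grad():
    logits = model(**inputs).logits
predictions = logits.argmax(dim=-1)[0].tolist()

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred_id in zip(tokens, predictions):
    print(token, model.config.id2label[pred_id])
```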
The model distinguishes the following entity tags:

Abbreviation | Description |
---|---|
O | Outside of a named entity |
B-MISC | Beginning of a miscellaneous entity right after another miscellaneous entity |
I-MISC | Miscellaneous entity |
B-PER | Beginning of a person’s name right after another person’s name |
I-PER | Person’s name |
B-ORG | Beginning of an organization right after another organization |
I-ORG | Organization |
B-LOC | Beginning of a location right after another location |
I-LOC | Location |
CoNLL-2003 English Dataset Statistics
This dataset was derived from the Reuters corpus, which consists of Reuters news stories. You can read more about how this dataset was created in the CoNLL-2003 paper.
Number of training examples per entity type
Dataset | LOC | MISC | ORG | PER |
---|---|---|---|---|
Train | 7140 | 3438 | 6321 | 6600 |
Dev | 1837 | 922 | 1341 | 1842 |
Test | 1668 | 702 | 1661 | 1617 |
Number of articles/sentences/tokens per dataset
Dataset | Articles | Sentences | Tokens |
---|---|---|---|
Train | 946 | 14,987 | 203,621 |
Dev | 216 | 3,466 | 51,362 |
Test | 231 | 3,684 | 46,435 |
Framework versions
- Transformers 4.45.0.dev0
- TensorFlow 2.17.0
- Datasets 2.21.0
- Tokenizers 0.19.1