---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- generated_from_keras_callback
- named entity recognition
- bert-base finetuned
- umair akram
datasets:
- conll2003
metrics:
- seqeval
pipeline_tag: token-classification
base_model: bert-base-cased
model-index:
- name: MUmairAB/bert-ner
  results: []
---


# MUmairAB/bert-ner

The model training notebook is available on my [GitHub Repo](https://github.com/MUmairAB/BERT-based-NER-using-HuggingFace-Transformers/tree/main).

This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset.
After the final training epoch, it reports the following losses:
- Train Loss: 0.0003
- Validation Loss: 0.0880
- Epoch: 19

## How to use this model

```python
# Install the transformers library first (for example, in a notebook cell):
#   pip install transformers

# Import the pipeline
from transformers import pipeline

# Load the model from the Hugging Face Hub
checkpoint = "MUmairAB/bert-ner"
model = pipeline(task="token-classification",
                 model=checkpoint)

# Use the model: returns one prediction per recognized token
raw_text = "My name is umair and i work at Swits AI in Antarctica."
print(model(raw_text))
```
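
By default, the token-classification pipeline returns one dictionary per token, with keys such as `entity`, `word`, `start`, and `end`. The pipeline also accepts an `aggregation_strategy` argument that merges tokens into entities for you. The sketch below shows the idea behind that merging in plain Python. The helper `group_entities` and the sample predictions are hypothetical and hand-written for illustration; they are not actual output of this model.

```python
from typing import Dict, List


def group_entities(text: str, tokens: List[Dict]) -> List[Dict]:
    """Merge consecutive B-/I- token predictions into entity spans."""
    spans: List[Dict] = []
    for tok in tokens:
        prefix, _, label = tok["entity"].partition("-")
        # Continue the current span only on an I- tag with the same label.
        if prefix == "I" and spans and spans[-1]["label"] == label:
            spans[-1]["end"] = tok["end"]
        else:
            spans.append({"label": label, "start": tok["start"], "end": tok["end"]})
    # Recover the surface text of each span from its character offsets.
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Hypothetical token-level predictions (illustrative only).
text = "Ada Lovelace worked in London."
tokens = [
    {"entity": "B-PER", "word": "Ada", "start": 0, "end": 3},
    {"entity": "I-PER", "word": "Lovelace", "start": 4, "end": 12},
    {"entity": "B-LOC", "word": "London", "start": 23, "end": 29},
]
entities = group_entities(text, tokens)
print(entities)
```

This simplified version does not check that merged tokens are adjacent in the text, so it is a starting point rather than a replacement for the built-in aggregation.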

## Model description

Model: "tf_bert_for_token_classification"
```
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 bert (TFBertMainLayer)      multiple                  107719680 
                                                                 
 dropout_37 (Dropout)        multiple                  0         
                                                                 
 classifier (Dense)          multiple                  6921      
                                                                 
=================================================================
Total params: 107,726,601
Trainable params: 107,726,601
Non-trainable params: 0
_________________________________________________________________
```
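
The parameter counts in the summary can be checked by hand. The short sketch below assumes a hidden size of 768 (the value for bert-base-cased) and 9 output labels (an assumption: `O` plus `B-`/`I-` tags for the four entity types), which together reproduce the classifier's parameter count.

```python
HIDDEN_SIZE = 768  # hidden size of bert-base-cased
NUM_LABELS = 9     # assumption: O plus B-/I- tags for PER, ORG, LOC, MISC

# A Dense layer has one weight per (input, output) pair plus one bias per output.
classifier_params = HIDDEN_SIZE * NUM_LABELS + NUM_LABELS
bert_params = 107_719_680  # from the summary above
total_params = bert_params + classifier_params

print(classifier_params)  # 6921
print(total_params)       # 107726601
```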

## Intended uses & limitations

This model can be used for named entity recognition tasks. It is trained on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset. The model can classify four types of named entities:
1. persons,
2. locations,
3. organizations, and
4. names of miscellaneous entities that do not belong to the previous three groups.

## Training and evaluation data

The model is evaluated with the [seqeval](https://github.com/chakki-works/seqeval) metric. The results are as follows:

```
{'LOC': {'precision': 0.9655361050328227,
  'recall': 0.9608056614044638,
  'f1': 0.9631650750341064,
  'number': 1837},
 'MISC': {'precision': 0.8789144050104384,
  'recall': 0.913232104121475,
  'f1': 0.8957446808510638,
  'number': 922},
 'ORG': {'precision': 0.9075144508670521,
  'recall': 0.9366144668158091,
  'f1': 0.9218348623853211,
  'number': 1341},
 'PER': {'precision': 0.962011771000535,
  'recall': 0.9761129207383279,
  'f1': 0.9690110482349771,
  'number': 1842},
 'overall_precision': 0.9374068554396423,
 'overall_recall': 0.9527095254123191,
 'overall_f1': 0.944996244053084,
 'overall_accuracy': 0.9864013657502796}
```
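
Each `f1` value above is the harmonic mean of the corresponding precision and recall. As a quick sanity check, the sketch below recomputes two of the reported scores from the numbers in the results; the helper name `f1_score` is illustrative and not part of seqeval's API.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the seqeval results above.
loc_f1 = f1_score(0.9655361050328227, 0.9608056614044638)
overall_f1 = f1_score(0.9374068554396423, 0.9527095254123191)

print(round(loc_f1, 4))      # 0.9632
print(round(overall_f1, 4))  # 0.945
```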

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 17560, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
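
With `power` set to 1.0 and `cycle` set to `False`, the `PolynomialDecay` schedule above is a linear decay from 2e-05 to 0 over 17,560 steps. The following is a minimal plain-Python sketch of that formula, not the Keras implementation itself. The steps-per-epoch figure assumes the decay spans all 20 epochs listed in the results table.

```python
INITIAL_LR = 2e-05
END_LR = 0.0
DECAY_STEPS = 17560
POWER = 1.0


def polynomial_decay(step: int) -> float:
    """Learning rate at a given step for a non-cycling polynomial decay."""
    step = min(step, DECAY_STEPS)  # hold the final value once decay is finished
    remaining = 1.0 - step / DECAY_STEPS
    return (INITIAL_LR - END_LR) * remaining ** POWER + END_LR


# Assumption: the schedule covers 20 epochs, giving the steps per epoch.
steps_per_epoch = DECAY_STEPS // 20

for step in (0, DECAY_STEPS // 2, DECAY_STEPS):
    print(step, polynomial_decay(step))
```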

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 0.1775     | 0.0635          | 0     |
| 0.0470     | 0.0559          | 1     |
| 0.0278     | 0.0603          | 2     |
| 0.0174     | 0.0603          | 3     |
| 0.0124     | 0.0615          | 4     |
| 0.0077     | 0.0722          | 5     |
| 0.0060     | 0.0731          | 6     |
| 0.0038     | 0.0757          | 7     |
| 0.0043     | 0.0731          | 8     |
| 0.0041     | 0.0735          | 9     |
| 0.0019     | 0.0724          | 10    |
| 0.0019     | 0.0786          | 11    |
| 0.0010     | 0.0843          | 12    |
| 0.0008     | 0.0814          | 13    |
| 0.0011     | 0.0867          | 14    |
| 0.0008     | 0.0883          | 15    |
| 0.0005     | 0.0861          | 16    |
| 0.0005     | 0.0869          | 17    |
| 0.0003     | 0.0880          | 18    |
| 0.0003     | 0.0880          | 19    |
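
The table shows the training loss falling steadily while the validation loss bottoms out early. The snippet below simply locates the epoch with the lowest validation loss from the values in the table. It says nothing about the seqeval scores, which may peak at a different epoch.

```python
# Validation losses copied from the table above, indexed by epoch (0 to 19).
val_loss = [
    0.0635, 0.0559, 0.0603, 0.0603, 0.0615, 0.0722, 0.0731, 0.0757,
    0.0731, 0.0735, 0.0724, 0.0786, 0.0843, 0.0814, 0.0867, 0.0883,
    0.0861, 0.0869, 0.0880, 0.0880,
]

# Index of the smallest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
print(best_epoch, val_loss[best_epoch])  # 1 0.0559
```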


### Framework versions

- Transformers 4.30.2
- TensorFlow 2.12.0
- Datasets 2.13.1
- Tokenizers 0.13.3