---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- generated_from_keras_callback
- named entity recognition
- bert-base finetuned
- umair akram
datasets:
- conll2003
metrics:
- seqeval
pipeline_tag: token-classification
base_model: bert-base-cased
model-index:
- name: MUmairAB/bert-ner
  results: []
---


# MUmairAB/bert-ner

The model training notebook is available on my [GitHub Repo](https://github.com/MUmairAB/BERT-based-NER-using-HuggingFace-Transformers/tree/main).

This model is a fine-tuned version of [bert-base-cased](https://huggingface.co/bert-base-cased) on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset.
It achieves the following results on the evaluation set:
- Train Loss: 0.0003
- Validation Loss: 0.0880
- Epoch: 19

## How to use this model

```
# Install the transformers library
!pip install transformers

# Import the pipeline
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub
checkpoint = "MUmairAB/bert-ner"
model = pipeline(task="token-classification",
                 model=checkpoint)

# Use the model
raw_text = "My name is umair and i work at Swits AI in Antarctica."
model(raw_text)
```
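
By default the pipeline returns one prediction per (sub-word) token. If you want whole entities instead, the pipeline's `aggregation_strategy` argument can merge the `B-`/`I-` pieces into spans; a minimal sketch (the commented output is illustrative, not actual model output):

```
from transformers import pipeline

# Group sub-word tokens into entity spans instead of per-token labels
ner = pipeline(task="token-classification",
               model="MUmairAB/bert-ner",
               aggregation_strategy="simple")

ner("My name is umair and i work at Swits AI in Antarctica.")
# e.g. [{'entity_group': 'PER', 'word': 'umair', 'start': 11, 'end': 16, 'score': ...}, ...]
```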

## Model description

Model: "tf_bert_for_token_classification"
```
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 bert (TFBertMainLayer)      multiple                  107719680 
                                                                 
 dropout_37 (Dropout)        multiple                  0         
                                                                 
 classifier (Dense)          multiple                  6921      
                                                                 
=================================================================
Total params: 107,726,601
Trainable params: 107,726,601
Non-trainable params: 0
_________________________________________________________________
```
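
The 6,921 parameters in the classifier head are consistent with BERT-base's 768-dimensional hidden state being projected onto the 9 BIO tags of CoNLL-2003 (O plus B-/I- variants of PER, ORG, LOC, and MISC); a quick sanity check:

```
hidden_size = 768  # bert-base-cased hidden dimension
num_labels = 9     # O + B-/I- tags for PER, ORG, LOC, MISC
print(hidden_size * num_labels + num_labels)  # 6921 (weights + biases), matching the summary
```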

## Intended uses & limitations

This model can be used for named entity recognition (NER) tasks. It was trained on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset and can classify four types of named entities:
1. persons,
2. locations,
3. organizations, and
4. names of miscellaneous entities that do not belong to the previous three groups.
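
The exact tag strings returned by the model depend on its saved `id2label` mapping, which can be inspected directly; a BIO tag can then be collapsed to one of the four entity types above (a sketch, assuming the standard CoNLL-2003 `B-`/`I-` scheme):

```
from transformers import AutoConfig

config = AutoConfig.from_pretrained("MUmairAB/bert-ner")
print(config.id2label)  # e.g. {0: 'O', 1: 'B-PER', 2: 'I-PER', ...} if the CoNLL label order was kept

# Collapse a BIO tag such as 'B-LOC' or 'I-LOC' to its entity type
def entity_type(tag: str) -> str:
    return "O" if tag == "O" else tag.split("-", 1)[1]
```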

## Training and evaluation data

The model was evaluated with the [seqeval](https://github.com/chakki-works/seqeval) metric; the per-entity and overall results on the evaluation set are as follows:

```
{'LOC': {'precision': 0.9655361050328227,
  'recall': 0.9608056614044638,
  'f1': 0.9631650750341064,
  'number': 1837},
 'MISC': {'precision': 0.8789144050104384,
  'recall': 0.913232104121475,
  'f1': 0.8957446808510638,
  'number': 922},
 'ORG': {'precision': 0.9075144508670521,
  'recall': 0.9366144668158091,
  'f1': 0.9218348623853211,
  'number': 1341},
 'PER': {'precision': 0.962011771000535,
  'recall': 0.9761129207383279,
  'f1': 0.9690110482349771,
  'number': 1842},
 'overall_precision': 0.9374068554396423,
 'overall_recall': 0.9527095254123191,
 'overall_f1': 0.944996244053084,
 'overall_accuracy': 0.9864013657502796}
```
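
Per-entity scores in this format can be reproduced by feeding aligned gold and predicted BIO sequences to seqeval; a minimal sketch with toy inputs (not the actual evaluation run):

```
# pip install seqeval
from seqeval.metrics import classification_report

# One inner list of BIO tags per sentence
y_true = [["B-PER", "I-PER", "O", "B-ORG", "I-ORG", "O"]]
y_pred = [["B-PER", "I-PER", "O", "B-ORG", "O", "O"]]

print(classification_report(y_true, y_pred))
```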

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 2e-05, 'decay_steps': 17560, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32
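
These settings match what `transformers.create_optimizer` produces for Keras fine-tuning (AdamWeightDecay with a linear, i.e. power-1.0 polynomial, decay of the learning rate from 2e-05 to 0 over 17,560 steps); a sketch of how such an optimizer is typically built:

```
from transformers import create_optimizer

optimizer, lr_schedule = create_optimizer(
    init_lr=2e-5,            # initial_learning_rate
    num_train_steps=17_560,  # decay_steps
    num_warmup_steps=0,
    weight_decay_rate=0.01,
)
# then: model.compile(optimizer=optimizer) before model.fit(...)
```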

### Training results

| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 0.1775     | 0.0635          | 0     |
| 0.0470     | 0.0559          | 1     |
| 0.0278     | 0.0603          | 2     |
| 0.0174     | 0.0603          | 3     |
| 0.0124     | 0.0615          | 4     |
| 0.0077     | 0.0722          | 5     |
| 0.0060     | 0.0731          | 6     |
| 0.0038     | 0.0757          | 7     |
| 0.0043     | 0.0731          | 8     |
| 0.0041     | 0.0735          | 9     |
| 0.0019     | 0.0724          | 10    |
| 0.0019     | 0.0786          | 11    |
| 0.0010     | 0.0843          | 12    |
| 0.0008     | 0.0814          | 13    |
| 0.0011     | 0.0867          | 14    |
| 0.0008     | 0.0883          | 15    |
| 0.0005     | 0.0861          | 16    |
| 0.0005     | 0.0869          | 17    |
| 0.0003     | 0.0880          | 18    |
| 0.0003     | 0.0880          | 19    |


### Framework versions

- Transformers 4.30.2
- TensorFlow 2.12.0
- Datasets 2.13.1
- Tokenizers 0.13.3