update README
- README.md +24 -3
- assets/marefa-tebyan.png +0 -0
README.md
CHANGED
@@ -1,3 +1,4 @@
|
|
|
|
1 |
---
|
2 |
language: ar
|
3 |
datasets:
|
@@ -9,6 +10,9 @@ widget:
|
|
9 |
# Tebyan تبيـان
|
10 |
## Marefa Arabic Named Entity Recognition Model
|
11 |
## Marefa Model for Classifying Parts of Text
|
|
|
|
|
|
|
12 |
---------
|
13 |
**Version**: 1.3
|
14 |
|
@@ -31,7 +35,7 @@ Person, Location, Organization, Nationality, Job, Product, Event, Time, Art-Work
|
|
31 |
|
32 |
*You can test the model quickly by checking this [Colab notebook](https://colab.research.google.com/drive/1OGp9Wgm-oBM5BBhTLx6Qow4dNRSJZ-F5?usp=sharing)*
|
33 |
|
34 |
-
|
35 |
|
36 |
Install the following Python packages
|
37 |
|
@@ -43,8 +47,6 @@ Install the following Python packages
|
|
43 |
-----------
|
44 |
|
45 |
```python
|
46 |
-
|
47 |
-
# ==== Set configurations
|
48 |
from transformers import AutoTokenizer, AutoModelForTokenClassification
|
49 |
import torch
|
50 |
|
@@ -170,6 +172,25 @@ Output
|
|
170 |
|
171 |
Check this [notebook](https://colab.research.google.com/drive/1WUYrnmDFFEItqGMvbyjqZEJJqwU7xQR-?usp=sharing) to fine-tune the NER model
|
172 |
|
173 |
## Acknowledgment
|
174 |
|
175 |
The data on which the model was trained was prepared by a group of volunteers who spent hours refining and reviewing it.
|
|
|
1 |
+
|
2 |
---
|
3 |
language: ar
|
4 |
datasets:
|
|
|
10 |
# Tebyan تبيـان
|
11 |
## Marefa Arabic Named Entity Recognition Model
|
12 |
## Marefa Model for Classifying Parts of Text
|
13 |
+
|
14 |
+
![Marefa Arabic NER Model](/assets/marefa-tebyan-banner.png)
|
15 |
+
|
16 |
---------
|
17 |
**Version**: 1.3
|
18 |
|
|
|
35 |
|
36 |
*You can test the model quickly by checking this [Colab notebook](https://colab.research.google.com/drive/1OGp9Wgm-oBM5BBhTLx6Qow4dNRSJZ-F5?usp=sharing)*
|
37 |
|
38 |
+
----
|
39 |
|
40 |
Install the following Python packages
|
41 |
|
|
|
47 |
-----------
|
48 |
|
49 |
```python
|
|
|
|
|
50 |
from transformers import AutoTokenizer, AutoModelForTokenClassification
|
51 |
import torch
|
52 |
|
|
|
172 |
|
173 |
Check this [notebook](https://colab.research.google.com/drive/1WUYrnmDFFEItqGMvbyjqZEJJqwU7xQR-?usp=sharing) to fine-tune the NER model
|
174 |
|
175 |
+
## Evaluation
|
176 |
+
|
177 |
+
We tested the model against a test set of 1,959 sentences. The results are in the following table:
|
178 |
+
|
179 |
+
| type | f1-score | precision | recall | support |
|:-------------|-----------:|------------:|---------:|----------:|
| person | 0.93298 | 0.931479 | 0.934487 | 4335 |
| location | 0.891537 | 0.896926 | 0.886212 | 4939 |
| time | 0.873003 | 0.876087 | 0.869941 | 1853 |
| nationality | 0.871246 | 0.843153 | 0.901277 | 2350 |
| job | 0.837656 | 0.79912 | 0.880097 | 2477 |
| organization | 0.781317 | 0.773328 | 0.789474 | 2299 |
| event | 0.686695 | 0.733945 | 0.645161 | 744 |
| artwork | 0.653552 | 0.678005 | 0.630802 | 474 |
| product | 0.625483 | 0.553531 | 0.718935 | 338 |
| **weighted avg** | 0.859008 | 0.852365 | 0.86703 | 19809 |
| **micro avg** | 0.858771 | 0.850669 | 0.86703 | 19809 |
| **macro avg** | 0.79483 | 0.787286 | 0.806265 | 19809 |
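
As a reading aid, the macro and weighted F1 averages in the table can be recomputed from the per-class rows. The sketch below assumes the usual conventions: the macro average is the unweighted mean of the per-class scores, and the weighted average uses each class's support as its weight. The README does not say which tool produced the table, and the names `ROWS`, `macro_avg`, and `weighted_avg` are illustrative, not part of the model's code.

```python
# Per-class rows copied from the evaluation table:
# (type, f1-score, precision, recall, support)
ROWS = [
    ("person",       0.93298,  0.931479, 0.934487, 4335),
    ("location",     0.891537, 0.896926, 0.886212, 4939),
    ("time",         0.873003, 0.876087, 0.869941, 1853),
    ("nationality",  0.871246, 0.843153, 0.901277, 2350),
    ("job",          0.837656, 0.79912,  0.880097, 2477),
    ("organization", 0.781317, 0.773328, 0.789474, 2299),
    ("event",        0.686695, 0.733945, 0.645161, 744),
    ("artwork",      0.653552, 0.678005, 0.630802, 474),
    ("product",      0.625483, 0.553531, 0.718935, 338),
]


def macro_avg(values):
    """Unweighted mean: every entity type counts equally."""
    return sum(values) / len(values)


def weighted_avg(values, weights):
    """Mean weighted by support: frequent entity types count more."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


f1_scores = [row[1] for row in ROWS]
supports = [row[4] for row in ROWS]

total_support = sum(supports)
macro_f1 = macro_avg(f1_scores)
weighted_f1 = weighted_avg(f1_scores, supports)

print(f"total support: {total_support}")
print(f"macro F1:      {macro_f1:.6f}")
print(f"weighted F1:   {weighted_f1:.6f}")
```

Under these assumptions the recomputed values agree with the table's macro and weighted F1 rows. The gap between the two averages reflects the low-support types (event, artwork, product), which pull the macro average down more than the weighted one.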
|
193 |
+
|
194 |
## Acknowledgment
|
195 |
|
196 |
The data on which the model was trained was prepared by a group of volunteers who spent hours refining and reviewing it.
|
assets/marefa-tebyan.png
ADDED