---
language: bn
tags:
- collaborative
- bengali
- NER
license: apache-2.0
datasets: xtreme
metrics:
- Loss
- Accuracy
- Precision
- Recall
---
# sahajBERT Named Entity Recognition
## Model description
[sahajBERT](https://huggingface.co/neuropark/sahajBERT-NER) fine-tuned for named entity recognition (NER) on the Bengali split of [WikiANN](https://huggingface.co/datasets/wikiann).
Named entities predicted by the model:
| Label id | Label |
|:--------:|:----:|
|0 |O|
|1 |B-PER|
|2 |I-PER|
|3 |B-ORG|
|4 |I-ORG|
|5 |B-LOC|
|6 |I-LOC|
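
The label table above can be written as a lookup for turning predicted class ids back into tag names. This is a minimal sketch: it assumes the `B-`/`I-` prefixes follow the usual begin/inside tagging convention, and the helper names (`decode`, `split_tag`) and the example ids are hypothetical, not part of the model's code.

```python
# Label ids and names copied from the table above.
ID2LABEL = {
    0: "O",
    1: "B-PER",
    2: "I-PER",
    3: "B-ORG",
    4: "I-ORG",
    5: "B-LOC",
    6: "I-LOC",
}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def decode(ids):
    """Map a sequence of predicted class ids to their tag names."""
    return [ID2LABEL[i] for i in ids]


def split_tag(label):
    """Split 'B-PER' into ('B', 'PER'); the outside tag 'O' has no type."""
    if label == "O":
        return ("O", None)
    prefix, entity_type = label.split("-", 1)
    return (prefix, entity_type)


# Made-up ids for illustration only, not real model output.
example_ids = [1, 2, 0, 5]
print(decode(example_ids))
```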
## Intended uses & limitations
#### How to use
You can use this model directly with a pipeline for token classification:
```python
from transformers import AlbertForTokenClassification, TokenClassificationPipeline, PreTrainedTokenizerFast
# Initialize tokenizer
tokenizer = PreTrainedTokenizerFast.from_pretrained("neuropark/sahajBERT-NER")
# Initialize model
model = AlbertForTokenClassification.from_pretrained("neuropark/sahajBERT-NER")
# Initialize pipeline
pipeline = TokenClassificationPipeline(tokenizer=tokenizer, model=model)
raw_text = "এই ইউনিয়নে ৩ টি মৌজা ও ১০ টি গ্রাম আছে ।" # Change me
output = pipeline(raw_text)
```
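The pipeline returns one prediction per token, so multi-token entities usually need to be merged before use. The sketch below shows one way to do that. It assumes each prediction is a dict with `entity`, `index`, `start` and `end` keys (the shape of the non-aggregated token classification output in `transformers`) and that `entity` holds the tag names from the label table; if the model config reports generic names such as `LABEL_1`, they would need to be mapped first. The function name `group_entities` and the sample predictions are hypothetical and hand-written, not real model output.

```python
from typing import Dict, List


def group_entities(text: str, predictions: List[Dict]) -> List[Dict]:
    """Merge adjacent B-/I- token predictions of the same type into spans."""
    spans: List[Dict] = []
    last_index = None  # token position of the previous entity token
    for pred in predictions:
        label = pred["entity"]
        if label == "O":
            last_index = None
            continue
        prefix, entity_type = label.split("-", 1)
        # An I- tag extends the open span only if it directly follows it
        # and carries the same entity type.
        continues_span = (
            prefix == "I"
            and bool(spans)
            and last_index is not None
            and pred["index"] == last_index + 1
            and spans[-1]["type"] == entity_type
        )
        if continues_span:
            spans[-1]["end"] = pred["end"]
        else:
            spans.append(
                {"type": entity_type, "start": pred["start"], "end": pred["end"]}
            )
        last_index = pred["index"]
    # Recover the surface text of each span from the character offsets.
    for span in spans:
        span["text"] = text[span["start"]:span["end"]]
    return spans


# Hand-written example input, for illustration only.
text = "Kazi Nazrul lives in Dhaka"
predictions = [
    {"entity": "B-PER", "index": 1, "start": 0, "end": 4},
    {"entity": "I-PER", "index": 2, "start": 5, "end": 11},
    {"entity": "B-LOC", "index": 5, "start": 21, "end": 26},
]
for span in group_entities(text, predictions):
    print(span["type"], span["text"])
```

Recent versions of `transformers` also accept an `aggregation_strategy` argument on the token classification pipeline, which may make a hand-rolled merge like this unnecessary.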
#### Limitations and bias
<!-- Provide examples of latent issues and potential remediations. -->
WIP
## Training data
The model was initialized with the pre-trained weights of [sahajBERT](https://huggingface.co/neuropark/sahajBERT-NER) at step 19519 and trained on the Bengali split of [WikiANN](https://huggingface.co/datasets/wikiann).
## Training procedure
Coming soon!
## Eval results
| Metric | Value |
|:------:|:-----:|
|loss |0.11714419722557068|
|accuracy |0.9772286821705426|
|precision |0.9585365853658536|
|recall |0.9651277013752456|
|f1 |0.9618208516886931|
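
As a quick sanity check, the reported f1 can be recomputed from the reported precision and recall. The sketch below assumes f1 is the usual harmonic mean of the two; the card does not state which evaluation library produced the numbers.

```python
import math

# Values copied from the eval results above.
precision = 0.9585365853658536
recall = 0.9651277013752456
reported_f1 = 0.9618208516886931

# Harmonic mean of precision and recall (assumed definition of f1).
f1 = 2 * precision * recall / (precision + recall)

print(f1)
print(math.isclose(f1, reported_f1, abs_tol=1e-9))
```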
### BibTeX entry and citation info
Coming soon!
<!-- ```bibtex
@inproceedings{...,
year={2020}
}
``` -->