---
language:
- sw
license: apache-2.0
datasets:
- wikiann
pipeline_tag: token-classification
examples: null
widget:
- text: Serikali imetangaza hali ya janga katika wilaya 10 za kusini ambazo zimeathiriwa zaidi na dhoruba.
example_title: Sentence_1
- text: Asidi ya mafuta ya Omega-3 inachukuliwa kuwa muhimu kwa mwili wa binadamu.
example_title: Sentence_2
- text: Tahadhari yatolewa kuhusu uwezekano wa mlipuko wa Volkano DR Congo.
example_title: Sentence_3
metrics:
- accuracy
- f1
- precision
- recall
library_name: transformers
---
## Intended uses & limitations
#### How to use
You can use this model with the Transformers *pipeline* for NER:
```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the fine-tuned Swahili NER model and its tokenizer
tokenizer = AutoTokenizer.from_pretrained("eolang/Swahili-NER-BertBase-Cased")
model = AutoModelForTokenClassification.from_pretrained("eolang/Swahili-NER-BertBase-Cased")

# Build a token-classification pipeline and run it on a Swahili sentence
nlp = pipeline("ner", model=model, tokenizer=tokenizer)

example = "Kwa nini Kenya inageukia mazao ya GMO kukabiliana na ukame"
ner_results = nlp(example)
print(ner_results)
```
## Training data
This model was fine-tuned on the Swahili portion of WikiAnn, a dataset for cross-lingual name tagging and linking built from Wikipedia articles in 295 languages.
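The card does not show how the data was loaded. A minimal sketch, assuming the standard `datasets` workflow for the Swahili split of WikiAnn (`load_swahili_wikiann` is an illustrative helper, not part of this repository):

```python
# WikiAnn annotates three entity types (PER, ORG, LOC) in IOB2 format;
# this is the label set the token-classification head is trained over.
WIKIANN_LABELS = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]


def load_swahili_wikiann():
    """Illustrative helper: fetch the Swahili WikiAnn split.

    Requires `pip install datasets`; returns train/validation/test splits.
    """
    from datasets import load_dataset

    return load_dataset("wikiann", "sw")
```

Each example in the dataset pairs a list of tokens with a list of integer tags indexing into the label set above.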
## Training procedure
This model was trained on a single NVIDIA A5000 GPU with the recommended hyperparameters from the [original BERT paper](https://arxiv.org/pdf/1810.04805), which trained and evaluated the model on the CoNLL-2003 NER task.
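The card does not state the exact hyperparameter values chosen. For reference, the BERT paper (Appendix A.3) recommends sweeping the following fine-tuning ranges; treat these as illustrative of the search space, not as the settings actually used for this model:

```python
# Fine-tuning ranges recommended in the BERT paper (Appendix A.3).
# Illustrative only: the specific values used for this model are not
# documented in the card.
BERT_FINETUNE_GRID = {
    "batch_size": [16, 32],
    "learning_rate": [5e-5, 3e-5, 2e-5],  # Adam learning rates
    "num_epochs": [2, 3, 4],
}

# Total number of configurations in the recommended sweep
num_configs = (
    len(BERT_FINETUNE_GRID["batch_size"])
    * len(BERT_FINETUNE_GRID["learning_rate"])
    * len(BERT_FINETUNE_GRID["num_epochs"])
)
print(num_configs)  # 2 * 3 * 3 = 18
```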