---
language:
- sw
license: apache-2.0
datasets:
- wikiann
pipeline_tag: token-classification
examples: null
widget:
- text: Serikali imetangaza hali ya janga katika wilaya 10 za kusini ambazo zimeathiriwa zaidi na dhoruba.
  example_title: Sentence_1
- text: Faida tano za kula samaki wenye mafuta.
  example_title: Sentence_2
- text: Tahadhari yatolewa kuhusu uwezekano wa mlipuko wa Volkano DR Congo.
  example_title: Sentence_3
metrics:
- accuracy
- f1
- precision
- recall
library_name: transformers
---

## Intended uses & limitations

#### How to use

You can use this model with the Transformers *pipeline* for named entity recognition (NER).

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

# Load the fine-tuned tokenizer and token-classification model from the Hub.
model_id = "eolang/Swahili-NER-BertBase-Cased"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForTokenClassification.from_pretrained(model_id)

# Build an NER pipeline around the loaded model.
nlp = pipeline("ner", model=model, tokenizer=tokenizer)

# "Why is Kenya turning to GMO crops to cope with drought"
example = "Kwa nini Kenya inageukia mazao ya GMO kukabiliana na ukame"

# Each result is a per-token prediction with its label, score and offsets.
ner_results = nlp(example)
print(ner_results)
```

## Training data

This model was fine-tuned on the Swahili version of WikiAnn, a dataset for cross-lingual name tagging and linking based on Wikipedia articles in 295 languages.

## Training procedure

This model was trained on a single NVIDIA A5000 GPU with the hyperparameters recommended in the [original BERT paper](https://arxiv.org/pdf/1810.04805), which trained and evaluated the model on the CoNLL-2003 NER task.
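
For reference, the BERT paper recommends a small fine-tuning search space rather than a single setting: batch sizes of 16 or 32, Adam learning rates of 5e-5, 3e-5 or 2e-5, and 2 to 4 epochs. The sketch below simply enumerates that grid. Which combination was actually used for this model is not stated in this card, and the names `candidate_configs`, `BATCH_SIZES`, `LEARNING_RATES` and `NUM_EPOCHS` are hypothetical, chosen only for illustration.

```python
from itertools import product

# Fine-tuning ranges recommended in the BERT paper (Devlin et al., 2018).
# Assumption: this card does not say which combination was used for this model.
BATCH_SIZES = (16, 32)
LEARNING_RATES = (5e-5, 3e-5, 2e-5)
NUM_EPOCHS = (2, 3, 4)


def candidate_configs():
    """Return every combination of the recommended values as a dict."""
    return [
        {"batch_size": b, "learning_rate": lr, "num_epochs": e}
        for b, lr, e in product(BATCH_SIZES, LEARNING_RATES, NUM_EPOCHS)
    ]


configs = candidate_configs()
print(len(configs))  # 2 * 3 * 3 = 18 combinations
print(configs[0])
```

If you want to reproduce the fine-tuning, each dictionary could be mapped onto `transformers.TrainingArguments` (`per_device_train_batch_size`, `learning_rate`, `num_train_epochs`), with the best combination selected on the WikiAnn validation split.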