---
tags:
- token-classification
- pytorch
- transformers
- named-entity-recognition
widget:
- text: Mount Fuji in Japan are example of volcanic mountain.
pipeline_tag: token-classification
metrics:
- seqeval
base_model:
- dslim/bert-base-NER
---

# bert-base-mountain-NER

This model is a specialized adaptation of [dslim/bert-base-NER](https://huggingface.co/dslim/bert-base-NER), tailored for recognizing mountain names with a focus on geographical texts. Unlike the original, this model retains all 12 hidden layers and has been specifically fine-tuned to achieve high precision in identifying mountain-related entities across diverse texts.

It is ideal for applications that involve extracting geographic information from travel literature, research documents, or any content related to natural landscapes.

## Dataset

The model was fine-tuned on approximately 150 samples created specifically for mountain name recognition. The samples were generated with the assistance of ChatGPT and annotated in the NER (BIO) format, focusing on realistic use cases for mountain-related content.

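The dataset itself is not published with this model card, so the snippet below is only an illustrative sketch of what a word-level BIO-annotated sample might look like, assuming `B-MOUNTAIN_NAME` / `I-MOUNTAIN_NAME` / `O` labels (the tag names that appear in the model's output):

```python
# Hypothetical example of a single training sample in word-level BIO format.
# The real dataset is not distributed here; tokens and tags are illustrative only.
sample = {
    "tokens": ["Climbers", "often", "attempt", "Mount", "Kilimanjaro", "in", "the", "dry", "season", "."],
    "ner_tags": ["O", "O", "O", "B-MOUNTAIN_NAME", "I-MOUNTAIN_NAME", "O", "O", "O", "O", "O"],
}
```
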
## How to Use

You can easily integrate this model with the Transformers library's NER pipeline:

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model and tokenizer
model_name = "Lizrek/bert-base-mountain-NER"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Create a pipeline for NER, running on GPU when available
nlp = pipeline("ner", model=model, tokenizer=tokenizer, device=device)

# Example usage
example = "Mount Fuji in Japan are example of volcanic mountain.."
ner_results = nlp(example)
print(ner_results)
```

## Example Output

For the above input, the model provides the following output:

```python
[{'entity': 'B-MOUNTAIN_NAME', 'score': np.float32(0.9827131), 'index': 1, 'word': 'Mount', 'start': 0, 'end': 5}, {'entity': 'I-MOUNTAIN_NAME', 'score': np.float32(0.98952174), 'index': 2, 'word': 'Fuji', 'start': 6, 'end': 10}]
```

The output lists the recognized mountain-name tokens together with the entity label, confidence score, token index, and character offsets in the input text.

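The raw pipeline output is per token, so multi-word names such as "Mount Fuji" appear as separate `B-`/`I-` entries. If you prefer one entry per mountain name, the pipeline's `aggregation_strategy` argument can merge them; the following is a minimal sketch assuming the same model and example sentence as above:

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_name = "Lizrek/bert-base-mountain-NER"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# "simple" merges consecutive B-/I- tokens of the same entity type into one span
nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

for entity in nlp("Mount Fuji in Japan are example of volcanic mountain.."):
    # Each aggregated entry exposes 'entity_group', 'score', 'word', 'start', 'end'
    print(f"{entity['word']} ({entity['start']}:{entity['end']}) -> {entity['entity_group']} ({entity['score']:.3f})")
```

With aggregation enabled you would get a single span covering "Mount Fuji" instead of two token-level entries.
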
## Limitations

- The model is specialized for mountain names and may not recognize other types of geographical entities, such as rivers or lakes.
- Accuracy may degrade on text whose style or terminology differs significantly from the training data.