---
base_model:
- dslim/bert-base-NER
pipeline_tag: token-classification
tags:
- token-classification
- pytorch
- transformers
- named-entity-recognition
metrics:
- seqeval
---

# bert-base-mountain-NER

This model is a specialized adaptation of [dslim/bert-base-NER](https://huggingface.co/dslim/bert-base-NER), tailored for recognizing mountain names with a focus on geographical texts. Unlike the original, this model retains all 12 hidden layers and has been specifically fine-tuned to achieve high precision in identifying mountain-related entities across diverse texts.

It is ideal for applications that involve extracting geographic information from travel literature, research documents, or any content related to natural landscapes.

## Dataset

The model was trained using approximately 115 samples generated specifically for mountain name recognition. These samples were created with the assistance of ChatGPT, focusing on realistic use cases for mountain-related content in the NER format.
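For illustration, a training sample in BIO token-classification format might look like the following. This sample is hypothetical (not taken from the actual dataset), but the label names match the scheme the model emits:

```python
# Hypothetical training sample in BIO format.
# B-MOUNTAIN_NAME marks the first token of a mountain name,
# I-MOUNTAIN_NAME marks continuation tokens, O marks everything else.
tokens = ["Climbers", "often", "attempt", "Mount", "Everest", "in", "spring", "."]
labels = ["O", "O", "O", "B-MOUNTAIN_NAME", "I-MOUNTAIN_NAME", "O", "O", "O"]

# Every token needs exactly one label
assert len(tokens) == len(labels)
```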

## How to Use

You can easily integrate this model with the Transformers library's NER pipeline:

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Run on GPU when available
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model and tokenizer
model_name = "Lizrek/bert-base-mountain-NER"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# Create a pipeline for NER (pass device so the setting above is actually used)
nlp = pipeline("ner", model=model, tokenizer=tokenizer, device=device)

# Example usage
example = "Mount Fuji in Japan is an example of a volcanic mountain."
ner_results = nlp(example)
print(ner_results)
```

## Example Output

For the above input, the model provides the following output:

```python
[{'entity': 'B-MOUNTAIN_NAME', 'score': np.float32(0.9827131), 'index': 1, 'word': 'Mount', 'start': 0, 'end': 5}, {'entity': 'I-MOUNTAIN_NAME', 'score': np.float32(0.98952174), 'index': 2, 'word': 'Fuji', 'start': 6, 'end': 10}]
```

This output highlights recognized mountain names, providing metadata such as the entity label, confidence score, token index, and character offsets (`start`/`end`) within the input text.
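The pipeline's built-in `aggregation_strategy="simple"` option can merge these subword/token predictions into whole entities for you. As a minimal illustrative sketch of what that merging does (the `merge_entities` helper below is not part of the model or the Transformers API), the B-/I- predictions above can be combined into a single span like this:

```python
def merge_entities(ner_results, text):
    """Merge consecutive B-/I- token predictions into full entity spans."""
    spans = []
    for tok in ner_results:
        if tok["entity"] == "O":  # skip non-entity tokens
            continue
        prefix, label = tok["entity"].split("-", 1)
        if prefix == "B" or not spans or spans[-1]["label"] != label:
            # Start a new entity span
            spans.append({"start": tok["start"], "end": tok["end"],
                          "label": label, "scores": [float(tok["score"])]})
        else:
            # I- continuation: extend the previous span
            spans[-1]["end"] = tok["end"]
            spans[-1]["scores"].append(float(tok["score"]))
    # Report the surface text and the mean confidence per span
    return [{"text": text[s["start"]:s["end"]], "label": s["label"],
             "score": sum(s["scores"]) / len(s["scores"])} for s in spans]

# Applied to the example output above (scores shown as plain floats):
results = [
    {"entity": "B-MOUNTAIN_NAME", "score": 0.9827131, "index": 1,
     "word": "Mount", "start": 0, "end": 5},
    {"entity": "I-MOUNTAIN_NAME", "score": 0.98952174, "index": 2,
     "word": "Fuji", "start": 6, "end": 10},
]
text = "Mount Fuji in Japan is an example of a volcanic mountain."
merged = merge_entities(results, text)
print(merged)  # one span: {'text': 'Mount Fuji', 'label': 'MOUNTAIN_NAME', ...}
```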

## Limitations

- The model is specialized for mountain names and may not be effective in recognizing other types of geographical entities such as rivers or lakes.
- If the input text differs significantly from the training data in style or terminology, accuracy may degrade.