---
license: mit
datasets:
- numind/NuNER
library_name: gliner
language:
- en
pipeline_tag: token-classification
tags:
- entity recognition
- NER
- named entity recognition
- zero shot
- zero-shot
---
NuNerZero is a family of zero-shot entity recognition models inspired by [GLiNER](https://huggingface.co/papers/2311.08526) and built with insights we gathered throughout our work on [NuNER](https://huggingface.co/collections/numind/nuner-token-classification-and-ner-backbones-65e1f6e14639e2a465af823b).
NuNerZero span is:
* a more powerful version of GLiNER-large-v2.1, surpassing it by **+4.5% on average**
* trained on a **diverse dataset tailored for real-life use cases**, the NuNER v2.0 dataset
<p align="center">
<img src="zero_shot_performance_span.png">
</p>
## Installation & Usage
```
!pip install gliner
```
**NuNerZero requires labels to be lower-cased.**
```python
from gliner import GLiNER
model = GLiNER.from_pretrained("numind/NuNerZero_span")
# NuNerZero requires labels to be lower-cased!
labels = ["person", "award", "date", "competitions", "teams"]
labels = [l.lower() for l in labels]
# Put the text to analyze between the triple quotes.
text = """
"""
# Each predicted entity is a dict; print its text span and its label.
entities = model.predict_entities(text, labels)
for entity in entities:
    print(entity["text"], "=>", entity["label"])
```
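Since labels must be lower-cased, it can help to normalize them before calling the model and to group the predictions afterwards. The snippet below is a minimal sketch, not part of the `gliner` API: `normalize_labels` and `group_by_label` are hypothetical helper names, and the sample entities are made up for illustration. It only assumes that each prediction is a dict with `"text"` and `"label"` keys, as in the loop above.

```python
from collections import defaultdict


def normalize_labels(labels):
    """Lower-case and strip labels, dropping duplicates while keeping order."""
    seen = set()
    normalized = []
    for label in labels:
        clean = label.strip().lower()
        if clean and clean not in seen:
            seen.add(clean)
            normalized.append(clean)
    return normalized


def group_by_label(entities):
    """Collect entity texts under their predicted label."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Illustrative input only; these are not real model predictions.
labels = normalize_labels(["Person", "AWARD", " date ", "person"])
sample_entities = [
    {"text": "Ada Lovelace", "label": "person"},
    {"text": "1843", "label": "date"},
    {"text": "Charles Babbage", "label": "person"},
]
grouped = group_by_label(sample_entities)
```

In practice, `normalize_labels(...)` would be applied to the label list passed to `model.predict_entities`, and `group_by_label(...)` to the list it returns.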
## Fine-tuning
A fine-tuning script can be found [here](https://colab.research.google.com/drive/1fu15tWCi0SiQBBelwB-dUZDZu0RVfx_a?usp=sharing).
## Citation
### This work
```bibtex
@misc{bogdanov2024nuner,
title={NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data},
author={Sergei Bogdanov and Alexandre Constantin and Timothée Bernard and Benoit Crabbé and Etienne Bernard},
year={2024},
eprint={2402.15343},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
### Previous work
```bibtex
@misc{zaratiana2023gliner,
title={GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer},
author={Urchade Zaratiana and Nadi Tomeh and Pierre Holat and Thierry Charnois},
year={2023},
eprint={2311.08526},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```