# Easter-Island/coref_classifier_ancor
## Model Details

**Model Description:** This model is a token-classification model for coreference resolution in French.

- **Developed by:** Grégory Guichard
- **Model type:** Token classification
- **Language(s):** French
- **License:** MIT
- **Parent model:** See the CamemBERT-large model for more information about the underlying RoBERTa-based model.
- **Resources for more information:**
## Uses

This model can be used for coreference token classification. For each token in the input, it predicts whether that token corefers with the mention enclosed in `<>`.
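The `<>` markers have to be inserted into the raw text around the mention whose antecedents you want to resolve. A minimal sketch of that preprocessing step (the helper name `mark_mention` is illustrative, not part of this repository):

```python
def mark_mention(text: str, start: int, end: int) -> str:
    """Wrap the mention at text[start:end] in '<>' markers,
    producing the input format the classifier expects."""
    return text[:start] + "<" + text[start:end] + ">" + text[end:]

# Mark the pronoun "Il" (characters 19-21) as the mention to resolve:
mark_mention("Un homme me parle. Il est beau.", 19, 21)
# -> 'Un homme me parle. <Il> est beau.'
```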
### Example

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model = AutoModelForTokenClassification.from_pretrained("Easter-Island/coref_classifier_ancor")
tokenizer = AutoTokenizer.from_pretrained("Easter-Island/coref_classifier_ancor")
classifier = pipeline("ner", model=model, tokenizer=tokenizer)

text = "Un homme me parle. <Il> est beau."
[elem['word'] for elem in classifier(text) if elem['entity'] == 'LABEL_1']
# results
# ['▁Un', '▁homme']
```
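The pipeline returns SentencePiece word pieces, where `▁` marks the start of a word. To turn a list of pieces back into readable text, a small helper (illustrative, not part of this repository) is enough:

```python
def pieces_to_text(pieces):
    """Join SentencePiece word pieces into plain text;
    '▁' marks the beginning of a word and becomes a space."""
    return "".join(pieces).replace("▁", " ").strip()

pieces_to_text(['▁Un', '▁homme'])           # -> 'Un homme'
pieces_to_text(['▁Lionel', '▁Jos', 'pin'])  # -> 'Lionel Jospin'
```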
This coreference resolver handles a range of coreference phenomena:

### Pronominal anaphora (*reprise pronominale*)
```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model = AutoModelForTokenClassification.from_pretrained("Easter-Island/coref_classifier_ancor")
tokenizer = AutoTokenizer.from_pretrained("Easter-Island/coref_classifier_ancor")
classifier = pipeline("ner", model=model, tokenizer=tokenizer)

text = "Platon est un philosophe antique de la Grèce classique... Il reprit le travail philosophique decertains de <ses> prédécesseurs"
[elem['word'] for elem in classifier(text) if elem['entity'] == 'LABEL_1']
# results
# ['▁Platon']

text = "Platon est un philosophe antique de la Grèce classique... <Il> reprit le travail philosophique decertains de ses prédécesseurs"
[elem['word'] for elem in classifier(text) if elem['entity'] == 'LABEL_1']
# results
# ['▁Platon', '▁un', '▁philosophe', '▁antique', '▁de', '▁la', '▁Grèce', '▁classique', '▁ses']
```
### Faithful anaphora (*anaphores fidèles*)
```python
from transformers import pipeline

classifier = pipeline("ner", model=model, tokenizer=tokenizer)
text = "Le chat que j’ai adopté court partout... Mais j’aime beaucoup <ce chat> ."
[elem['word'] for elem in classifier(text) if elem['entity'] == 'LABEL_1']
# results
# ['▁Le', '▁chat']
```
### Unfaithful anaphora (*anaphores infidèles*)
```python
from transformers import pipeline

classifier = pipeline("ner", model=model, tokenizer=tokenizer)
text = "Le chat que j’ai adopté court partout... Mais j’aime beaucoup <cet animal> ."
[elem['word'] for elem in classifier(text) if elem['entity'] == 'LABEL_1']
# results
# ['▁Le', '▁chat']
```
### Reported speech (*paroles rapportées*)
```python
from transformers import pipeline

classifier = pipeline("ner", model=model, tokenizer=tokenizer)
text = """Lionel Jospin se livre en revanche à une longue analyse de son échec du 21 avril. “Ma part de responsabilité dans l’échec existe forcément. <Je> l’ai assumée en quittant la vie politique”"""
[elem['word'] for elem in classifier(text) if elem['entity'] == 'LABEL_1']
# results
# ['▁Lionel', '▁Jos', 'pin', '▁son', 'Ma']
```
### Named entities (*entités nommées*)
```python
from transformers import pipeline

classifier = pipeline("ner", model=model, tokenizer=tokenizer)
text = "Paris est située sur la Seine. <La plus grande ville de France> compte plus de 10 millions d’habitants."
[elem['word'] for elem in classifier(text) if elem['entity'] == 'LABEL_1']
# results
# ['▁Paris']
```
### Groups (*les groupes*)
```python
from transformers import pipeline

classifier = pipeline("ner", model=model, tokenizer=tokenizer)
text = "Jack et Rose commencent à faire connaissance. Ils s’entendent bien. <Le couple> se marie et a des enfants."
[elem['word'] for elem in classifier(text) if elem['entity'] == 'LABEL_1']
# results
# ['▁Jack', '▁et', '▁Rose', '▁Ils']
```
### Split groups (*groupes dispersés*)
```python
from transformers import pipeline

classifier = pipeline("ner", model=model, tokenizer=tokenizer)
text = "Pierre retrouva sa femme au restaurant. <Le couple> dina jusqu'à tard dans la nuit."
[elem['word'] for elem in classifier(text) if elem['entity'] == 'LABEL_1']
# results
# ['▁sa', '▁femme']  # this is an error: "Pierre" should also be returned
```
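When the antecedent spans several tokens, the flat list of `LABEL_1` pieces can be regrouped into contiguous mention spans using the `index` field that the `ner` pipeline attaches to each token. A sketch, assuming token indices are consecutive within a mention (the helper name `group_spans` is illustrative):

```python
def group_spans(entities):
    """Group pipeline outputs with consecutive token indices into spans."""
    spans = []
    prev_index = None
    for ent in entities:
        if prev_index is not None and ent["index"] == prev_index + 1:
            spans[-1].append(ent["word"])  # continue the current span
        else:
            spans.append([ent["word"]])    # start a new span
        prev_index = ent["index"]
    return spans

# Hypothetical pipeline output for the "Jack et Rose" example above:
ents = [{"word": "▁Jack", "index": 1}, {"word": "▁et", "index": 2},
        {"word": "▁Rose", "index": 3}, {"word": "▁Ils", "index": 10}]
group_spans(ents)  # -> [['▁Jack', '▁et', '▁Rose'], ['▁Ils']]
```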
## Risks, Limitations and Biases

## Training

### Training Data

### Training Procedure

## Evaluation

## Citation Information

## How to Get Started With the Model