french-camembert-postag-model / README.md

Migrate model card from transformers-repo

20dcf94 almost 4 years ago

4.95 kB

	---
	language: fr
	widget:
	- text: "Face à un choc inédit, les mesures mises en place par le gouvernement ont permis une protection forte et efficace des ménages"
	---

	## About

	The french-camembert-postag-model is a part of speech tagging model for French that was trained on the free-french-treebank dataset available on
	[github](https://github.com/nicolashernandez/free-french-treebank). The base tokenizer and model used for training is 'camembert-base'.

	## Supported Tags

	It uses the following tags:

	\| Tag \| Category \| Extra Info \|
	\|----------\|:------------------------------:\|------------:\|
	\| ADJ \| adjectif \| \|
	\| ADJWH \| adjectif \| \|
	\| ADV \| adverbe \| \|
	\| ADVWH \| adverbe \| \|
	\| CC \| conjonction de coordination \| \|
	\| CLO \| pronom \| obj \|
	\| CLR \| pronom \| refl \|
	\| CLS \| pronom \| suj \|
	\| CS \| conjonction de subordination \| \|
	\| DET \| déterminant \| \|
	\| DETWH \| déterminant \| \|
	\| ET \| mot étranger \| \|
	\| I \| interjection \| \|
	\| NC \| nom commun \| \|
	\| NPP \| nom propre \| \|
	\| P \| préposition \| \|
	\| P+D \| préposition + déterminant \| \|
	\| PONCT \| signe de ponctuation \| \|
	\| PREF \| préfixe \| \|
	\| PRO \| autres pronoms \| \|
	\| PROREL \| autres pronoms \| rel \|
	\| PROWH \| autres pronoms \| int \|
	\| U \| ? \| \|
	\| V \| verbe \| \|
	\| VIMP \| verbe imperatif \| \|
	\| VINF \| verbe infinitif \| \|
	\| VPP \| participe passé \| \|
	\| VPR \| participe présent \| \|
	\| VS \| subjonctif \| \|

	More information on the tags can be found here:

	http://alpage.inria.fr/statgram/frdep/Publications/crabbecandi-taln2008-final.pdf

	## Usage

	The usage of this model follows the common transformers patterns. Here is a short example of its usage:

	```python
	from transformers import AutoTokenizer, AutoModelForTokenClassification

	tokenizer = AutoTokenizer.from_pretrained("gilf/french-camembert-postag-model")
	model = AutoModelForTokenClassification.from_pretrained("gilf/french-camembert-postag-model")

	from transformers import pipeline

	nlp_token_class = pipeline('ner', model=model, tokenizer=tokenizer, grouped_entities=True)

	nlp_token_class('Face à un choc inédit, les mesures mises en place par le gouvernement ont permis une protection forte et efficace des ménages')
	```

	The lines above would display something like this on a Jupyter notebook:

	```
	[{'entity_group': 'NC', 'score': 0.5760144591331482, 'word': '<s>'},
	{'entity_group': 'U', 'score': 0.9946700930595398, 'word': 'Face'},
	{'entity_group': 'P', 'score': 0.999615490436554, 'word': 'à'},
	{'entity_group': 'DET', 'score': 0.9995906352996826, 'word': 'un'},
	{'entity_group': 'NC', 'score': 0.9995531439781189, 'word': 'choc'},
	{'entity_group': 'ADJ', 'score': 0.999183714389801, 'word': 'inédit'},
	{'entity_group': 'P', 'score': 0.3710663616657257, 'word': ','},
	{'entity_group': 'DET', 'score': 0.9995903968811035, 'word': 'les'},
	{'entity_group': 'NC', 'score': 0.9995649456977844, 'word': 'mesures'},
	{'entity_group': 'VPP', 'score': 0.9988670349121094, 'word': 'mises'},
	{'entity_group': 'P', 'score': 0.9996246099472046, 'word': 'en'},
	{'entity_group': 'NC', 'score': 0.9995329976081848, 'word': 'place'},
	{'entity_group': 'P', 'score': 0.9996233582496643, 'word': 'par'},
	{'entity_group': 'DET', 'score': 0.9995935559272766, 'word': 'le'},
	{'entity_group': 'NC', 'score': 0.9995369911193848, 'word': 'gouvernement'},
	{'entity_group': 'V', 'score': 0.9993771314620972, 'word': 'ont'},
	{'entity_group': 'VPP', 'score': 0.9991101026535034, 'word': 'permis'},
	{'entity_group': 'DET', 'score': 0.9995885491371155, 'word': 'une'},
	{'entity_group': 'NC', 'score': 0.9995636343955994, 'word': 'protection'},
	{'entity_group': 'ADJ', 'score': 0.9991781711578369, 'word': 'forte'},
	{'entity_group': 'CC', 'score': 0.9991298317909241, 'word': 'et'},
	{'entity_group': 'ADJ', 'score': 0.9992275238037109, 'word': 'efficace'},
	{'entity_group': 'P+D', 'score': 0.9993300437927246, 'word': 'des'},
	{'entity_group': 'NC', 'score': 0.8353511393070221, 'word': 'ménages</s>'}]
	```