---
language:
- fr
tags:
- token-classification
- fill-mask
license: mit
datasets:
- iit-cdip
---

This model combines the camembert-base model with the pretrained LiLT checkpoint from the paper "LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding".

Original repository: https://github.com/jpWang/LiLT

To use it, fork the modeling and configuration files from the original repository and load the pretrained model through the corresponding classes (LiLTRobertaLikeConfig, LiLTRobertaLikeForRelationExtraction, LiLTRobertaLikeForTokenClassification, LiLTRobertaLikeModel).
They can also be registered with the AutoConfig/AutoModel factories, as follows:

```python
from transformers import AutoConfig, AutoModel, AutoModelForTokenClassification

from path_to_custom_classes import (
    LiLTRobertaLikeConfig,
    LiLTRobertaLikeForRelationExtraction,
    LiLTRobertaLikeForTokenClassification,
    LiLTRobertaLikeModel,
)


def patch_transformers():
    # Register the custom classes so the Auto* factories can resolve
    # the "liltrobertalike" model type from the checkpoint config.
    AutoConfig.register("liltrobertalike", LiLTRobertaLikeConfig)
    AutoModel.register(LiLTRobertaLikeConfig, LiLTRobertaLikeModel)
    AutoModelForTokenClassification.register(LiLTRobertaLikeConfig, LiLTRobertaLikeForTokenClassification)
    # etc...
```
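
Once `patch_transformers()` has run, the Auto* factories can resolve the custom architecture. A minimal sanity check, assuming the checkpoint's `config.json` declares `"model_type": "liltrobertalike"` (the key registered above):

```python
patch_transformers()

# Should resolve to the custom config class rather than raising an error.
config = AutoConfig.from_pretrained("manu/lilt-camembert-base")
print(type(config).__name__)  # expected: LiLTRobertaLikeConfig
```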

The model can then be loaded with:

```python
from transformers import AutoModel, AutoModelForTokenClassification, AutoTokenizer

# patch_transformers() must have been executed beforehand

tokenizer = AutoTokenizer.from_pretrained("camembert-base")

# Base model, e.g. for feature extraction:
model = AutoModel.from_pretrained("manu/lilt-camembert-base")

# Or, to be fine-tuned on a token classification task:
model = AutoModelForTokenClassification.from_pretrained("manu/lilt-camembert-base")
```
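
For a quick end-to-end check, here is a minimal sketch of a forward pass. It assumes the custom model follows the original repository's RoBERTa-like interface, taking a `bbox` tensor of per-token (x0, y0, x1, y1) boxes with coordinates normalized to the 0-1000 range; the zero boxes below are placeholders, and in practice they would come from an OCR engine.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# patch_transformers() must have been executed beforehand
tokenizer = AutoTokenizer.from_pretrained("camembert-base")
model = AutoModel.from_pretrained("manu/lilt-camembert-base")

encoding = tokenizer("Bonjour le monde", return_tensors="pt")
seq_len = encoding["input_ids"].shape[1]

# Dummy layout input: one (x0, y0, x1, y1) box per token (placeholder values).
encoding["bbox"] = torch.zeros((1, seq_len, 4), dtype=torch.long)

with torch.no_grad():
    outputs = model(**encoding)

# Assuming a RoBERTa-like output with hidden states:
print(outputs.last_hidden_state.shape)  # (1, seq_len, hidden_size)
```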