	# Monolingual Dutch Models for Zero-Shot Text Classification

	This family of Dutch models was fine-tuned on combined data from the (translated) [SNLI](https://nlp.stanford.edu/projects/snli/) and [SICK-NL](https://github.com/gijswijnholds/sick_nl) datasets. The models are intended for zero-shot classification of Dutch text through Hugging Face pipelines.

	## The Models

	| Base Model | Hugging Face id (fine-tuned) |
	|-------------------|---------------------|
	| [BERTje](https://huggingface.co/GroNLP/bert-base-dutch-cased) | LoicDL/bert-base-dutch-cased-finetuned-snli |
	| [RobBERT V2](http://github.com/iPieter/robbert) | LoicDL/robbert-v2-dutch-finetuned-snli |
	| [RobBERTje](https://github.com/iPieter/robbertje) | this model |



	## How to use

	While this family of models can be used to evaluate (monolingual) NLI datasets, its primary intended use is zero-shot text classification in Dutch. In this setting, classification tasks are recast as NLI problems. Consider the following sentence pair, which can be used to simulate a sentiment classification problem:

	- Premise: The food in this place was horrendous
	- Hypothesis: This is a negative review

	For more information on using Natural Language Inference models for zero-shot text classification, we refer to [this paper](https://arxiv.org/abs/1909.00161).
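As a minimal sketch of this recasting, the snippet below pairs one premise with one hypothesis per candidate label. The helper name `build_nli_pairs`, the English template, and the label list are illustrative assumptions, not part of the models or of any library; an NLI model would then score how strongly each premise entails its hypothesis.

```python
def build_nli_pairs(premise, labels, template="This is a {} review"):
    """Pair a single premise with one hypothesis per candidate label.

    Hypothetical helper for illustration only: the template and labels
    are assumptions chosen to match the sentiment example above.
    """
    return [(premise, template.format(label)) for label in labels]


pairs = build_nli_pairs(
    "The food in this place was horrendous",
    ["negative", "positive"],
)

for premise, hypothesis in pairs:
    print(f"Premise: {premise!r} | Hypothesis: {hypothesis!r}")
```

The label whose hypothesis receives the highest entailment score is taken as the predicted class.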

	All our models are compatible with the Hugging Face pipeline for zero-shot classification. They can be downloaded and used with the following code:


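	```python
	from transformers import pipeline

	# Load the fine-tuned RobBERTje model into a zero-shot classification pipeline
	classifier = pipeline(
	    task="zero-shot-classification",
	    model="LoicDL/robbertje-dutch-finetuned-snli",
	)

	# "The food in this restaurant is very tasty."
	text_piece = "Het eten in dit restaurant is heel lekker."
	# Candidate labels: positive, negative, neutral
	labels = ["positief", "negatief", "neutraal"]
	# Each label replaces {} to form one NLI hypothesis:
	# "The sentiment of this review is {}"
	template = "Het sentiment van deze review is {}"

	predictions = classifier(
	    text_piece,
	    labels,
	    multi_label=False,  # replaces the deprecated multi_class argument
	    hypothesis_template=template,
	)
	print(predictions)
	```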
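For a single input text, the zero-shot pipeline returns a dictionary with the keys `sequence`, `labels`, and `scores`, where `labels` and `scores` are aligned by position. The sketch below shows one way to read such a result. The helper `top_label`, the `min_score` threshold, and the scores in `example` are hypothetical and are not actual output of this model.

```python
def top_label(result, min_score=0.0):
    """Return (label, score) for the best-scoring label, or None if it
    scores below min_score. Hypothetical helper for illustration only."""
    labels, scores = result["labels"], result["scores"]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < min_score:
        return None
    return labels[best], scores[best]


# Made-up result in the shape the pipeline returns; the scores are NOT
# real model output and only illustrate the structure.
example = {
    "sequence": "Het eten in dit restaurant is heel lekker.",
    "labels": ["positief", "neutraal", "negatief"],
    "scores": [0.9, 0.07, 0.03],
}

print(top_label(example))
print(top_label(example, min_score=0.95))
```

A threshold of this kind can be useful when none of the candidate labels fits the text well, since the single-label setting always distributes the full probability mass over the labels provided.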


	## Model Performance


	### Performance on NLI task

	| Model | Accuracy [%] | F1 [%] |
	|-------------------|--------------------------|--------------|
	| bert-base-dutch-cased-finetuned-snli | 86.21 | 86.42 |
	| robbert-v2-dutch-finetuned-snli | 87.61 | 88.02 |
	| robbertje-dutch-finetuned-snli | 83.28 | 84.11 |



	### BibTeX entry and citation info

	If you would like to cite our paper or model, you can use the following BibTeX entry:

	```bibtex
	@article{DeLanghe_Maladry_Vanroy_DeBruyne_Singh_Lefever_2024,
	title={Benchmarking Zero-Shot Text Classification for Dutch},
	volume={13},
	url={https://www.clinjournal.org/clinj/article/view/172},
	journal={Computational Linguistics in the Netherlands Journal},
	author={De Langhe, Loic and Maladry, Aaron and Vanroy, Bram and De Bruyne, Luna and Singh, Pranaydeep and Lefever, Els and De Clercq, Orphée},
	year={2024},
	month={Mar.},
	pages={63--90} }
	```