ul2-base-dutch-simplification-mai-2023

This model is intended to simplify Dutch sentences.

This model is a fine-tuned version of yhavinga/ul2-base-dutch on the BramVanroy/chatgpt-dutch-simplification dataset.

The model was created as part of the master's thesis of Charlotte Van de Velde in the Master of Science in Artificial Intelligence (MAI) at KU Leuven in 2023. Charlotte was supervised by Vincent Vandeghinste and Bram Vanroy. Charlotte created the dataset; Bram trained the model.

Quick links

Repository: includes training code and model creation log
Dataset: BramVanroy/chatgpt-dutch-simplification
Parent model: this model was fine-tuned from yhavinga/ul2-base-dutch
Demo: shows this model in action (don't rely on the "Hosted inference API" widget on this page; it does not work very well)

Intended uses & limitations, and dataset

The model is intended for sentence-level simplification of Dutch. It might extend to document-level simplification, but most of the dataset consists of single sentences, so document-level performance is not guaranteed.
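Since the model targets single sentences, one way to handle longer text is to split it into sentences first and simplify each one separately. The sketch below illustrates that idea with the Transformers library. It is not the authors' inference code: the regex splitter is a naive assumption, the `prefix` parameter is a hypothetical placeholder in case the training setup used a source prefix (check the repository), and `max_new_tokens=128` is an arbitrary choice. `num_beams=3` follows the setting reported for the evaluation results.

```python
import re

# Model id as it appears on this page.
MODEL_NAME = "BramVanroy/ul2-base-dutch-simplification-mai-2023"


def split_sentences(text):
    """Naively split after '.', '!' or '?' followed by whitespace.

    This is an illustrative assumption, not the authors' preprocessing.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def simplify(sentences, model_name=MODEL_NAME, prefix="", num_beams=3, max_new_tokens=128):
    """Simplify a list of Dutch sentences with beam search."""
    # Imported lazily so split_sentences works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # `prefix` is a hypothetical hook; it is empty by default.
    inputs = tokenizer(
        [prefix + s for s in sentences],
        return_tensors="pt",
        padding=True,
        truncation=True,
    )
    outputs = model.generate(**inputs, num_beams=num_beams, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

A call such as `simplify(split_sentences(document))` returns one simplified string per input sentence; the first call downloads the model weights.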

The dataset was generated automatically (cf. dataset description) and has not been manually verified. In addition, we did not scrutinize the parent model or its training data. The output of this model may therefore be unexpected, as is the case for most if not all neural networks.

Because the dataset was generated with ChatGPT, this model cannot be used for commercial purposes.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.00026885245616406115
train_batch_size: 12
optimizer: Adafactor
num_epochs: 26
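As a rough illustration, the values above could be passed to the Transformers `Seq2SeqTrainingArguments` class as follows. This is a sketch, not the actual training script (which lives in the repository): it assumes the batch size is per device, and `output_dir` and `predict_with_generate=True` are choices made for this example only.

```python
# Values copied from the hyperparameter list above.
HYPERPARAMS = {
    "learning_rate": 0.00026885245616406115,
    "train_batch_size": 12,
    "optimizer": "adafactor",
    "num_epochs": 26,
}


def build_training_args(output_dir="ul2-simplification-out"):
    """Build training arguments from HYPERPARAMS (output_dir is hypothetical)."""
    # Imported lazily so the dictionary above can be inspected on its own.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        learning_rate=HYPERPARAMS["learning_rate"],
        # Assumption: the reported batch size is per device.
        per_device_train_batch_size=HYPERPARAMS["train_batch_size"],
        optim=HYPERPARAMS["optimizer"],
        num_train_epochs=HYPERPARAMS["num_epochs"],
        # Assumption: generate text during evaluation so ROUGE/SARI can be computed.
        predict_with_generate=True,
    )
```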

These hyperparameters were found through a Bayesian hyperparameter search with wandb, which is described in the repository.
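For readers unfamiliar with wandb sweeps, a Bayesian search is configured with `"method": "bayes"` plus a metric to optimize and a search space. The configuration below is only a sketch of what such a sweep can look like: the metric name, the search ranges, the project name and the `train_fn` callback are all hypothetical, and the actual setup is the one documented in the repository.

```python
# Hypothetical sweep configuration; ranges and metric name are assumptions.
SWEEP_CONFIG = {
    "method": "bayes",  # Bayesian optimization over the search space
    "metric": {"name": "eval/sari", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"distribution": "log_uniform_values", "min": 1e-5, "max": 1e-3},
        "per_device_train_batch_size": {"values": [8, 12, 16, 32]},
        "num_train_epochs": {"distribution": "int_uniform", "min": 5, "max": 40},
    },
}


def launch_sweep(train_fn, project="dutch-simplification-sweep", count=10):
    """Register the sweep and run `count` trials of the user-supplied train_fn."""
    # Imported lazily; requires a wandb account and login.
    import wandb

    sweep_id = wandb.sweep(SWEEP_CONFIG, project=project)
    wandb.agent(sweep_id, function=train_fn, count=count)
    return sweep_id
```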

Training results

eval results are on the evaluation set; predict results are on the test set. Both were obtained with beam search (num_beams=3).

{
    "eval_gen_len": 21.206349206349206,
    "eval_loss": 2.598172903060913,
    "eval_rouge1": 41.5749,
    "eval_rouge2": 19.9,
    "eval_rougeL": 36.3204,
    "eval_rougeLsum": 36.2596,
    "eval_sari": 53.0091,
  
    "predict_gen_len": 22.40625,
    "predict_loss": 2.517918586730957,
    "predict_rouge1": 44.2877,
    "predict_rouge2": 20.8132,
    "predict_rougeL": 39.0951,
    "predict_rougeLsum": 39.2709,
    "predict_sari": 52.9621
}
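Scores of this kind can be computed with the Hugging Face `evaluate` library. The sketch below is not the authors' evaluation code. It assumes one reference simplification per source sentence, and it assumes the ROUGE values reported above are on a 0 to 100 scale, so the 0 to 1 scores returned by the `rouge` metric are multiplied by 100; the `sari` metric already reports on a 0 to 100 scale.

```python
def to_percent(scores, ndigits=4):
    """Scale 0-1 scores to 0-100 and round them."""
    return {name: round(value * 100, ndigits) for name, value in scores.items()}


def compute_metrics(sources, predictions, references, ndigits=4):
    """Compute ROUGE and SARI for parallel lists of strings.

    Assumption: exactly one reference simplification per source sentence.
    """
    # Imported lazily; loading a metric may download its script on first use.
    import evaluate

    rouge = evaluate.load("rouge")
    sari = evaluate.load("sari")

    results = to_percent(
        rouge.compute(predictions=predictions, references=references), ndigits
    )
    # SARI expects a list of reference lists, one list per source sentence.
    sari_result = sari.compute(
        sources=sources,
        predictions=predictions,
        references=[[ref] for ref in references],
    )
    results["sari"] = round(sari_result["sari"], ndigits)
    return results
```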

Framework versions

Transformers 4.29.2
Pytorch 2.0.1+cu117
Datasets 2.12.0
Tokenizers 0.13.3

Model size: 248M params (Safetensors, tensor type F32)
