
mbart-large-cc25-cnn-dailymail-xsum-nl

Model description

A fine-tuned version of mBART for Dutch summarization. We also wrote a blog post about this model here.

Intended uses & limitations

It's meant for summarizing Dutch news articles.

How to use

import transformers

undisputed_best_model = transformers.MBartForConditionalGeneration.from_pretrained(
    "ml6team/mbart-large-cc25-cnn-dailymail-xsum-nl"
)
tokenizer = transformers.MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")
summarization_pipeline = transformers.pipeline(
    task="summarization",
    model=undisputed_best_model,
    tokenizer=tokenizer,
)
# mBART starts decoding from a target-language token; set it to Dutch
summarization_pipeline.model.config.decoder_start_token_id = tokenizer.lang_code_to_id[
    "nl_XX"
]

article = "Kan je dit even samenvatten alsjeblief."  # Dutch: "Can you summarize this please."
summarization_pipeline(
    article,
    do_sample=True,
    top_p=0.75,
    top_k=50,
    min_length=50,
    early_stopping=True,
    truncation=True,
)[0]["summary_text"]
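The pipeline example above uses sampling (top-p/top-k), so its output varies between runs. As a minimal alternative sketch, the same checkpoint can be used with `model.generate` and beam search for deterministic summaries; the generation parameters below (`num_beams`, `max_length`) are illustrative assumptions, not values recommended by the model authors.

```python
import transformers

# Load the fine-tuned model and the base mBART tokenizer, as in the example above
model = transformers.MBartForConditionalGeneration.from_pretrained(
    "ml6team/mbart-large-cc25-cnn-dailymail-xsum-nl"
)
tokenizer = transformers.MBartTokenizer.from_pretrained("facebook/mbart-large-cc25")

article = "Kan je dit even samenvatten alsjeblief."  # Dutch: "Can you summarize this please."
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=1024)

# Beam search instead of sampling: deterministic output for the same input
summary_ids = model.generate(
    **inputs,
    decoder_start_token_id=tokenizer.lang_code_to_id["nl_XX"],  # force Dutch output
    num_beams=4,
    min_length=50,
    max_length=142,
    early_stopping=True,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(summary)
```

Note that this downloads the full model weights; the pipeline example is more convenient for quick experimentation.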

Training data

Fine-tuned mBART on the CNN/DailyMail and XSum datasets.
