Model Card for fr_en-t5-large

This model is a reduced version of mT5, optimized for French and English while minimizing overall size. To achieve this, I retained only the parameters and tokens relevant to these two languages, so that performance on French and English should remain comparable to the original mT5.

Model Details

I followed the method described in a blog post by David Dale to shrink the multilingual T5 model for French and English use cases. Using the giga_fren dataset, I reduced the total number of tokens, which decreased the model size by 38% and the tokenizer size by 80%.
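The general idea behind this kind of vocabulary reduction can be sketched on toy data. This is a minimal illustration and not the script used to build this model: the function names `select_kept_ids` and `prune_embeddings`, the toy corpus, and the `top_k` cutoff are all hypothetical. It counts how often each token id appears in a corpus, keeps the special tokens plus the most frequent ids, and keeps only the matching embedding rows.

```python
from collections import Counter


def select_kept_ids(corpus_ids, special_ids, top_k):
    """Return old token ids to keep: special tokens first, then the top_k most frequent."""
    counts = Counter(tok for sentence in corpus_ids for tok in sentence)
    kept = list(special_ids)
    seen = set(kept)
    limit = len(special_ids) + top_k
    for tok, _ in counts.most_common():
        if len(kept) >= limit:
            break
        if tok not in seen:
            kept.append(tok)
            seen.add(tok)
    return kept


def prune_embeddings(embeddings, kept_ids):
    """Keep only the rows of retained tokens; new row i is old row kept_ids[i]."""
    return [embeddings[i] for i in kept_ids]


# Toy setup (hypothetical): 8 tokens with 2-dimensional embeddings.
embeddings = [[float(i), float(i) + 0.5] for i in range(8)]
corpus_ids = [[3, 5, 3, 7], [5, 3, 2], [3, 5]]  # tokenized French/English text
special_ids = [0, 1]                            # e.g. padding and end-of-sequence

kept_ids = select_kept_ids(corpus_ids, special_ids, top_k=2)
old_to_new = {old: new for new, old in enumerate(kept_ids)}  # remap for the new tokenizer
pruned = prune_embeddings(embeddings, kept_ids)

print(kept_ids)     # ids that survive pruning
print(len(pruned))  # size of the reduced embedding table
```

In a real model, the same row selection would presumably be applied to both the input embeddings and the output projection, and the tokenizer vocabulary would be rebuilt to match the `old_to_new` mapping.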

Model Description

  • Developed by: Korventenn
  • Model type: mt5
  • Language(s) (NLP): French and English
  • License: Apache 2.0
  • Generated from model: mt5-large

Uses

You can use the raw model for any sequence-to-sequence task focused on French, English, or both.
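Since this is described as the raw model, it will presumably need fine-tuning on a task before it produces useful output. A minimal sketch of how examples might be laid out as text-to-text pairs follows. The helper `to_text_pair`, the field names, and the task prefixes are assumptions for illustration; nothing in this card prescribes them.

```python
def to_text_pair(source, target, prefix=""):
    """Build one text-to-text example; the prefix is an optional, user-chosen task tag."""
    return {
        "input_text": f"{prefix}{source}".strip(),
        "target_text": target.strip(),
    }


# Hypothetical French/English translation examples.
pairs = [
    to_text_pair("Bonjour le monde.", "Hello, world.",
                 prefix="translate French to English: "),
    to_text_pair("Good morning.", "Bonjour.",
                 prefix="translate English to French: "),
]

for pair in pairs:
    print(pair["input_text"], "->", pair["target_text"])
```

Each `input_text` would be tokenized as the encoder input and each `target_text` as the labels during fine-tuning.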

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the reduced French/English tokenizer
tokenizer = AutoTokenizer.from_pretrained("Korventenn/fr_en-t5-large")

# Load the reduced model weights
model = AutoModelForSeq2SeqLM.from_pretrained("Korventenn/fr_en-t5-large")

Training Data

giga_fren

Model size: 810M params (Safetensors, tensor type F32)