Model Card for fr_en-t5-large
This model is a version of mT5 optimized for French and English language processing with a much smaller overall size. To achieve this, I retained only the parameters and tokens relevant to these two languages, so that performance on French and English text should remain comparable to the original mT5.
Model Details
I followed the method outlined in a blog post by David Dale to downsize the multilingual T5 (mT5) model specifically for French and English use cases. Using the giga_fren dataset, I reduced the total number of tokens, which decreased the model size by 38% and the tokenizer size by 80%.
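The vocabulary-reduction idea can be illustrated with a minimal sketch. This is not the code used to build this model: it is a toy version in plain Python, and the function names (`select_kept_ids`, `prune_embeddings`), the toy corpus, and the vocabulary sizes are all hypothetical. It assumes the general approach of counting how often each token id appears in a target-language corpus, keeping the most frequent ids plus the special tokens, and then keeping only the matching embedding rows.

```python
from collections import Counter


def select_kept_ids(tokenized_corpus, special_ids, vocab_size):
    """Return the sorted old token ids to keep (hypothetical helper).

    Special tokens are always kept; the remaining slots are filled with
    the most frequent token ids in the corpus, up to vocab_size in total.
    """
    counts = Counter(tok for sentence in tokenized_corpus for tok in sentence)
    keep = set(special_ids)
    for tok, _ in counts.most_common():
        if len(keep) >= vocab_size:
            break
        keep.add(tok)
    return sorted(keep)


def prune_embeddings(embedding_rows, kept_ids):
    """Keep only the rows for kept_ids and build an old-id -> new-id map."""
    old_to_new = {old: new for new, old in enumerate(kept_ids)}
    new_rows = [embedding_rows[old] for old in kept_ids]
    return new_rows, old_to_new


# Toy setup: an 8-token vocabulary with 2-dimensional embeddings.
embeddings = [[float(i), float(i) * 10] for i in range(8)]
# Token ids from a made-up bilingual corpus (stands in for giga_fren).
corpus = [[3, 5, 3, 7], [5, 3, 2]]

kept = select_kept_ids(corpus, special_ids=[0, 1], vocab_size=4)
small, mapping = prune_embeddings(embeddings, kept)
print(kept, len(small), mapping)
```

In a real checkpoint the same row selection would presumably have to be applied to both the input embedding matrix and the output projection, and the tokenizer vocabulary rebuilt so that its ids match the new row order.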
Model Description
- Developed by: Korventenn
- Model type: mt5
- Language(s) (NLP): French and English
- License: Apache 2.0
- Generated from model: mt5-large
Model Sources
- Repository: https://colab.research.google.com/drive/1cDWtO5BqWMm_nxnM7lHmPEKMWMejHdBJ#scrollTo=s6ebzRxA1VGv
Uses
You can use the raw model for any sequence-to-sequence task focused on French, English, or both.
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
# Load the reduced French/English tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("Korventenn/fr_en-t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("Korventenn/fr_en-t5-large")
Training Data