---
language:
- en
license: mit
tags:
- english
---

This is a version of the [google/mt5-base](https://huggingface.co/google/mt5-base) model trimmed for English only, with only some of the original embeddings kept.

* The `sentencepiece` vocabulary was shrunk from 250K to 20K tokens (the top 20K English tokens). This reduced the number of model parameters to 244M and the model size from 2.2GB to 0.9GB, 39% of the original.

The approach was taken from the article [How to adapt a multilingual T5 model for a single language](https://cointegrated.medium.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90).
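
The vocabulary reduction described above comes down to keeping a subset of rows of the embedding matrix and renumbering the surviving tokens. The following is a minimal sketch on toy data, not the script used to build this model: `shrink_embeddings`, `toy_embeddings`, and `kept` are hypothetical names, and the kept token ids are made up for illustration. For the real model, the linked article applies the same row selection to both the input embeddings and the output layer, and rebuilds the `sentencepiece` model to match.

```python
from typing import Dict, List, Sequence, Tuple


def shrink_embeddings(
    embeddings: Sequence[Sequence[float]],
    keep_ids: Sequence[int],
) -> Tuple[List[List[float]], Dict[int, int]]:
    """Keep only the rows listed in keep_ids, in that order.

    Returns the smaller matrix and a mapping from old token id to new token id.
    """
    # Copy the selected rows so the original matrix is left untouched.
    new_matrix = [list(embeddings[old_id]) for old_id in keep_ids]
    # The position of a token in keep_ids becomes its new id.
    old_to_new = {old_id: new_id for new_id, old_id in enumerate(keep_ids)}
    return new_matrix, old_to_new


# Toy "vocabulary" of 6 tokens with 2-dimensional embeddings.
toy_embeddings = [[float(i), float(i) * 10] for i in range(6)]

# Hypothetical selection: two special tokens (0, 1) plus two frequent tokens (4, 2).
kept = [0, 1, 4, 2]

small, mapping = shrink_embeddings(toy_embeddings, kept)
print(len(toy_embeddings), "->", len(small))
print(mapping)
```

Any text tokenized with the old vocabulary has to be re-encoded through `mapping` (or with the rebuilt tokenizer), since the old ids no longer point at the right rows.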