Phi 3.5 mini ITA: a Small Language Model for Italian
Lately, I've spent some time fine-tuning language models.
Now I am happy to release Phi 3.5 mini ITA: a fine-tuned version of Phi-3.5-mini-instruct with improved performance on Italian.
- Small (3.82B parameters) but capable model
- 128k context length
Chat with it on 🤗 Spaces: anakin87/Phi-3.5-mini-ITA
Model card: anakin87/Phi-3.5-mini-ITA
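The model follows the standard chat format, so a quick local try-out could look like the sketch below (hedged: `build_messages` and the Italian system prompt are illustrative choices, not taken from the model card):

```python
def build_messages(question: str) -> list[dict]:
    """Build a chat-format prompt; the system message here is just an example."""
    return [
        {"role": "system", "content": "Sei un assistente utile. Rispondi in italiano."},
        {"role": "user", "content": question},
    ]

if __name__ == "__main__":
    # Loading the model requires `transformers` and downloads ~7.6 GB of weights.
    from transformers import pipeline

    chat = pipeline("text-generation", model="anakin87/Phi-3.5-mini-ITA", device_map="auto")
    out = chat(build_messages("Qual è la capitale d'Italia?"), max_new_tokens=128)
    print(out[0]["generated_text"][-1]["content"])
```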
Data
Supervised fine-tuning using a good mix of English and Italian data:
- mlabonne/FineTome-100k by @mlabonne
- efederici/capybara-claude-15k-ita by @efederici
Thanks to the authors for the datasets.
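Mixing the two corpora for supervised fine-tuning can be sketched as below (a hedged illustration: `mix_sft_data` is a hypothetical helper, and the post does not state the exact mixing ratio or shuffling strategy used):

```python
import random

def mix_sft_data(english_rows, italian_rows, seed=42):
    """Concatenate English and Italian SFT examples and shuffle deterministically."""
    mixed = list(english_rows) + list(italian_rows)
    random.Random(seed).shuffle(mixed)
    return mixed

if __name__ == "__main__":
    # Requires the `datasets` library; downloads the two corpora named in the post.
    from datasets import load_dataset

    en = load_dataset("mlabonne/FineTome-100k", split="train")
    it = load_dataset("efederici/capybara-claude-15k-ita", split="train")
    training_rows = mix_sft_data(en, it)
```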
Targeted training with Spectrum
I used Spectrum, a relatively new technique for parameter-efficient learning.
The idea is to train only the layers of the model with high Signal-to-Noise Ratio (SNR) and freeze the rest.
I trained the top 30% of model layers.
Spectrum paper: https://arxiv.org/abs/2406.06623
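The "train the top 30% by SNR, freeze the rest" step can be sketched as follows. This is a minimal illustration, assuming per-layer SNR values have already been computed (Spectrum derives them from the weight matrices; `select_trainable_layers` and `apply_freezing` are hypothetical helper names, not from the Spectrum codebase):

```python
def select_trainable_layers(snr_by_layer: dict, top_fraction: float = 0.3) -> set:
    """Pick the layer names with the highest SNR to keep trainable."""
    ranked = sorted(snr_by_layer, key=snr_by_layer.get, reverse=True)
    k = max(1, round(len(ranked) * top_fraction))
    return set(ranked[:k])

def apply_freezing(model, trainable_layers: set) -> None:
    """Freeze every parameter not belonging to a selected layer (PyTorch-style model)."""
    for name, param in model.named_parameters():
        param.requires_grad = any(name.startswith(layer) for layer in trainable_layers)
```

Only the selected high-SNR layers receive gradient updates, which cuts memory and compute compared to full fine-tuning while touching the weights that matter most.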
Vibe check and performance on Italian benchmarks seem encouraging.