MaLA-500: Massive Language Adaptation of Large Language Models
Abstract
Large language models have advanced the state of the art in natural language processing. However, their predominant design for English or a limited set of languages leaves a substantial gap in their effectiveness for low-resource languages. To bridge this gap, we introduce MaLA-500, a novel large language model designed to cover an extensive range of 534 languages. To train MaLA-500, we employ vocabulary extension and continued pretraining on LLaMA 2 with Glot500-c. Our experiments on SIB-200 show that MaLA-500 achieves state-of-the-art in-context learning results. We release MaLA-500 at https://huggingface.co/MaLA-LM.
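As a rough illustration of the vocabulary-extension step mentioned above, the sketch below uses the Hugging Face transformers API to add new tokens to a LLaMA 2 tokenizer and resize the embedding matrices before continued pretraining. The checkpoint name and the added token strings are placeholders for illustration, not the paper's actual configuration.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder base checkpoint; per the abstract, MaLA-500 starts from LLaMA 2.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Hypothetical new subword pieces covering additional scripts and languages;
# in practice the extended vocabulary would come from the multilingual corpus
# (e.g. Glot500-c), not a hand-written list.
new_tokens = ["example_piece_1", "example_piece_2"]
tokenizer.add_tokens(new_tokens)

# Grow the input/output embedding matrices to match the extended vocabulary,
# then continue pretraining on the multilingual data.
model.resize_token_embeddings(len(tokenizer))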
Community
The following papers were recommended by the Semantic Scholar API as similar to this paper:
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer (2024)
- PersianMind: A Cross-Lingual Persian-English Large Language Model (2024)
- A Simple Framework to Accelerate Multilingual Language Model for Monolingual Text Generation (2024)
- Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed? (2023)
- LangBridge: Multilingual Reasoning Without Multilingual Supervision (2024)