arXiv:2305.15011

Bactrian-X: A Multilingual Replicable Instruction-Following Model with Low-Rank Adaptation

Published on May 24, 2023

Abstract

Instruction tuning has shown great promise in natural language processing. However, research on multilingual instruction tuning has been limited by the scarcity of high-quality instruction-response datasets. To address this gap, we present Bactrian-X, a comprehensive multilingual parallel dataset of 3.4 million instruction-response pairs across 52 languages. Leveraging this dataset, we train a set of adapters using low-rank adaptation (LoRA), lightweight components that integrate seamlessly with foundation models. These adapters have a significantly smaller parameter count than the base model, making them easy to replace and to use as plug-ins for different languages or language groups. Through extensive experiments across 52 languages, we demonstrate the superior performance of our models in various multilingual evaluation settings. Our models outperform both the vanilla base models and existing instruction-tuned models. The code and models are publicly available at https://github.com/mbzuai-nlp/bactrian-x.
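
Because the released artifacts are LoRA adapters rather than full model checkpoints, the natural usage pattern is to load a frozen base model and attach an adapter on top of it. The sketch below illustrates this with the Hugging Face transformers and peft libraries; the repository IDs and the Alpaca-style prompt template are illustrative assumptions, not names confirmed by this page (see the project repository for the actual released checkpoints).

# Minimal sketch: attaching a Bactrian-X-style LoRA adapter to a frozen base model.
# NOTE: BASE_MODEL, ADAPTER_ID, and the prompt template are assumed placeholders;
# consult https://github.com/mbzuai-nlp/bactrian-x for the released checkpoints.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "huggyllama/llama-7b"              # assumed base checkpoint (placeholder)
ADAPTER_ID = "MBZUAI/bactrian-x-llama-7b-lora"  # assumed LoRA adapter repo (placeholder)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, torch_dtype=torch.float16)

# The LoRA adapter contributes only a small fraction of the base model's parameters,
# so adapters for different languages or language groups can be swapped in cheaply.
model = PeftModel.from_pretrained(base, ADAPTER_ID)
model.eval()

# Assumed Alpaca-style instruction template.
prompt = "### Instruction:\nTranslate 'good morning' into Indonesian.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Swapping ADAPTER_ID for a different language-specific adapter reuses the same base weights, which is the plug-in behavior described in the abstract.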

Models citing this paper: 59

Datasets citing this paper: 2

Spaces citing this paper: 24

Collections including this paper: 3