
Model Sources

Paper: LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages (arXiv:2407.05975)

Model Description

🔥 LLaMAX-7B-X-XNLI is an NLI model with multilingual capability, obtained by fully fine-tuning the powerful multilingual model LLaMAX-7B on the MultiNLI dataset.

🔥 Compared with fine-tuning Llama-2 in the same setting, LLaMAX-7B-X-XNLI improves the average accuracy by 5.6 points on the XNLI dataset.

Experiments

| XNLI | Avg. | Sw | Ur | Hi | Th | Ar | Tr | El | Vi | Zh | Ru | Bg | De | Fr | Es | En |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Llama2-7B-X-XNLI | 70.6 | 44.6 | 55.1 | 62.2 | 58.4 | 64.7 | 64.9 | 65.6 | 75.4 | 75.9 | 78.9 | 78.6 | 80.7 | 81.7 | 83.1 | 89.5 |
| LLaMAX-7B-X-XNLI | 76.2 | 66.7 | 65.3 | 69.1 | 66.2 | 73.6 | 71.8 | 74.3 | 77.4 | 78.3 | 80.3 | 81.6 | 82.2 | 83.0 | 84.1 | 89.7 |
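
A generation-based scoring loop of the following shape can produce per-language accuracy numbers like those above. This is only a minimal sketch, assuming the Hugging Face `xnli` dataset, its standard integer-to-string label mapping, and the prompt template from the Model Usage section below; it is not the authors' evaluation script, and a small subset will not exactly reproduce the paper's numbers.

import torch
from datasets import load_dataset

# Assumed integer -> string label mapping of the HF `xnli` dataset.
ID2LABEL = {0: "Entailment", 1: "Neutral", 2: "Contradiction"}

def xnli_accuracy(model, tokenizer, lang="sw", n_examples=200):
    """Exact-match label accuracy on the first n_examples of one XNLI test split."""
    data = load_dataset("xnli", lang, split=f"test[:{n_examples}]")
    correct = 0
    for ex in data:
        # Same prompt template as the usage example below.
        query = f"Premise: {ex['premise']} Hypothesis: {ex['hypothesis']} Label:"
        inputs = tokenizer(query, return_tensors="pt").to(model.device)
        with torch.no_grad():
            out = model.generate(inputs.input_ids, max_new_tokens=5)
        # Decode only the newly generated tokens, then compare to the gold label.
        pred = tokenizer.batch_decode(
            out[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
        )[0].strip()
        correct += pred.lower().startswith(ID2LABEL[ex["label"]].lower())
    return correct / len(data)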

Model Usage

Code Example:

from transformers import AutoTokenizer, LlamaForCausalLM

# Replace the placeholders with the local paths (or Hub id) of the converted
# LLaMAX-7B-X-XNLI weights and tokenizer.
model = LlamaForCausalLM.from_pretrained(PATH_TO_CONVERTED_WEIGHTS)
tokenizer = AutoTokenizer.from_pretrained(PATH_TO_CONVERTED_TOKENIZER)

# The model expects the NLI pair in a "Premise: ... Hypothesis: ... Label:" prompt.
query = "Premise: She doesn’t really understand. Hypothesis: Actually, she doesn’t get it. Label:"
inputs = tokenizer(query, return_tensors="pt")

# Generate the label as a continuation of the prompt.
generate_ids = model.generate(inputs.input_ids, max_length=30)
tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
# => Entailment
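
For repeated queries, the snippet above can be wrapped in a small helper. The function below is a hypothetical convenience wrapper, not part of the released code; it additionally slices off the prompt tokens so that only the predicted label is decoded.

import torch

def predict_nli(model, tokenizer, premise, hypothesis, max_new_tokens=5):
    """Return the generated NLI label for one premise/hypothesis pair."""
    query = f"Premise: {premise} Hypothesis: {hypothesis} Label:"
    inputs = tokenizer(query, return_tensors="pt").to(model.device)
    with torch.no_grad():
        generate_ids = model.generate(inputs.input_ids, max_new_tokens=max_new_tokens)
    # Slice off the prompt so only the model's continuation is decoded.
    return tokenizer.batch_decode(
        generate_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
    )[0].strip()

# predict_nli(model, tokenizer, "She doesn’t really understand.", "Actually, she doesn’t get it.")
# => 'Entailment'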

Citation

If our model helps your work, please cite this paper:

@article{lu2024llamax,
  title={LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages},
  author={Lu, Yinquan and Zhu, Wenhao and Li, Lei and Qiao, Yu and Yuan, Fei},
  journal={arXiv preprint arXiv:2407.05975},
  year={2024}
}