---
library_name: transformers
license: cc-by-4.0
datasets:
  - uonlp/CulturaX
---

# Model Card for LOLA v1

LOLA — An Open-Source Massively Multilingual Large Language Model

## Model Description

\* The number of parameters a model utilizes per token (Du et al., 2022). This distinction is crucial for understanding the efficiency and performance of Mixture-of-Experts (MoE) models.
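For intuition, here is a toy calculation of the total-versus-active distinction. The numbers are made up for illustration and are not LOLA's actual configuration:

```python
# Hypothetical numbers (NOT LOLA's real configuration): in a top-k routed
# Mixture-of-Experts layer, each token is processed by only k of the E
# experts, so the parameters active per token are far fewer than the total.
shared_params = 2_000_000_000  # dense (non-expert) parameters
expert_params = 500_000_000    # parameters per expert
num_experts = 16               # number of experts (E)
top_k = 1                      # experts activated per token (k)

total_params = shared_params + num_experts * expert_params
active_params = shared_params + top_k * expert_params
print(f"total: {total_params:,}  active per token: {active_params:,}")
```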

## How to Get Started with the Model

This pretrained (causal language modeling) model can be used out of the box only for text generation; for downstream tasks it requires further fine-tuning (see the loading sketch at the end of this section).

### How to use

You can use this model directly with a pipeline for text generation.

```python
>>> from transformers import pipeline

>>> generator = pipeline('text-generation', model="dice-research/lola_v1", trust_remote_code=True)
>>> generator("The quick brown fox", max_length=13)
[{'generated_text': 'The quick brown fox jumps over the lazy dog.'}]
```

To use top-k sampling, set `do_sample=True`.
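For example (sampled output varies between runs; `top_k=50` is the `transformers` default, shown explicitly here for illustration):

```python
>>> # Sampling instead of greedy decoding; top_k restricts each step to the
>>> # k most likely next tokens.
>>> generator("The quick brown fox", max_length=13, do_sample=True, top_k=50)
```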

Note: The tokenizer used by this model comes from mGPT (https://github.com/ai-forever/mgpt).
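For fine-tuning on downstream tasks, the model and its tokenizer can be loaded directly. A minimal sketch using standard `transformers` calls (an assumed setup, not prescribed by this model card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True is required because the architecture ships as
# custom code alongside the checkpoint.
model = AutoModelForCausalLM.from_pretrained("dice-research/lola_v1", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("dice-research/lola_v1", trust_remote_code=True)

# From here, standard causal-LM fine-tuning applies, e.g. with the Trainer API.
```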

## Training Details

### Training Framework

### Pretraining Dataset

The model was pretrained on the multilingual [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX) dataset (`uonlp/CulturaX`).

### LOLA v1 Training

## Citation

If you use our work in your research, please make sure to cite it:

```bibtex
@misc{srivastava2024lolaopensourcemassively,
      title={LOLA -- An Open-Source Massively Multilingual Large Language Model},
      author={Nikit Srivastava and Denis Kuchelev and Tatiana Moteu Ngoli and Kshitij Shetty and Michael Roeder and Diego Moussallem and Hamada Zahera and Axel-Cyrille Ngonga Ngomo},
      year={2024},
      eprint={2409.11272},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2409.11272}
}
```