---
library_name: transformers
license: llama2
datasets:
- uonlp/CulturaX
language:
- uk
- en
pipeline_tag: text-generation
---

# Llama-2-7b-Ukrainian

## Model Details

### Model Description

Llama-2-7b-Ukrainian is a bilingual pre-trained model supporting Ukrainian and English. It was produced by continued pre-training of [Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b-hf) on 5B tokens from [CulturaX](https://huggingface.co/datasets/uonlp/CulturaX), consisting of 75% Ukrainian documents and 25% English documents.

**Paper:** [To Err Is Human, but Llamas Can Learn It Too](https://arxiv.org/abs/2403.05493)

### Training Hyperparameters

| Hyperparameter | Value |
|---|---|
| Training steps | 19,080 |
| Batch size | 256 |
| Weight decay | 0.1 |
| Context length | 1024 tokens |
| Learning rate | 2e-5, linear decay to 2e-6 |
| Precision | bf16 |
| Optimizer | AdamW |

## Citation

**BibTeX:**

```bibtex
@article{luhtaru2024err,
  title={To Err Is Human, but Llamas Can Learn It Too},
  author={Luhtaru, Agnes and Purason, Taido and Vainikko, Martin and Del, Maksym and Fishel, Mark},
  journal={arXiv preprint arXiv:2403.05493},
  year={2024}
}
```
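
## Usage

A minimal loading sketch with the `transformers` library. The repository id below is an assumption inferred from the model name and is not stated in this card; substitute the actual Hub id. Loading in bf16 matches the training precision listed above.

```python
# Minimal usage sketch for a causal LM with the transformers library.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id; replace with the actual Hub id of this model.
model_id = "tartuNLP/Llama-2-7b-Ukrainian"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bf16 training precision
    device_map="auto",
)

# This is a base (pre-trained) model, so prompt it for text continuation
# rather than instruction following.
prompt = "Київ — столиця України."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```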