LLaMA 3-8B Instruct - Galician Language Model
This repository hosts a variation of LLaMA 3-8B Instruct model that has been fine-tuned to understand and generate the Galician language. This model is built on Meta's LLaMA (Large Language Model Architecture) and has been fine-tuned using LLama-Factory advanced training tools to enhance its capabilities specifically for the Galician language.
Model Description
The LLaMA 3-8B Instruct model is one of the advanced versions of Meta's Large Language Model Architecture, initially designed to understand and generate human-like text across various languages. However, its proficiency in Galician is limited. This specific model has been fine-tuned to specialize in the Galician language, aiming to provide more accurate and contextually appropriate responses in Galician. This model is the result of merging the Unsloth Llama3-8B-Instruct model and the abrahammg/Llama3-8B-Galician-Chat-Lora Lora Adapter
Technical Details
- Base Model: Unsloth Meta's LLaMA 3 8B Instruct (https://huggingface.co/unsloth/llama-3-8b-Instruct-bnb-4bit)
- Fine-Tuning Platform: LLaMA Factory
- Infrastructure: Finisterrae III Supercomputer, CESGA (Galicia-Spain)
- Dataset: irlab-udc/alpaca_data_galician (with modifications)
- Fine-Tuning Objective: To improve text comprehension and generation in Galician.
Trainning parameters
The project is still in the testing phase, and the training parameters will continue to vary to find the values that result in a more accurate model. Currently, the model is trained with a set of entries from the dataset and the following values:
- num_train_epochs=3.0
- finetuning_type="lora"
- per_device_train_batch_size=2
- gradient_accumulation_steps=4
- lr_scheduler_type="cosine"
- learning_rate=5e-5
- max_grad_norm=1.0
Model tests
Below, we have several examples of question-answer pairs comparing the pure Llama3 model with the Llama3 model fine tunned for the Galician language.
Llama3-8B 👎
User: Cal é a capital de Canadá?
Assistant: Não! Cal não é a capital de Canadá. A capital de Canadá é Ottawa.
Llama3-8B-Galician-Chat-Lora ✅ 👍
User: Cal é a capital de Canadá?
Assistant: A capital de Canadá é Ottawa.
Llama3-8B 👎
User: Cantos habitantes ten Galicia?
Assistant: Según el Instituto Nacional de Estadística (INE) de España, la población residente en Galicia al 1 de enero de 2021 era de aproximadamente 2.79 millones de personas.
Llama3-8B-Galician-Chat-Lora ✅ 👍
User: Cantos habitantes ten Galicia?
Assistant: Segundo as últimas estimacións, Galicia ten uns 2,8 millóns de habitantes.
Using the Model 💬
To utilize this model within LM Studio, simply input the URL https://huggingface.co/abrahammg/Llama3-8B-Galician-Instruct-GGUF into the search box. For the best performance, ensure you set the template to LLama3.
Or pull it in Ollama with the command:
ollama run abrahammg/llama3-gl-chat
Lora Adapter
There is also a Lora Adapter for Llama3-8B-Instruct models in this repo https://huggingface.co/abrahammg/Llama3-8B-Galician-Chat-Lora
Acknowledgement
- Downloads last month
- 7