---
license: apache-2.0
datasets:
- irlab-udc/alpaca_data_galician
language:
- gl
- en
---
# Llama3-8B LoRA adapter for the Galician language
This repository houses a specialized LoRA (Low-Rank Adaptation) Adapter designed specifically for fine-tuning Meta's LLaMA 3-8B Instruct version for applications involving the Galician language. The purpose of this adapter is to efficiently adapt the pre-trained model, which has been initially trained on a broad range of data and languages, to better understand and generate text in Galician.
## Adapter Description
This LoRA adapter has been specifically fine-tuned to understand and generate text in Galician. It was refined using a modified version of the [irlab-udc/alpaca_data_galician](https://huggingface.co/datasets/irlab-udc/alpaca_data_galician) dataset, enriched with synthetic data to enhance its text generation and comprehension capabilities in specific contexts.
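For reference, records in Alpaca-style datasets such as this one typically follow the standard instruction/input/output layout. The example below is a hypothetical record sketched in that format (field names assumed from the standard Alpaca convention, not confirmed against this dataset):

```python
import json

# Hypothetical training example in the standard Alpaca instruction format
record = {
    "instruction": "Cal é a capital de Canadá?",  # task prompt in Galician
    "input": "",                                  # optional extra context
    "output": "A capital de Canadá é Ottawa.",    # target response
}

print(json.dumps(record, ensure_ascii=False, indent=2))
```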
### Technical Details
- **Base Model**: Unsloth's 4-bit build of Meta's LLaMA 3 8B Instruct ([unsloth/llama-3-8b-Instruct-bnb-4bit](https://huggingface.co/unsloth/llama-3-8b-Instruct-bnb-4bit))
- **Fine-Tuning Platform**: LLaMA Factory
- **Infrastructure**: Finisterrae III Supercomputer, CESGA (Galicia-Spain)
- **Dataset**: [irlab-udc/alpaca_data_galician](https://huggingface.co/datasets/irlab-udc/alpaca_data_galician) (with modifications)
- **Fine-Tuning Objective**: To improve text comprehension and generation in Galician.
### Training parameters
The project is still in the testing phase, and the training parameters will continue to be tuned in search of values that yield a more accurate model. Currently, the model is trained on **5,000 random entries** from the dataset with the following values:
- `num_train_epochs=3.0`
- `finetuning_type="lora"`
- `per_device_train_batch_size=2`
- `gradient_accumulation_steps=4`
- `lr_scheduler_type="cosine"`
- `learning_rate=5e-5`
- `max_grad_norm=1.0`
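Put together, a run with these parameters can be sketched as a single `llmtuner` call. This is a hypothetical reconstruction, not the exact script used: the dataset registration name and output directory are assumptions, while the hyperparameters come from the list above.

```python
from llmtuner import run_exp  # training entry point exposed by the llmtuner package

# Hypothetical reconstruction of the fine-tuning run
run_exp(dict(
    stage="sft",                        # supervised fine-tuning
    do_train=True,
    model_name_or_path="unsloth/llama-3-8b-Instruct-bnb-4bit",
    dataset="alpaca_data_galician",     # assumed registration name in dataset_info.json
    template="llama3",
    finetuning_type="lora",
    quantization_bit=4,
    num_train_epochs=3.0,
    per_device_train_batch_size=2,      # effective batch size per device: 2 x 4 = 8
    gradient_accumulation_steps=4,
    lr_scheduler_type="cosine",
    learning_rate=5e-5,
    max_grad_norm=1.0,
    output_dir="llama3-galician-lora",  # assumed output path
))
```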
## Testing The Adapter
Below are several question-answer pairs comparing the base Llama3 model with the Llama3 model fine-tuned for the Galician language.
*Llama3-8B* 👎
```text
User: Cal é a capital de Canadá?
Assistant: Não! Cal não é a capital de Canadá. A capital de Canadá é Ottawa.
```
*Llama3-8B-Galician-Chat-Lora* ✅ 👍
```text
User: Cal é a capital de Canadá?
Assistant: A capital de Canadá é Ottawa.
```
*Llama3-8B* 👎
```text
User: Cantos habitantes ten Galicia?
Assistant: Según el Instituto Nacional de Estadística (INE) de España, la población residente en Galicia al 1 de enero de 2021 era de aproximadamente 2.79 millones de personas.
```
*Llama3-8B-Galician-Chat-Lora* ✅ 👍
```text
User: Cantos habitantes ten Galicia?
Assistant: Segundo as últimas estimacións, Galicia ten uns 2,8 millóns de habitantes.
```
## How to Use the Adapter
To use this adapter, follow the example code provided below. Ensure you have the necessary libraries installed (e.g., Hugging Face's `transformers`).
### Installation
Download the adapter from Hugging Face:
```bash
git clone https://huggingface.co/abrahammg/Llama3-8B-Galician-Chat-Lora
```
Install dependencies:
```bash
pip install transformers bitsandbytes "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git" llmtuner xformers
```
### Run the adapter
Create a Python script (e.g. `run_model.py`):
```python
from llmtuner import ChatModel
from llmtuner.extras.misc import torch_gc

chat_model = ChatModel(dict(
    model_name_or_path="unsloth/llama-3-8b-Instruct-bnb-4bit",  # bnb-4bit-quantized Llama-3-8B-Instruct base model
    adapter_name_or_path="./",  # load the Llama3-8B-Galician-Chat-Lora adapter
    finetuning_type="lora",
    template="llama3",
    quantization_bit=4,   # load the 4-bit quantized model
    use_unsloth=True,     # use UnslothAI's LoRA optimization for 2x faster generation
))

messages = []
while True:
    query = input("\nUser: ")
    if query.strip() == "exit":
        break
    if query.strip() == "clear":
        messages = []
        torch_gc()
        print("History has been removed.")
        continue

    messages.append({"role": "user", "content": query})
    print("Assistant: ", end="", flush=True)

    response = ""
    for new_text in chat_model.stream_chat(messages):
        print(new_text, end="", flush=True)
        response += new_text
    print()
    messages.append({"role": "assistant", "content": response})

torch_gc()
```
and run it:
```bash
python run_model.py
```
# Full Merged Model 💬
You can find the adapter merged with the Llama3-8B base model in this repository: [https://huggingface.co/abrahammg/Llama3-8B-Galician-Instruct-GGUF](https://huggingface.co/abrahammg/Llama3-8B-Galician-Instruct-GGUF)
To use this model in LM Studio, simply enter the URL https://huggingface.co/abrahammg/Llama3-8B-Galician-Instruct-GGUF into the search box. For best performance, make sure to set the template to Llama 3.
Or pull it in **Ollama** with the command:
```bash
ollama run abrahammg/llama3-gl-chat
```
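Alternatively, if you prefer to import the GGUF file into Ollama yourself instead of pulling the published model, a minimal `Modelfile` sketch could look like this (the GGUF filename below is hypothetical; check the repository above for the actual file name):

```bash
# Modelfile — hypothetical local import of the merged GGUF
FROM ./llama3-8b-galician-instruct.Q4_K_M.gguf
PARAMETER stop "<|eot_id|>"
```

Then build and run it with `ollama create llama3-gl-local -f Modelfile` followed by `ollama run llama3-gl-local`.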
## Acknowledgements
- [meta-llama/llama3](https://github.com/meta-llama/llama3)
- [hiyouga/LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory)
- [irlab-udc/alpaca_data_galician](https://huggingface.co/datasets/irlab-udc/alpaca_data_galician)