---
license: other
datasets:
- blip-solutions/SlovAlpaca
language:
- sk
---
# SlovAlpaca
This repository contains LoRA weights fine-tuned on a translated version of the original Alpaca dataset (see the dataset card for details).
## Training procedure
The training was done on the 7B LLaMA model (`decapoda-research/llama-7b-hf`) quantized to 8 bits, with the following hyperparameters:
```
MICRO_BATCH_SIZE = 3
BATCH_SIZE = 128
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
EPOCHS = 2 # paper uses 3
LEARNING_RATE = 2e-5 # from the original paper
CUTOFF_LEN = 256
LORA_R = 4
LORA_ALPHA = 16
LORA_DROPOUT = 0.05
```
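For context, here is a minimal sketch of how these hyperparameters would plug into a `peft` LoRA setup, assuming the common alpaca-lora recipe (the adapted `target_modules` are an assumption; the card does not state which modules were adapted):
```
# Sketch only: maps the hyperparameters above onto a peft LoraConfig.
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=LORA_R,                    # rank of the low-rank update matrices
    lora_alpha=LORA_ALPHA,       # scaling applied to the LoRA update
    lora_dropout=LORA_DROPOUT,
    target_modules=["q_proj", "v_proj"],  # assumption, per alpaca-lora
    bias="none",
    task_type="CAUSAL_LM",
)
# `base_model` is the 8-bit LLaMA model, loaded as shown in "Load model" below.
model = get_peft_model(base_model, config)
```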
The sole goal of this project is to explore the effects of single-language fine-tuning using the same dataset and methods as the original paper, and to compare the results. If you use this work, please cite the original Alpaca project:
```
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
```
## How to use
### Prerequisites
```
!pip install datasets loralib sentencepiece
!pip uninstall -y transformers
# The LLaMA classes used below come from this fork of transformers,
# pinned to the commit the adapter was developed against.
!pip install git+https://github.com/zphang/transformers@c3dc391#egg=transformers
!pip install git+https://github.com/huggingface/peft.git
!pip install bitsandbytes
```
### Load model
```
from peft import PeftModel
from transformers import LLaMATokenizer, LLaMAForCausalLM, GenerationConfig

# Note: the class names LLaMATokenizer/LLaMAForCausalLM match the pinned
# transformers fork installed above (upstream later renamed them to
# LlamaTokenizer/LlamaForCausalLM).
tokenizer = LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LLaMAForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,   # 8-bit quantization via bitsandbytes
    device_map="auto",   # spread layers across available devices
)
# Apply the SlovAlpaca LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "blip-solutions/SlovAlpaca")
```
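Before generating, it can also help to switch the model to evaluation mode (a small addition, not shown in the original card):
```
model.eval()  # disable dropout for inference
```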
### Generation
Here is a Colab notebook for inference: https://colab.research.google.com/drive/1z4aMG7tGjchLBlg_iXDuqt3sH6bQRuQk?usp=sharing
```
PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Kde žijú lamy?
### Response:"""

inputs = tokenizer(
    PROMPT,
    return_tensors="pt",
)
input_ids = inputs["input_ids"].cuda()

generation_config = GenerationConfig(
    temperature=0.6,          # lower temperature -> more focused sampling
    top_p=0.95,               # nucleus sampling threshold
    repetition_penalty=1.15,  # discourage verbatim repetition
)

print("Generating...")
generation_output = model.generate(
    input_ids=input_ids,
    generation_config=generation_config,
    return_dict_in_generate=True,
    output_scores=True,
    max_new_tokens=128,
)
for s in generation_output.sequences:
    print(tokenizer.decode(s))
```
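The prompt above uses the Alpaca template for instructions without an input field. A small helper for building prompts for both variants might look like the sketch below (the wording of the with-input variant follows the standard Alpaca template and is an assumption, since the card only shows the no-input form):
```
def generate_prompt(instruction: str, input_text: str = "") -> str:
    # No-input variant, exactly as used in the example above.
    if not input_text:
        return (
            "Below is an instruction that describes a task. "
            "Write a response that appropriately completes the request.\n"
            f"### Instruction:\n{instruction}\n"
            "### Response:"
        )
    # With-input variant: standard Alpaca wording (assumption, not shown in the card).
    return (
        "Below is an instruction that describes a task, paired with an input "
        "that provides further context. Write a response that appropriately "
        "completes the request.\n"
        f"### Instruction:\n{instruction}\n"
        f"### Input:\n{input_text}\n"
        "### Response:"
    )

PROMPT = generate_prompt("Kde žijú lamy?")
```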
### Example output
```
Generating...
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Kde žijú lamy?
### Response:
Lamy žiju v horách, na poli, alebo v lesoch.
```
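In English: the instruction asks "Where do llamas live?" and the model answers "Llamas live in the mountains, in fields, or in forests."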