SlovAlpaca
This repository contains LoRA weights fine-tuned on a translated version of the original Alpaca dataset (more information is on the dataset card).
Training procedure
Training was done on the 7B LLaMA model (decapoda-research/llama-7b-hf), quantized to 8 bits, with the following hyperparameters:
MICRO_BATCH_SIZE = 3
BATCH_SIZE = 128
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
EPOCHS = 2 # paper uses 3
LEARNING_RATE = 2e-5 # from the original paper
CUTOFF_LEN = 256
LORA_R = 4
LORA_ALPHA = 16
LORA_DROPOUT = 0.05
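Because 128 is not divisible by 3, the integer division in GRADIENT_ACCUMULATION_STEPS rounds down, so the batch actually used per optimizer step is slightly smaller than BATCH_SIZE. The sketch below only reworks the arithmetic from the values listed above. The names `effective_batch_size` and `lora_scale` are introduced here for illustration, and the `alpha / r` scaling is the usual LoRA convention, assumed rather than stated in this repository.

```python
# Values copied from the hyperparameter list above.
MICRO_BATCH_SIZE = 3
BATCH_SIZE = 128
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
LORA_R = 4
LORA_ALPHA = 16

# Number of examples that contribute to one optimizer step.
# (Illustrative name, not part of the training script.)
effective_batch_size = GRADIENT_ACCUMULATION_STEPS * MICRO_BATCH_SIZE

# Assumption: the LoRA update is scaled by alpha / r, as in the usual formulation.
lora_scale = LORA_ALPHA / LORA_R

print("gradient accumulation steps:", GRADIENT_ACCUMULATION_STEPS)
print("effective batch size:", effective_batch_size)
print("LoRA scaling factor:", lora_scale)
```

With these values the accumulation count is 42 and the effective batch is 126 examples, two fewer than the nominal 128.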
The sole goal of this project is to explore the effects of single-language fine-tuning, using the same dataset and methods as the original paper, and to compare the results.
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA model},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
How to use:
Prerequisites
!pip install datasets loralib sentencepiece
!pip uninstall -y transformers
!pip install git+https://github.com/zphang/transformers@c3dc391#egg=transformers
!pip install git+https://github.com/huggingface/peft.git
!pip install bitsandbytes
Load model:
from peft import PeftModel
from transformers import LLaMATokenizer, LLaMAForCausalLM, GenerationConfig
tokenizer = LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LLaMAForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "blip-solutions/SlovAlpaca")
Generation
A Colab notebook for inference is available here: https://colab.research.google.com/drive/1z4aMG7tGjchLBlg_iXDuqt3sH6bQRuQk?usp=sharing
PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Kde žijú lamy?
### Response:"""
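# Tokenize the prompt and move the input IDs to the GPU.
inputs = tokenizer(
    PROMPT,
    return_tensors="pt",
)
input_ids = inputs["input_ids"].cuda()
# Note: temperature and top_p only take effect when sampling is enabled
# (do_sample=True); with the default settings, decoding is greedy.
generation_config = GenerationConfig(
    temperature=0.6,
    top_p=0.95,
    repetition_penalty=1.15,
)
print("Generating...")
generation_output = model.generate(
    input_ids=input_ids,
    generation_config=generation_config,
    return_dict_in_generate=True,
    output_scores=True,
    max_new_tokens=128,
)
# Decode each generated sequence (prompt plus completion) back to text.
for s in generation_output.sequences:
    print(tokenizer.decode(s))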
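Response (the prompt asks "Where do llamas live?"; the answer reads "Llamas live in the mountains, in the field, or in forests."):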
Generating...
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Kde žijú lamy?
### Response:
Lamy žiju v horách, na poli, alebo v lesoch.
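As the output above shows, the decoded sequence contains the prompt as well as the model's answer. A minimal sketch for building prompts from an instruction and keeping only the answer might look like the following. The helpers `build_prompt` and `extract_response` are hypothetical names introduced here, not part of this repository, and the template uses the same single line breaks as the prompt printed above.

```python
# Prompt layout copied from the example above; {instruction} is the only variable part.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "{instruction}\n"
    "### Response:"
)

RESPONSE_MARKER = "### Response:"


def build_prompt(instruction: str) -> str:
    """Fill the instruction into the prompt template (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker (hypothetical helper).

    If the marker is missing, the whole decoded string is returned stripped.
    """
    _, sep, tail = decoded.partition(RESPONSE_MARKER)
    if not sep:
        return decoded.strip()
    return tail.strip()


if __name__ == "__main__":
    prompt = build_prompt("Kde žijú lamy?")
    # Stand-in for tokenizer.decode(...) output: the prompt followed by the answer.
    decoded = prompt + "\nLamy žiju v horách, na poli, alebo v lesoch."
    print(extract_response(decoded))
```

In the generation loop, `extract_response(tokenizer.decode(s))` would then print only the answer line instead of the full prompt.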