	---
	license: other
	datasets:
	- blip-solutions/SlovAlpaca
	language:
	- sk
	---

	# SlovAlpaca

	This repository contains the LoRA weights fine-tuned on a Slovak translation of the original Alpaca dataset (see the dataset card for more information).

	## Training procedure

	Training was done on the 7B LLaMA model (decapoda-research/llama-7b-hf), quantized to 8 bits, with the following hyperparameters:

	```
	MICRO_BATCH_SIZE = 3
	BATCH_SIZE = 128
	GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE
	EPOCHS = 2 # paper uses 3
	LEARNING_RATE = 2e-5 # from the original paper
	CUTOFF_LEN = 256
	LORA_R = 4
	LORA_ALPHA = 16
	LORA_DROPOUT = 0.05
	```
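As a quick sanity check on the values above, the following sketch works out what they imply. It assumes training on a single device (the card does not say how many were used), and the names `effective_batch_size` and `lora_scaling` are illustrative, not taken from the training script.

```python
# Values copied from the hyperparameter block above.
MICRO_BATCH_SIZE = 3
BATCH_SIZE = 128
LORA_R = 4
LORA_ALPHA = 16

# Integer division rounds down: 128 // 3 gives 42 accumulation steps.
GRADIENT_ACCUMULATION_STEPS = BATCH_SIZE // MICRO_BATCH_SIZE

# Examples contributing to one optimizer step (assumption: one device).
effective_batch_size = MICRO_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS

# LoRA scales the low-rank update by alpha / r.
lora_scaling = LORA_ALPHA / LORA_R

print(GRADIENT_ACCUMULATION_STEPS)  # 42
print(effective_batch_size)         # 126
print(lora_scaling)                 # 4.0
```

Because 128 is not divisible by 3, the effective batch size under this assumption is 126 rather than the nominal 128.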

	The sole goal of this project is to explore the effects of single-language fine-tuning, using the same dataset and methods as the original paper, and to compare the results.

	@misc{alpaca,
	author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
	title = {Stanford Alpaca: An Instruction-following LLaMA model},
	year = {2023},
	publisher = {GitHub},
	journal = {GitHub repository},
	howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
	}

	## How to use:

	### Prerequisites
	```
	!pip install datasets loralib sentencepiece
	!pip uninstall -y transformers
	!pip install git+https://github.com/zphang/transformers@c3dc391#egg=transformers
	!pip install git+https://github.com/huggingface/peft.git
	!pip install bitsandbytes
	```
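After running the install commands, a small check like the one below can confirm that the packages are importable before the model is loaded. This is only a sketch: `missing_packages` is a hypothetical helper, and it assumes each import name matches the pip package name listed above.

```python
import importlib.util

# Import names, assumed to match the pip package names listed above.
REQUIRED = ["datasets", "loralib", "sentencepiece", "transformers", "peft", "bitsandbytes"]


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing packages:", missing if missing else "none")
```

It only checks that each package can be located; it does not verify versions or that the pinned `transformers` fork is the one installed.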


	### Load model:

	```
	from peft import PeftModel
	from transformers import LLaMATokenizer, LLaMAForCausalLM, GenerationConfig

	tokenizer = LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")

	model = LLaMAForCausalLM.from_pretrained(
	    "decapoda-research/llama-7b-hf",
	    load_in_8bit=True,
	    device_map="auto",
	)

	model = PeftModel.from_pretrained(model, "blip-solutions/SlovAlpaca")
	```

	### Generation

	Here is a colab notebook for inference: https://colab.research.google.com/drive/1z4aMG7tGjchLBlg_iXDuqt3sH6bQRuQk?usp=sharing

	```
	PROMPT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.
	### Instruction:
	Kde žijú lamy?
	### Response:"""

	# Tokenize the prompt and move the input IDs to the GPU.
	inputs = tokenizer(
	    PROMPT,
	    return_tensors="pt",
	)
	input_ids = inputs["input_ids"].cuda()

	# Decoding settings passed to generate().
	generation_config = GenerationConfig(
	    temperature=0.6,
	    top_p=0.95,
	    repetition_penalty=1.15,
	)
	print("Generating...")
	generation_output = model.generate(
	    input_ids=input_ids,
	    generation_config=generation_config,
	    return_dict_in_generate=True,
	    output_scores=True,
	    max_new_tokens=128,
	)
	# Decode each generated sequence (prompt plus completion) back to text.
	for s in generation_output.sequences:
	    print(tokenizer.decode(s))
	```
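To try other instructions without editing the prompt string by hand, the template can be wrapped in a small helper. This is a minimal sketch: `PROMPT_TEMPLATE` and `build_prompt` are hypothetical names, and the template text is copied from the example prompt above rather than from the training code.

```python
# Prompt layout copied from the example above; {instruction} is the only slot.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "{instruction}\n"
    "### Response:"
)


def build_prompt(instruction: str) -> str:
    """Insert a (Slovak) instruction into the Alpaca-style prompt."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


print(build_prompt("Kde žijú lamy?"))
```

The returned string can be passed to the tokenizer in place of `PROMPT`.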

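	### Response:

	The prompt asks "Where do llamas live?" and the model answers, roughly, "Llamas live in the mountains, in the fields, or in forests."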

	```
	Generating...
	Below is an instruction that describes a task. Write a response that appropriately completes the request.
	### Instruction:
	Kde žijú lamy?
	### Response:
	Lamy žiju v horách, na poli, alebo v lesoch.
	```
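Since the decoded sequence repeats the prompt, it can be convenient to keep only the text after the response marker. The helper below is a sketch under that assumption; `extract_response` is a hypothetical name, and it splits on the first `### Response:` marker, which is the one that comes from the prompt.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str) -> str:
    """Return the text following the first response marker.

    If the marker is absent, the whole decoded string is returned stripped.
    """
    _, found, tail = decoded.partition(RESPONSE_MARKER)
    return tail.strip() if found else decoded.strip()


# Decoded text as shown in the example output above.
decoded = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n"
    "### Instruction:\n"
    "Kde žijú lamy?\n"
    "### Response:\n"
    "Lamy žiju v horách, na poli, alebo v lesoch."
)
print(extract_response(decoded))
```

In the generation loop, this would be applied to the result of `tokenizer.decode(s)`.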