---
license: apache-2.0
datasets:
- yahma/alpaca-cleaned
language:
- en
library_name: transformers
---
# Instruction Tuned GPTJ LoRA using the Alpaca Dataset

The repo with all the training code can be found at https://github.com/entrpn/serving-model-cards/tree/main/instruct-tune-gptj

This model contains the LoRA weights for GPTJ-6B. The model was fine tuned on a T4 GPU using 8-bit quantization, gradient checkpointing and gradient accumulation.

This was trained in a Google Cloud Platform Compute Engine spot VM for 3k steps, costing less than $2 dollars.

The license should follow the same as [Stanford Alpaca](https://crfm.stanford.edu/2023/03/13/alpaca.html). However, you should be able to use this method with your own dataset and not have the same restrictions.

Also want to shout out to @tloen as I used his some of his code to generate the prompts for training and inference. Please check out the author's repo https://github.com/tloen/alpaca-lora

## Generations

First let's look at some results

```text
Instruction : How can I make friends?
Response : The best way to make friends is to be friendly and approachable. Show interest in other people by asking them about their lives, interests, and hobbies. Make an effort to get to know people better by inviting them to social events and inviting them to join you in activities. Make an effort to get to know people better by asking them about their lives, interests, and hobbies. Show interest in other people by asking them about their lives, interests, and hobbies. Make an effort to get to know people better by inviting them to social events and inviting them to join you in activities. Make an effort to get to know people better by
```

```text
Instruction : Generate an example of what a resume should like for an engineering professional
Response : An example of a resume for an engineering professional should include the following information: name, contact information, education, professional experience, skills, and references. The education section should include a list of degrees, certifications, and professional experience. The professional experience section should include a list of projects, responsibilities, and accomplishments. The skills section should include a list of technical and non-technical skills. Finally, the references section should include a list of professional references.
```

```text
Instruction : Tell me about lions.
Response : Lions are one of the largest cats in the world. They are found in Africa, Asia, and parts of the Middle East. They are known for their powerful jaws, sharp claws, and large manes. They are also known for their ability to roar, which can be heard up to a mile away. They are also known for their ability to hunt and kill large prey such as antelopes, zebras, and wildebeests. They are also known for their ability to defend their territory and their cubs. They are also known for their ability to roar, which can be heard up to a mile away
```

```text
Instruction : What are alcapas? How are alpacas different from llamas?
Response : Alpacas and llamas are both members of the camelid family. Alpacas are smaller than llamas and have shorter legs. Alpacas are native to South America, while llamas are native to the Andes Mountains of South America. Alpacas and llamas are similar in appearance, but alpacas are smaller and have shorter legs than llamas. Alpacas and llamas are both used as pack animals, but alpacas are smaller and have shorter legs than llamas. Alpacas and llamas are both used for milk production, but alpacas are smaller and have shorter legs than
```

```text
Instruction: What is a neural network?
Response: A neural network is a type of artificial intelligence (AI) system that is inspired by the structure and function of the human brain. It is a type of machine learning algorithm that is used to learn from data and make predictions. Neural networks can be used for a variety of tasks, such as image recognition, speech recognition, natural language processing, and computer vision. They can also be used to solve problems in areas such as robotics, autonomous vehicles, and medical diagnosis. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [
```

## Examples

Install dependencies.

```shell
conda create -n py39 python=3.9
conda activate py39
pip install transformers tensorboard accelerate datasets torch bitsandbytes git+https://github.com/huggingface/peft.git
```

To generate, use the following

```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

def generate_prompt(instruction, input=None):
if input:
return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input}
### Response:"""
else:
return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Response:"""

def evaluate(
model,
tokenizer,
instruction,
input=None,
temperature=0.1,
top_p=0.75,
top_k=40,
num_beams=4,
max_new_tokens=128,
**kwargs,
):
prompt = generate_prompt(instruction, input)
inputs = tokenizer(prompt, return_tensors="pt")
input_ids = inputs["input_ids"].to("cuda")
generation_config = GenerationConfig(
temperature=temperature,
top_p=top_p,
top_k=top_k,
num_beams=num_beams,
**kwargs,
)
with torch.no_grad():
generation_output = model.generate(
input_ids=input_ids,
generation_config=generation_config,
return_dict_in_generate=True,
output_scores=True,
max_new_tokens=max_new_tokens,
)
s = generation_output#.sequences[0]
s = s.sequences[0]
output = tokenizer.decode(s)
return output.split("### Response:")[1].strip()

peft_model_id = "jfacevedo/gptj-alpaca"
model_name = "EleutherAI/gpt-j-6B"
config = PeftConfig.from_pretrained(peft_model_id)
print("loading model")
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto",revision="float16", load_in_8bit=True)
print("loading tokenizer")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Padding token should not be required for inference, but adding it since it was added during training
# Add pad token
new_tokens = [""]
# check if the tokens are already in the vocabulary
new_tokens = set(new_tokens) - set(tokenizer.vocab.keys())
# add the tokens to the tokenizer vocabulary
tokenizer.add_tokens(list(new_tokens))
# add new, random embeddings for the new tokens
model.resize_token_embeddings(len(tokenizer))
tokenizer.pad_token = ""

# Load the Lora model

instruction = "Describe the structure of an atom."
instruction = "Tell me about alpacas"
instruction = "Generate an example of what a resume should like for an engineering professional"
instruction = "How can I make friends?"
print("instruction: ",instruction)

# Testing the original model
# print("Not finetuned")
# print("Response:", evaluate(model, tokenizer, instruction))
# print("\n\n")

print("loading lora model")
model = PeftModel.from_pretrained(model, peft_model_id).to("cuda")

print("Finetuned model")
print("Response:", evaluate(model, tokenizer, instruction))
```