 language:
 - en
 library_name: transformers
+---
+# Instruction-Tuned GPT-J LoRA Using the Alpaca Dataset
+The repo with all the training code can be found at https://github.com/entrpn/serving-model-cards/tree/main/instruct-tune-gptj
+This model contains the LoRA weights for GPT-J-6B. The model was fine-tuned on a T4 GPU using 8-bit quantization, gradient checkpointing, and gradient accumulation.
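
To make the gradient accumulation mentioned above concrete, here is a minimal pure-Python sketch. It is not the training code from the linked repo: the toy one-parameter model, the data, and the names `grad_mse` and `accumulated_grad` are assumptions made for illustration. It shows that averaging gradients over several equally sized micro-batches reproduces the full-batch gradient, which is what lets a small GPU emulate a larger batch.

```python
# Toy illustration of gradient accumulation (hypothetical example, not the repo's training code).
# Model: y_hat = w * x, loss: mean squared error.

def grad_mse(w, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, data, micro_batch_size):
    """Accumulate gradients over micro-batches, scaling each by 1/num_micro_batches."""
    micro_batches = [
        data[i:i + micro_batch_size] for i in range(0, len(data), micro_batch_size)
    ]
    total = 0.0
    for mb in micro_batches:
        # Dividing by the number of micro-batches mirrors scaling the loss
        # by 1/accumulation_steps before each backward pass.
        total += grad_mse(w, mb) / len(micro_batches)
    return total

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5
full = grad_mse(w, data)
accum = accumulated_grad(w, data, micro_batch_size=2)
print(full, accum)
```

The two values agree only because the micro-batches have equal size. With uneven micro-batches, each one would need to be weighted by its share of the samples.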
+Training ran on a Google Cloud Platform Compute Engine spot VM for 3k steps and cost less than $2.
+The license should follow that of [Stanford Alpaca](https://crfm.stanford.edu/2023/03/13/alpaca.html). However, you can use this method with your own dataset and not be subject to the same restrictions.
+I also want to give a shout-out to @tloen, whose code I used to generate prompts for training and inference. Please check out his repo: https://github.com/tloen/alpaca-lora
+## Generations
+First, let's look at some results:
+```text
+Instruction : How can I make friends?
+Response : The best way to make friends is to be friendly and approachable. Show interest in other people by asking them about their lives, interests, and hobbies. Make an effort to get to know people better by inviting them to social events and inviting them to join you in activities. Make an effort to get to know people better by asking them about their lives, interests, and hobbies. Show interest in other people by asking them about their lives, interests, and hobbies. Make an effort to get to know people better by inviting them to social events and inviting them to join you in activities. Make an effort to get to know people better by
+```
+```text
+Instruction : Generate an example of what a resume should like for an engineering professional
+Response : An example of a resume for an engineering professional should include the following information: name, contact information, education, professional experience, skills, and references. The education section should include a list of degrees, certifications, and professional experience. The professional experience section should include a list of projects, responsibilities, and accomplishments. The skills section should include a list of technical and non-technical skills. Finally, the references section should include a list of professional references.
+```
+```text
+Instruction : Tell me about lions.
+Response : Lions are one of the largest cats in the world. They are found in Africa, Asia, and parts of the Middle East. They are known for their powerful jaws, sharp claws, and large manes. They are also known for their ability to roar, which can be heard up to a mile away. They are also known for their ability to hunt and kill large prey such as antelopes, zebras, and wildebeests. They are also known for their ability to defend their territory and their cubs. They are also known for their ability to roar, which can be heard up to a mile away
+```
+```text
+Instruction : What are alcapas? How are alpacas different from llamas?
+Response : Alpacas and llamas are both members of the camelid family. Alpacas are smaller than llamas and have shorter legs. Alpacas are native to South America, while llamas are native to the Andes Mountains of South America. Alpacas and llamas are similar in appearance, but alpacas are smaller and have shorter legs than llamas. Alpacas and llamas are both used as pack animals, but alpacas are smaller and have shorter legs than llamas. Alpacas and llamas are both used for milk production, but alpacas are smaller and have shorter legs than
+```
+```text
+Instruction: What is a neural network?
+Response: A neural network is a type of artificial intelligence (AI) system that is inspired by the structure and function of the human brain. It is a type of machine learning algorithm that is used to learn from data and make predictions. Neural networks can be used for a variety of tasks, such as image recognition, speech recognition, natural language processing, and computer vision. They can also be used to solve problems in areas such as robotics, autonomous vehicles, and medical diagnosis. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [
+```
+## Examples
+Install dependencies.
+```shell
+conda create -n py39 python=3.9
+conda activate py39
+pip install transformers tensorboard accelerate datasets torch bitsandbytes git+https://github.com/huggingface/peft.git
+```
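
After installing, a quick check like the following can confirm that the packages are importable. This is a small convenience sketch rather than part of the original instructions. The helper name `missing_packages` is hypothetical, and it only tests importability, not version compatibility or GPU support.

```python
import importlib.util

# Import names of the packages installed above
REQUIRED = ["transformers", "tensorboard", "accelerate", "datasets", "torch", "bitsandbytes", "peft"]

def missing_packages(names):
    """Return the names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
print("missing:", missing if missing else "none")
```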
+To generate text, use the following script:
+```python
+import torch
+from peft import PeftModel, PeftConfig
+from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig
+def generate_prompt(instruction, input=None):
+    """Build the Alpaca-style prompt, with or without an input section."""
+    if input:
+        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
+### Instruction:
+{instruction}
+### Input:
+{input}
+### Response:"""
+    else:
+        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.
+### Instruction:
+{instruction}
+### Response:"""
+def evaluate(
+    model,
+    tokenizer,
+    instruction,
+    input=None,
+    temperature=0.1,
+    top_p=0.75,
+    top_k=40,
+    num_beams=4,
+    max_new_tokens=128,
+    **kwargs,
+):
+    """Generate a response and return only the text after the response marker."""
+    prompt = generate_prompt(instruction, input)
+    inputs = tokenizer(prompt, return_tensors="pt")
+    input_ids = inputs["input_ids"].to("cuda")
+    generation_config = GenerationConfig(
+        temperature=temperature,
+        top_p=top_p,
+        top_k=top_k,
+        num_beams=num_beams,
+        **kwargs,
+    )
+    with torch.no_grad():
+        generation_output = model.generate(
+            input_ids=input_ids,
+            generation_config=generation_config,
+            return_dict_in_generate=True,
+            output_scores=True,
+            max_new_tokens=max_new_tokens,
+        )
+    # Decode the first returned sequence; it contains the prompt followed by the response
+    s = generation_output.sequences[0]
+    output = tokenizer.decode(s)
+    return output.split("### Response:")[1].strip()
+peft_model_id = "jfacevedo/gptj-alpaca"
+model_name = "EleutherAI/gpt-j-6B"
+config = PeftConfig.from_pretrained(peft_model_id)
+print("loading model")
+model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", revision="float16", load_in_8bit=True)
+print("loading tokenizer")
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+# Padding token should not be required for inference, but adding it since it was added during training
+# Add pad token
+new_tokens = [""]
+# check if the tokens are already in the vocabulary
+new_tokens = set(new_tokens) - set(tokenizer.vocab.keys())
+# add the tokens to the tokenizer vocabulary
+tokenizer.add_tokens(list(new_tokens))
+# add new, random embeddings for the new tokens
+model.resize_token_embeddings(len(tokenizer))
+tokenizer.pad_token = ""
+# Example instructions; uncomment one to try it
+# instruction = "Describe the structure of an atom."
+# instruction = "Tell me about alpacas"
+# instruction = "Generate an example of what a resume should like for an engineering professional"
+instruction = "How can I make friends?"
+print("instruction: ",instruction)
+# Testing the original model
+# print("Not finetuned")
+# print("Response:", evaluate(model, tokenizer, instruction))
+# print("\n\n")
+# Load the LoRA weights on top of the base model
+print("loading lora model")
+model = PeftModel.from_pretrained(model, peft_model_id).to("cuda")
+print("Finetuned model")
+print("Response:", evaluate(model, tokenizer, instruction))
+```
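
The prompt construction and response extraction in the script can be tried without loading the model. The sketch below is illustrative only: `build_prompt` and `extract_response` are hypothetical names, the template text is copied from the instruction-only branch of `generate_prompt` above, and the `decoded` string is a stand-in for real model output.

```python
RESPONSE_MARKER = "### Response:"

def build_prompt(instruction):
    """Instruction-only prompt, copied from the template in generate_prompt."""
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n"
        "### Instruction:\n"
        f"{instruction}\n"
        f"{RESPONSE_MARKER}"
    )

def extract_response(decoded):
    """Return the text after the first response marker, as evaluate() does."""
    return decoded.split(RESPONSE_MARKER)[1].strip()

prompt = build_prompt("Tell me about lions.")
# Stand-in for tokenizer.decode(...): the decoded sequence contains the prompt,
# followed by the model's continuation.
decoded = prompt + " Lions are one of the largest cats in the world."
print(extract_response(decoded))
```

Because the decoded sequence includes the prompt itself, splitting on the marker is what separates the model's answer from the echoed instruction. If the marker were missing from the output, the `[1]` index would raise an `IndexError`.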