---
license: apache-2.0
datasets:
- yahma/alpaca-cleaned
language:
- en
library_name: transformers
---

# Instruction Tuned GPT-J LoRA using the Alpaca Dataset

The repo with all the training code can be found at https://github.com/entrpn/serving-model-cards/tree/main/instruct-tune-gptj

This model contains the LoRA weights for GPT-J-6B. The model was fine-tuned on a T4 GPU using 8-bit quantization, gradient checkpointing, and gradient accumulation. It was trained on a Google Cloud Platform Compute Engine spot VM for 3k steps, costing less than $2.

The license should follow that of [Stanford Alpaca](https://crfm.stanford.edu/2023/03/13/alpaca.html). However, you should be able to use this method with your own dataset and not have the same restrictions.

I also want to give a shout-out to @tloen, as I used some of his code to generate the prompts for training and inference. Please check out the author's repo: https://github.com/tloen/alpaca-lora

## Generations

First, let's look at some results.

```text
Instruction : How can I make friends?
Response : The best way to make friends is to be friendly and approachable. Show interest in other people by asking them about their lives, interests, and hobbies. Make an effort to get to know people better by inviting them to social events and inviting them to join you in activities. Make an effort to get to know people better by asking them about their lives, interests, and hobbies. Show interest in other people by asking them about their lives, interests, and hobbies. Make an effort to get to know people better by inviting them to social events and inviting them to join you in activities.
Make an effort to get to know people better by
```

```text
Instruction : Generate an example of what a resume should like for an engineering professional
Response : An example of a resume for an engineering professional should include the following information: name, contact information, education, professional experience, skills, and references. The education section should include a list of degrees, certifications, and professional experience. The professional experience section should include a list of projects, responsibilities, and accomplishments. The skills section should include a list of technical and non-technical skills. Finally, the references section should include a list of professional references.
```

```text
Instruction : Tell me about lions.
Response : Lions are one of the largest cats in the world. They are found in Africa, Asia, and parts of the Middle East. They are known for their powerful jaws, sharp claws, and large manes. They are also known for their ability to roar, which can be heard up to a mile away. They are also known for their ability to hunt and kill large prey such as antelopes, zebras, and wildebeests. They are also known for their ability to defend their territory and their cubs. They are also known for their ability to roar, which can be heard up to a mile away
```

```text
Instruction : What are alcapas? How are alpacas different from llamas?
Response : Alpacas and llamas are both members of the camelid family. Alpacas are smaller than llamas and have shorter legs. Alpacas are native to South America, while llamas are native to the Andes Mountains of South America. Alpacas and llamas are similar in appearance, but alpacas are smaller and have shorter legs than llamas. Alpacas and llamas are both used as pack animals, but alpacas are smaller and have shorter legs than llamas. Alpacas and llamas are both used for milk production, but alpacas are smaller and have shorter legs than
```

```text
Instruction: What is a neural network?
Response: A neural network is a type of artificial intelligence (AI) system that is inspired by the structure and function of the human brain. It is a type of machine learning algorithm that is used to learn from data and make predictions. Neural networks can be used for a variety of tasks, such as image recognition, speech recognition, natural language processing, and computer vision. They can also be used to solve problems in areas such as robotics, autonomous vehicles, and medical diagnosis. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [
```

## Examples

Install dependencies.

```shell
conda create -n py39 python=3.9
conda activate py39
pip install transformers tensorboard accelerate datasets torch bitsandbytes git+https://github.com/huggingface/peft.git
```

To generate, use the following:

```python
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig


def generate_prompt(instruction, input=None):
    # Build the Alpaca-style prompt, with or without an extra input field.
    if input:
        return f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:"""
    else:
        return f"""Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:"""


def evaluate(
    model,
    tokenizer,
    instruction,
    input=None,
    temperature=0.1,
    top_p=0.75,
    top_k=40,
    num_beams=4,
    max_new_tokens=128,
    **kwargs,
):
    prompt = generate_prompt(instruction, input)
    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs["input_ids"].to("cuda")
    generation_config = GenerationConfig(
        temperature=temperature,
        top_p=top_p,
        top_k=top_k,
        num_beams=num_beams,
        **kwargs,
    )
    with torch.no_grad():
        generation_output = model.generate(
            input_ids=input_ids,
            generation_config=generation_config,
            return_dict_in_generate=True,
            output_scores=True,
            max_new_tokens=max_new_tokens,
        )
    s = generation_output.sequences[0]
    output = tokenizer.decode(s)
    # Keep only the text generated after the response marker.
    return output.split("### Response:")[1].strip()


peft_model_id = "jfacevedo/gptj-alpaca"
model_name = "EleutherAI/gpt-j-6B"
config = PeftConfig.from_pretrained(peft_model_id)

print("loading model")
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", revision="float16", load_in_8bit=True
)

print("loading tokenizer")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# A padding token should not be required for inference, but it is added
# here because it was added during training.
# Set this to the pad token string used in the training code (see the training repo).
pad_token = ""
new_tokens = [pad_token]
# Check whether the tokens are already in the vocabulary.
new_tokens = set(new_tokens) - set(tokenizer.get_vocab().keys())
# Add the tokens to the tokenizer vocabulary.
tokenizer.add_tokens(list(new_tokens))
# Add new, random embeddings for the new tokens.
model.resize_token_embeddings(len(tokenizer))
tokenizer.pad_token = pad_token

# Example instructions; only the last assignment is used.
instruction = "Describe the structure of an atom."
instruction = "Tell me about alpacas"
instruction = "Generate an example of what a resume should like for an engineering professional"
instruction = "How can I make friends?"
print("instruction: ", instruction)

# Testing the original model
# print("Not finetuned")
# print("Response:", evaluate(model, tokenizer, instruction))
# print("\n\n")

# Load the LoRA model
print("loading lora model")
model = PeftModel.from_pretrained(model, peft_model_id).to("cuda")
print("Finetuned model")
print("Response:", evaluate(model, tokenizer, instruction))
```
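
The `evaluate` function above returns whatever follows the `### Response:` marker in the decoded output. That raises an `IndexError` if the marker is missing, and it can leave an end-of-text token in the result. Below is a minimal sketch of a more defensive extraction step that needs no GPU to try out. `extract_response` is a hypothetical helper, not part of the original script, and the default `eos_token` assumes the GPT-J tokenizer's `<|endoftext|>` token.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str, eos_token: str = "<|endoftext|>") -> str:
    """Return the generated answer from a decoded prompt-plus-completion string.

    Hypothetical helper for illustration; not part of the original script.
    """
    parts = decoded.split(RESPONSE_MARKER)
    # Mirror the original behaviour (take the text after the first marker),
    # but fall back to the whole string if the marker is absent.
    text = parts[1] if len(parts) > 1 else decoded
    # Drop the end-of-text token and anything generated after it.
    text = text.split(eos_token, 1)[0]
    return text.strip()


# Example with a hand-written string shaped like the decoded model output.
decoded = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nTell me about lions.\n\n"
    "### Response:\nLions are one of the largest cats in the world.<|endoftext|>"
)
print(extract_response(decoded))
# prints: Lions are one of the largest cats in the world.
```

To use it, the last line of `evaluate` could return `extract_response(output)` instead of splitting on the marker directly.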