Support for LoRA?
Hi, does this have support for LoRA training? I tried to finetune the model with this (slightly modified) code: https://github.com/tloen/alpaca-lora/blob/main/finetune.py. It started training, but the learning rate was reported as 0.0, the loss hovered between 40 and 50 (LLaMA models are usually under 2), and when the resulting adapter was loaded on top of this base model after training finished, the model gave weird outputs.
If anyone has tried finetuning with LoRA and was successful, please share how you did it. Any help is much appreciated! 👍🤝
If someone managed to train with QLoRA, please share your results
You could check the alpaca-lora code; it probably doesn't need many changes: https://github.com/tloen/alpaca-lora/blob/main/finetune.py
@zhangbo2008 as you can see in my initial message, that is exactly what I already tried; I only modified the script a bit so it uses the correct AutoModelForCausalLM and AutoTokenizer classes, etc. It doesn't seem to work correctly, something is wrong.
OK, I will try.
Alright. You'll probably get the training running, but if the training loss is that high and the learning rate shows 0.0, chances are the model is training incorrectly (the final adapter will then be useless). This might be due to an issue in modeling_RW.py, which may not be set up to support LoRA.
If it works for you, please share your version of the code. Thanks!
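In the meantime, one thing worth checking is which module names the LoRA config is actually targeting. Falcon uses a single fused query_key_value projection rather than LLaMA's separate q_proj/v_proj, so the script's lora_target_modules need to be adjusted accordingly. Below is a minimal sketch, assuming the tiiuae/falcon-7b checkpoint and the standard transformers API (not your exact script), that lists the linear-layer names:

```python
# Sketch: list Falcon's linear-layer names to decide what to pass as LoRA target modules.
# Assumes the tiiuae/falcon-7b checkpoint; adjust the model id to whatever you are using.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,        # the repo ships custom code (modeling_RW.py)
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Collect the leaf names of all linear layers; these are the candidates for target_modules.
linear_names = {
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, torch.nn.Linear)
}
print(linear_names)  # e.g. query_key_value, dense, dense_h_to_4h, dense_4h_to_h, lm_head
```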
@zhangbo2008 I found this: https://github.com/lvwerra/trl/blob/main/examples/sentiment/scripts/gpt-neox-20b_peft/clm_finetune_peft_imdb.py
Maybe slightly modifying this code could do the trick? I'm not home so I can't run the training, but this could possibly work.
@cekal you could look at this: https://huggingface.co/dfurman/falcon-7b-chat-oasst1/blob/main/finetune_falcon7b_oasst1_with_bnb_peft.ipynb
I hope it helps; I haven't tried this code myself.
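For reference, here is a rough sketch of the kind of setup that notebook uses (4-bit loading via bitsandbytes plus a PEFT LoRA adapter). The model id and hyperparameters below are placeholders rather than the notebook's exact values, and the helper names assume a reasonably recent peft version:

```python
# Rough sketch of 4-bit (QLoRA-style) loading + a LoRA adapter on Falcon-7B.
# Hyperparameters are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "tiiuae/falcon-7b"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)

# Cast norm layers to fp32 and enable input gradients so the quantized model can be trained.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["query_key_value"],  # Falcon's fused attention projection
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity check: should report a small, non-zero trainable fraction
```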
@cekal @xiao111 I can confirm that the modification shown in that notebook actually works. I was able to finetune / further train falcon-7b with an instruction-following strategy. Keep in mind that after training you need to merge the new weights back into the original model files in order to be able to load the result with trust_remote_code set to True.
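For anyone who gets to that point, here is a minimal sketch of the merge step, assuming a PEFT LoRA adapter saved locally (the paths below are placeholders):

```python
# Sketch: merge a trained LoRA adapter back into the base Falcon weights.
# Paths are placeholders; the base model is loaded un-quantized for the merge.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "tiiuae/falcon-7b"
adapter_dir = "path/to/lora-adapter"   # where the training script saved the adapter
merged_dir = "path/to/merged-model"

base = AutoModelForCausalLM.from_pretrained(
    base_id, trust_remote_code=True, torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, adapter_dir)

merged = model.merge_and_unload()      # folds the LoRA deltas into the base weights
merged.save_pretrained(merged_dir)
AutoTokenizer.from_pretrained(base_id, trust_remote_code=True).save_pretrained(merged_dir)
```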
Thanks for sharing. Unfortunately my Colab only has a 15 GB GPU and the 4-bit int mode doesn't work for me.
I have tried fine-tuning falcon 7b with qlora using axolotl, and it seems to work: https://github.com/OpenAccess-AI-Collective/axolotl/pull/132
If you encounter any issues with the config or spot any problems in it, please ping me in the PR. Thanks!
@cekal how did you get it to work in the end? I also tried modifying the alpaca-lora code by switching to AutoTokenizer & AutoModelForCausalLM and setting lora_target_modules to ["query_key_value"].
I get the error ValueError: The length of enable_lora must divide out_features.
EDIT: fixed by updating the packages.
Hi @utensil, I have compared qlora.yml and lora.yml for Falcon 7B. The only difference seems to be these fields:
load_in_8bit: true
load_in_4bit: false
optimizer: paged_adamw_32bit
Are there any other differences?
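For what it's worth, outside of axolotl those fields map roughly onto the following transformers/bitsandbytes options; this is just an illustration of what the flags control, not axolotl's actual implementation:

```python
# Illustrative mapping of the config fields onto transformers/bitsandbytes options.
# Requires a recent transformers + bitsandbytes; values are examples, not axolotl defaults.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments

# load_in_8bit: true  -> plain 8-bit (LLM.int8) loading for a regular LoRA run
eight_bit = BitsAndBytesConfig(load_in_8bit=True)

# load_in_4bit: true  -> 4-bit NF4 loading, i.e. the QLoRA-style setup
four_bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# optimizer: paged_adamw_32bit -> the paged AdamW optimizer from bitsandbytes
training_args = TrainingArguments(
    output_dir="out",               # placeholder
    optim="paged_adamw_32bit",
    per_device_train_batch_size=1,
    learning_rate=2e-4,             # placeholder
)
```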
If you are interested in finetuning the models, we would recommend having a look at FalconTune (which supports finetuning in 4-bit) or at this blog post from HF, specifically the section on finetuning the model with PEFT.
Hi sumegh, which package are you referring to? I get the same error and don't know how to fix it.
I updated my CUDA version to 11.8 and re-installed all the packages exactly as in the Jupyter notebook. It worked.
@FalconLLM
Is there any literature published on the internal architecture of the decoder blocks and how they are organized? Are there any plans for a publication anytime soon?
I would like to experiment with attaching LoRA/QLoRA adapters to only certain submodules (instead of all of them) during finetuning, to get some understanding of how the self-attention block and the MLP block across the various decoder layers contribute to overall model performance.
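In case it helps, peft's LoraConfig already allows scoping the adapter to specific submodules and (in recent versions) to specific decoder layers. A minimal sketch, with module names taken from falcon-7b's custom code (modeling_RW.py) and arbitrary example layer indices:

```python
# Sketch: LoRA configs that touch only selected Falcon submodules / layers.
# Module names follow falcon-7b's modeling_RW.py; layer indices are arbitrary examples.
from peft import LoraConfig

# Adapter on the attention path only (fused QKV projection + attention output projection).
attn_only = LoraConfig(
    r=16,
    lora_alpha=32,
    task_type="CAUSAL_LM",
    target_modules=["query_key_value", "dense"],
)

# Adapter on the MLP path only, restricted to a subset of decoder blocks
# (layers_to_transform requires a reasonably recent peft release).
mlp_mid_layers = LoraConfig(
    r=16,
    lora_alpha=32,
    task_type="CAUSAL_LM",
    target_modules=["dense_h_to_4h", "dense_4h_to_h"],
    layers_to_transform=list(range(8, 16)),  # only decoder blocks 8-15
)
```

Comparing runs across such scopes should give at least a rough picture of how much each block family contributes.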