Support for LoRA?
Hi, does this have support for LoRA training? I tried to finetune the model with this (slightly modified) code: https://github.com/tloen/alpaca-lora/blob/main/finetune.py. It started training, but the learning rate was reported as 0.0, the loss hovered between 40 and 50 (LLaMA models are usually under 2), and when the resulting adapter was loaded on top of this base model after training finished, the model gave weird outputs.
If anyone has tried finetuning with LoRA and was successful, please share how you did it. Any help is much appreciated! 👍🤝
If someone managed to train with QLoRA, please share your results
You could check the alpaca-lora code; it probably doesn't need many changes: https://github.com/tloen/alpaca-lora/blob/main/finetune.py
@zhangbo2008 as you can see in my initial message, that is exactly what I already tried; I only modified the script a bit so it uses the correct AutoModelForCausalLM and AutoTokenizer classes, etc. It doesn't seem to work correctly, something is wrong.
OK, I will try.
Alright. You'll probably get the training running, but if the training loss is that high and the learning rate shows 0.0, chances are the model is training incorrectly (the final adapter will then be useless). This might be due to an issue in modeling_RW.py, which may not be set up to support LoRA.
If it works for you, please share your version of the code. Thanks!
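In the meantime, one thing worth checking is which module names the LoRA config is actually targeting. Falcon uses a single fused query_key_value projection rather than LLaMA's separate q_proj/v_proj, so the script's lora_target_modules need to be adjusted accordingly. Below is a minimal sketch, assuming the tiiuae/falcon-7b checkpoint and the standard transformers API (not your exact script), that lists the linear-layer names:

```python
# Sketch: list Falcon's linear-layer names to decide what to pass as LoRA target modules.
# Assumes the tiiuae/falcon-7b checkpoint; adjust the model id to whatever you are using.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,        # the repo ships custom code (modeling_RW.py)
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Collect the leaf names of all linear layers; these are the candidates for target_modules.
linear_names = {
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, torch.nn.Linear)
}
print(linear_names)  # e.g. query_key_value, dense, dense_h_to_4h, dense_4h_to_h, lm_head
```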
@zhangbo2008 I found this: https://github.com/lvwerra/trl/blob/main/examples/sentiment/scripts/gpt-neox-20b_peft/clm_finetune_peft_imdb.py
Maybe slightly modifying this code could do the trick? I'm not home so I can't run the training, but this could possibly work.
@cekal you could look at this: https://huggingface.co/dfurman/falcon-7b-chat-oasst1/blob/main/finetune_falcon7b_oasst1_with_bnb_peft.ipynb
I hope it helps; I haven't tried this code myself.
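For reference, here is a rough sketch of the kind of setup that notebook uses (4-bit loading via bitsandbytes plus a PEFT LoRA adapter). The model id and hyperparameters below are placeholders rather than the notebook's exact values, and the helper names assume a reasonably recent peft version:

```python
# Rough sketch of 4-bit (QLoRA-style) loading + a LoRA adapter on Falcon-7B.
# Hyperparameters are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "tiiuae/falcon-7b"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)

# Cast norm layers to fp32 and enable input gradients so the quantized model can be trained.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["query_key_value"],  # Falcon's fused attention projection
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity check: should report a small, non-zero trainable fraction
```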
@cekal @xiao111 I can confirm that the modification shown in that notebook actually works. I was able to finetune / further train falcon-7b with an instruction-following strategy. Keep in mind that after training you need to merge the new weights back into the original model files in order to be able to load the result with trust_remote_code set to True.
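For anyone who gets to that point, here is a minimal sketch of the merge step, assuming a PEFT LoRA adapter saved locally (the paths below are placeholders):

```python
# Sketch: merge a trained LoRA adapter back into the base Falcon weights.
# Paths are placeholders; the base model is loaded un-quantized for the merge.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "tiiuae/falcon-7b"
adapter_dir = "path/to/lora-adapter"   # where the training script saved the adapter
merged_dir = "path/to/merged-model"

base = AutoModelForCausalLM.from_pretrained(
    base_id, trust_remote_code=True, torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, adapter_dir)

merged = model.merge_and_unload()      # folds the LoRA deltas into the base weights
merged.save_pretrained(merged_dir)
AutoTokenizer.from_pretrained(base_id, trust_remote_code=True).save_pretrained(merged_dir)
```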
Thanks for sharing. Unfortunately my Colab only has a 15 GB GPU and the 4-bit int mode doesn't work for me.
I have tried fine-tuning falcon 7b with qlora using axolotl, and it seems to work: https://github.com/OpenAccess-AI-Collective/axolotl/pull/132
If you encounter any issues with the config or spot any problems in it, please ping me in the PR. Thanks!
@cekal how did you get it to work in the end? I also tried modifying the alpaca-lora code by switching to AutoTokenizer & AutoModelForCausalLM and setting lora_target_modules to ["query_key_value"].
I get the error ValueError: The length of enable_lora must divide out_features.
EDIT: fixed by updating the packages.
Hi @utensil, I have compared qlora.yml and lora.yml for Falcon 7B. The only difference seems to be these fields:
load_in_8bit: true
load_in_4bit: false
optimizer: paged_adamw_32bit
Are there any other differences?
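For what it's worth, outside of axolotl those fields map roughly onto the following transformers/bitsandbytes options; this is just an illustration of what the flags control, not axolotl's actual implementation:

```python
# Illustrative mapping of the config fields onto transformers/bitsandbytes options.
# Requires a recent transformers + bitsandbytes; values are examples, not axolotl defaults.
import torch
from transformers import BitsAndBytesConfig, TrainingArguments

# load_in_8bit: true  -> plain 8-bit (LLM.int8) loading for a regular LoRA run
eight_bit = BitsAndBytesConfig(load_in_8bit=True)

# load_in_4bit: true  -> 4-bit NF4 loading, i.e. the QLoRA-style setup
four_bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# optimizer: paged_adamw_32bit -> the paged AdamW optimizer from bitsandbytes
training_args = TrainingArguments(
    output_dir="out",               # placeholder
    optim="paged_adamw_32bit",
    per_device_train_batch_size=1,
    learning_rate=2e-4,             # placeholder
)
```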
If you are interested in finetuning the models, we would recommend having a look at FalconTune (which supports finetuning in 4-bit) or at this blog post from HF, specifically the section on finetuning the model with PEFT.
Hi sumegh, which package are you referring to? I get the same error and don't know how to fix it.
I updated my CUDA version to 11.8 and re-installed all the packages exactly as in the Jupyter notebook. It worked.
@FalconLLM
Is there any literature published on the internal architecture of the decoder blocks and how they are organized? Are there any plans for a publication anytime soon?
I would like to experiment with attaching LoRA/QLoRA adapters to only certain submodules (instead of all of them) during finetuning, to get some understanding of how the self-attention block and the MLP block across the various decoder layers contribute to overall model performance.
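In case it helps, peft's LoraConfig already allows scoping the adapter to specific submodules and (in recent versions) to specific decoder layers. A minimal sketch, with module names taken from falcon-7b's custom code (modeling_RW.py) and arbitrary example layer indices:

```python
# Sketch: LoRA configs that touch only selected Falcon submodules / layers.
# Module names follow falcon-7b's modeling_RW.py; layer indices are arbitrary examples.
from peft import LoraConfig

# Adapter on the attention path only (fused QKV projection + attention output projection).
attn_only = LoraConfig(
    r=16,
    lora_alpha=32,
    task_type="CAUSAL_LM",
    target_modules=["query_key_value", "dense"],
)

# Adapter on the MLP path only, restricted to a subset of decoder blocks
# (layers_to_transform requires a reasonably recent peft release).
mlp_mid_layers = LoraConfig(
    r=16,
    lora_alpha=32,
    task_type="CAUSAL_LM",
    target_modules=["dense_h_to_4h", "dense_4h_to_h"],
    layers_to_transform=list(range(8, 16)),  # only decoder blocks 8-15
)
```

Comparing runs across such scopes should give at least a rough picture of how much each block family contributes.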