QLoRA - needs to be applied to a few more places
Great work!
It seems you applied LoRA only to query_key_value, but Tim in this notebook (and in the paper) said it needs to be applied to additional modules to match full fine-tuning results. See the code below:
from peft import LoraConfig

lora_alpha = 16
lora_dropout = 0.1
lora_r = 64

peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "query_key_value",
        "dense",
        "dense_h_to_4h",
        "dense_4h_to_h",
    ],
)
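For reference, one way to sanity-check that target_modules list against the actual architecture is to enumerate the Linear modules of the loaded model. This is just a sketch; the model id "tiiuae/falcon-7b" is an illustrative example (the module names above are Falcon-style), and the leaf names will differ for other model families:

import torch
from transformers import AutoModelForCausalLM

# Example model id only; substitute the model actually being fine-tuned.
model = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b")

# Collect the leaf names of all nn.Linear modules; these are the values
# that can be passed to LoraConfig(target_modules=...).
linear_module_names = {
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, torch.nn.Linear)
}
print(sorted(linear_module_names))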
I don't think "need" is the right word. In my implementation I wanted to minimally perturb the base model, and I therefore only applied LoRA to the one layer in the attention block. I think it is up to the end user to decide how much they want to change the base model. Marking this as closed.
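For completeness, the attention-only setup referred to here amounts to roughly the following. This is a sketch, reusing the hyperparameters from the comment above for illustration; the values actually used in this repo may differ:

from peft import LoraConfig

# LoRA applied only to the fused attention projection, leaving the MLP and
# output projections of the base model untouched. Hyperparameters are
# illustrative, not the repo's defaults.
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["query_key_value"],
)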
"need" for getting similar results to regular 16fb fine-tuning. But yes, agree with the general comment.