QLoRA - needs to be applied to a few more places
Great work!
It seems you applied LoRA only to query_key_value, but Tim in this notebook (and in the paper) said it needs to be applied to additional modules to match full fine-tuning results. See the code below:
from peft import LoraConfig

lora_alpha = 16
lora_dropout = 0.1
lora_r = 64

peft_config = LoraConfig(
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
    r=lora_r,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=[
        "query_key_value",
        "dense",
        "dense_h_to_4h",
        "dense_4h_to_h",
    ],
)
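For reference, one way to sanity-check that target_modules list against the actual architecture is to enumerate the Linear modules of the loaded model. This is just a sketch; the model id "tiiuae/falcon-7b" is an illustrative example (the module names above are Falcon-style), and the leaf names will differ for other model families:

import torch
from transformers import AutoModelForCausalLM

# Example model id only; substitute the model actually being fine-tuned.
model = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b")

# Collect the leaf names of all nn.Linear modules; these are the values
# that can be passed to LoraConfig(target_modules=...).
linear_module_names = {
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, torch.nn.Linear)
}
print(sorted(linear_module_names))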
I don't think "need" is the right word. In my implementation I wanted to minimally perturb the base model, and I therefore only applied LoRA to the one layer in the attention block. I think it is up to the end user to decide how much they want to change the base model. Marking this as closed.
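For completeness, the attention-only setup referred to here amounts to roughly the following. This is a sketch, reusing the hyperparameters from the comment above for illustration; the values actually used in this repo may differ:

from peft import LoraConfig

# LoRA applied only to the fused attention projection, leaving the MLP and
# output projections of the base model untouched. Hyperparameters are
# illustrative, not the repo's defaults.
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.1,
    r=64,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["query_key_value"],
)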
"need" for getting similar results to regular 16fb fine-tuning. But yes, agree with the general comment.