How to use PEFT+LoRA to fine-tune starchat-alpha

#17
by tonyaw - opened

I want to use PEFT+LoRA to fine-tune starchat-alpha.
I assume "target_modules" shall be set to "starcoder" according to following code:
"utils/other.py"
TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING = {
"t5": ["q", "v"],
"mt5": ["q", "v"],
"bart": ["q_proj", "v_proj"],
"gpt2": ["c_attn"],
"bloom": ["query_key_value"],
"blip-2": ["q", "v", "q_proj", "v_proj"],
"opt": ["q_proj", "v_proj"],
"gptj": ["q_proj", "v_proj"],
"gpt_neox": ["query_key_value"],
"gpt_neo": ["q_proj", "v_proj"],
"bert": ["query", "value"],
"roberta": ["query", "value"],
"xlm-roberta": ["query", "value"],
"electra": ["query", "value"],
"deberta-v2": ["query_proj", "value_proj"],
"deberta": ["in_proj"],
"layoutlm": ["query", "value"],
"llama": ["q_proj", "v_proj"],
"chatglm": ["query_key_value"],
"starcoder": ["c_attn"]

Then I got the following error:
Unhandled Exception
Traceback (most recent call last):
  File "./starcoder_train.py", line 410, in <module>
    main()
  File "./starcoder_train.py", line 329, in main
    model = get_peft_model(model, peft_config)
  File "/usr/local/lib/python3.8/dist-packages/peft/mapping.py", line 120, in get_peft_model
    return MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type](model, peft_config)
  File "/usr/local/lib/python3.8/dist-packages/peft/peft_model.py", line 691, in __init__
    super().__init__(model, peft_config, adapter_name)
  File "/usr/local/lib/python3.8/dist-packages/peft/peft_model.py", line 100, in __init__
    self.base_model = PEFT_TYPE_TO_MODEL_MAPPING[peft_config.peft_type](
  File "/usr/local/lib/python3.8/dist-packages/peft/tuners/lora.py", line 174, in __init__
    self.add_adapter(adapter_name, self.peft_config[adapter_name])
  File "/usr/local/lib/python3.8/dist-packages/peft/tuners/lora.py", line 181, in add_adapter
    self._find_and_replace(adapter_name)
  File "/usr/local/lib/python3.8/dist-packages/peft/tuners/lora.py", line 309, in _find_and_replace
    raise ValueError(
ValueError: Target modules starcoder not found in the base model. Please check the target modules and try again.

It looks like this is caused by the "weight_map" defined in pytorch_model.bin.index.json. All of the keys start with "transformer." and none of them contain "starcoder":
"weight_map": {
"lm_head.weight": "pytorch_model-00004-of-00004.bin",
"transformer.h.0.attn.c_attn.bias": "pytorch_model-00001-of-00004.bin",
"transformer.h.0.attn.c_attn.weight": "pytorch_model-00001-of-00004.bin",
"transformer.h.0.attn.c_proj.bias": "pytorch_model-00001-of-00004.bin",
"transformer.h.0.attn.c_proj.weight": "pytorch_model-00001-of-00004.bin",
"transformer.h.0.ln_1.bias": "pytorch_model-00001-of-00004.bin",
"transformer.h.0.ln_1.weight": "pytorch_model-00001-of-00004.bin",

Could you please check whether my usage is correct, or whether this is a bug?

Resolved: "starcoder" is a model-type key in the mapping above, not a module name, so target_modules has to be set to the mapping's value:

target_modules=['c_attn']
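
For reference, the same sketch as above works once target_modules uses the mapping's value (model ID and hyperparameters remain illustrative):

from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

model = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/starchat-alpha")  # placeholder model ID
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                       # illustrative rank
    lora_alpha=32,              # illustrative scaling
    lora_dropout=0.05,          # illustrative dropout
    target_modules=["c_attn"],  # the fused query/key/value projection in each attention block
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the LoRA weights should be trainable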

tonyaw changed discussion status to closed
