use_flash_attention_2=True
#9 by TillFetzer - opened
One question, because use_flash_attention_2=True does not work: is trust_remote_code=True the same in this context, or does it depend on a specific package version?
This should work without trust_remote_code=True. You can load the model with attn_implementation="flash_attention_2", or leave that out if you prefer not to have the flash-attn dependency.
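For reference, a minimal loading sketch assuming a recent transformers release (>= 4.36, where the attn_implementation argument is available) and flash-attn installed; the model id below is a placeholder for this repo's model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder: replace with this repo's model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # flash attention requires fp16/bf16 weights
    attn_implementation="flash_attention_2",  # omit this line to use the default attention
)
```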
Can you briefly say which transformers version was used? I get ValueError: The following model_kwargs are not used by the model: ['attn_implementation']. I will test different versions, but maybe you can save time by stating the right one. I have transformers==4.34.0 and flash-attn==2.3.2.
It works for me when loading the model directly with use_flash_attention_2=True.
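That matches the version mismatch: attn_implementation was only added in later transformers releases, so on 4.34 the older flag is the way to enable it. A sketch under that assumption (model id again a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "your-org/your-model",       # placeholder: replace with this repo's model id
    torch_dtype=torch.bfloat16,  # flash attention requires fp16/bf16 weights
    use_flash_attention_2=True,  # older-style flag; newer releases prefer attn_implementation
)
```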
TillFetzer changed discussion status to closed