use_flash_attention_2=True
#9 by TillFetzer - opened
One question, because use_flash_attention_2=True does not work: is trust_remote_code=True the same in this context, or does it depend on a specific package version?
This should work without trust_remote_code=True. You can load the model with attn_implementation="flash_attention_2", or leave that out if you prefer not to have the flash-attn dependency.
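For reference, a minimal loading sketch assuming a recent transformers release (>= 4.36, where the attn_implementation argument is available) and flash-attn installed; the model id below is a placeholder for this repo's model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder: replace with this repo's model id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # flash attention requires fp16/bf16 weights
    attn_implementation="flash_attention_2",  # omit this line to use the default attention
)
```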
Can you briefly say which transformers version was used? I get ValueError: The following model_kwargs are not used by the model: ['attn_implementation']. I will test different versions, but maybe you can save time by stating the right one. I have transformers==4.34.0 and flash-attn==2.3.2.
It works for me when loading the model directly with use_flash_attention_2=True.
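That matches the version mismatch: attn_implementation was only added in later transformers releases, so on 4.34 the older flag is the way to enable it. A sketch under that assumption (model id again a placeholder):

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "your-org/your-model",       # placeholder: replace with this repo's model id
    torch_dtype=torch.bfloat16,  # flash attention requires fp16/bf16 weights
    use_flash_attention_2=True,  # older-style flag; newer releases prefer attn_implementation
)
```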
TillFetzer changed discussion status to closed