RuntimeError: FlashAttention is not installed.
Hi, can you tell me how to disable flash_attn?
model = SentenceTransformer(
    "jinaai/jina-embeddings-v3",
    device=device,
    trust_remote_code=True,
    model_kwargs={"default_task": "text-matching"},
)
...
trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=train_loss,
    evaluator=dev_evaluator,
)
trainer.train()
RuntimeError: FlashAttention is not installed. To proceed with training, please install FlashAttention. For inference, you have two options: either install FlashAttention or disable it by setting use_flash_attn=False when loading the model.
Sentence Transformers v3.2
Hi @seregadgl, you need to have FlashAttention installed if you want to train the model; you can only disable it during inference.
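For inference-only use, here is a minimal sketch of disabling it at load time. Two assumptions that this thread does not confirm: that sentence-transformers (>= 3.1) forwards config_kwargs into the model config, and that the model's remote code reads a use_flash_attn flag from that config, as the error message suggests. Check the model card for the officially supported option.

from sentence_transformers import SentenceTransformer

# Sketch: load for inference with FlashAttention disabled.
# The `config_kwargs` routing and the `use_flash_attn` flag are assumptions;
# the error message above only says use_flash_attn=False can be set at load time.
model = SentenceTransformer(
    "jinaai/jina-embeddings-v3",
    trust_remote_code=True,
    model_kwargs={"default_task": "text-matching"},
    config_kwargs={"use_flash_attn": False},  # inference only; training still needs flash-attn
)

embeddings = model.encode(["A quick smoke test."])
print(embeddings.shape)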
Thanks for the answer! Could you tell me which version of FlashAttention to install so that I can fine-tune the model in Google Colab on a T4 GPU? Thanks!
It seems you also need to install other dependencies (e.g. triton). If you look at the rotary.py file, you can see that the RuntimeError: FlashAttention is not installed exception is raised when from flash_attn.ops.triton.rotary import apply_rotary fails. That import requires both flash-attn and triton, so I guess you should also install triton by running pip install triton.
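Before kicking off a long training run, you can reproduce that check directly; the import path below is the exact one from rotary.py quoted above:

# If this import fails, training fails with the same RuntimeError.
try:
    from flash_attn.ops.triton.rotary import apply_rotary  # needs both flash-attn and triton
    print("flash-attn + triton are importable")
except ImportError as err:
    print("missing dependency:", err)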
@seregadgl you can install any recent version; the latest one (2.6.3) should work fine.
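After installing, a quick sanity check that both packages import and report their versions (just a sketch; the exact versions will differ on your runtime):

import torch
import flash_attn
import triton

print("flash-attn:", flash_attn.__version__)
print("triton:", triton.__version__)
print("CUDA available:", torch.cuda.is_available())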
@BlackBeenie you're right, it requires triton as well. However, triton should be installed automatically when you install torch with CUDA enabled.
@jupyterjazz
It seems triton is not installed automatically in Google Colab. I ran into the same error, and running pip install triton actually fixed it.
@BlackBeenie, that makes sense. This happens because Colab comes with torch pre-installed. If you uninstall it and reinstall it while connected to a GPU runtime, triton should be installed as well.
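If you want to confirm whether your Colab runtime already has triton before reinstalling torch, a one-line check with the standard library is enough:

import importlib.util

# True if a `triton` package is importable in the current environment.
print("triton installed:", importlib.util.find_spec("triton") is not None)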