Doesn't work. When I start this code I get this error: "You are not running the flash-attention implementation, expect numerical differences." How do I fix it? I have flash_attn installed.
Make sure you use transformers==4.37.2 and that your GPU is Ampere or newer (A100, H100, etc.). Otherwise, set the "use_flash_attn" value to "false" in the config file; you can still run it with the warning "You are not running the flash-attention implementation, expect numerical differences". I used a Tesla T4 to run this demo and the answers still look fine.
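As a minimal sketch of the fallback path, you can also ask transformers to use standard (eager) attention when loading the model, which should be roughly equivalent to disabling flash attention in the config file. The checkpoint name below is a placeholder; substitute the one from the demo.

```python
# Sketch: load the model without flash-attention on GPUs that don't support it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    trust_remote_code=True,
    # Fall back to the standard attention kernels instead of flash-attention;
    # roughly the same effect as setting "use_flash_attn": false in the config.
    attn_implementation="eager",
)
```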
Thank you for your feedback. Now that flash attention is enabled for Phi3, eager attention is automatically used if flash attention is not installed in the environment.
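A small sketch of the behaviour described above, done explicitly on the user side (the checkpoint name is again a placeholder): request flash-attention only when the flash_attn package is importable, and fall back to eager attention otherwise.

```python
# Sketch: pick flash-attention if the flash_attn package is installed, else eager.
import importlib.util
from transformers import AutoModelForCausalLM

attn_impl = (
    "flash_attention_2"
    if importlib.util.find_spec("flash_attn") is not None
    else "eager"
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",  # placeholder checkpoint
    trust_remote_code=True,
    attn_implementation=attn_impl,
)
```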