No Triton for Windows

by fernandomir - opened May 23

May 23

Any workaround? Also, how much VRAM is needed to run the microsoft/Phi-3-small-8k-instruct model?
Thanks!

mk1024

May 29

Has a solution been found yet?

Jul 4

Avoid using this model, it's extremely slow.

Almost the slowest 7B model I have ever seen.

I tested it on the same A100 against Qwen2-7B, with no quantization, just a plain comparison using raw transformers.

It's just extremely slow...
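
For anyone who wants to reproduce a comparison like this, here is a minimal sketch of a throughput measurement. It is not the script used for the numbers above: `measure_throughput`, `generate_fn`, and `fake_generate` are hypothetical names made up for illustration, and the stand-in callable only sleeps so the snippet runs without a GPU or any model download.

```python
import time
from typing import Callable


def measure_throughput(generate_fn: Callable[[], int], warmup: int = 1, runs: int = 3) -> float:
    """Return generated tokens per second, aggregated over `runs` timed calls.

    `generate_fn` is a hypothetical zero-argument callable that performs one
    generation and returns the number of new tokens it produced.
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Warm-up calls are not timed, since first calls often include one-off setup cost.
    for _ in range(warmup):
        generate_fn()

    total_tokens = 0
    total_seconds = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        total_tokens += generate_fn()
        total_seconds += time.perf_counter() - start

    if total_seconds == 0.0:
        return float("inf")
    return total_tokens / total_seconds


def fake_generate() -> int:
    # Stand-in for a real model call (assumption: in practice this would wrap
    # something like model.generate(...) and count the newly generated tokens).
    time.sleep(0.01)
    return 32


if __name__ == "__main__":
    print(f"{measure_throughput(fake_generate):.1f} tokens/s")
```

To compare two models fairly, the same prompt, the same maximum number of new tokens, and the same dtype should be used for both. On a GPU, the callable should also wait for the device to finish (for example with `torch.cuda.synchronize()`) before returning, otherwise the timing can stop too early.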

nguyenbh changed discussion status to closed Aug 30
