Awful phi3
#28 opened 16 days ago by JesusCrist
Getting the error: "triton.runtime.autotuner.OutOfResources: out of resource: shared memory, Required: 180224, Hardware limit: 166912. Reducing block sizes or `num_stages` may help."
#27 · 1 reply · opened about 1 month ago by Pranav0511
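The `OutOfResources` message in #27 is a budget problem rather than a bug: the autotuned tile configuration asks for more shared memory than the GPU offers (180224 bytes requested against a 166912-byte limit), and each extra pipeline stage buffers another set of tiles, which is why the message suggests reducing block sizes or `num_stages`. A minimal sketch of that budget, assuming a hypothetical tiled kernel that buffers one (M, K) and one (K, N) half-precision tile per stage (the formula is illustrative, not Triton's actual accounting):

```python
# Rough shared-memory estimate for a pipelined, tiled kernel.
# Assumption (hypothetical layout): each of the `num_stages` pipeline
# stages holds one (block_m x block_k) tile and one (block_k x block_n)
# tile of a 2-byte dtype (fp16/bf16).

def smem_bytes(block_m, block_n, block_k, num_stages, dtype_bytes=2):
    """Estimated shared-memory footprint in bytes."""
    per_stage = (block_m * block_k + block_k * block_n) * dtype_bytes
    return num_stages * per_stage

# Six stages overflow a 166912-byte limit like the one in the error...
print(smem_bytes(128, 128, 64, num_stages=6))  # 196608 > 166912

# ...while dropping to four stages (or shrinking the blocks) fits.
print(smem_bytes(128, 128, 64, num_stages=4))  # 131072 <= 166912
```

In Triton this corresponds to removing the largest `triton.Config` entries (big blocks, high `num_stages`) from the autotune list so the tuner never picks a configuration the hardware cannot hold.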
Why is inference so slow compared with Qwen at the same 7B parameter size?
#26 opened about 2 months ago by lucasjin
Upload triton_flash_blocksparse_attn.py
#25 opened about 2 months ago by barcelosallan
Phi-3-small doesn't load with TGI
#24 · 1 reply · opened 2 months ago by aveer30
Multi-GPU training fails when using device_map = "auto"
#23 · 2 replies · opened 2 months ago by aveer30
Shared memory error
#15 · 8 replies · opened 3 months ago by marktenenholtz
RuntimeError: FlashAttention only support fp16 and bf16 data type during fine tuning.
#11 · 7 replies · opened 3 months ago by faizsameerahmed96
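The error in #11 comes from a dtype guard in FlashAttention: its kernels accept only half-precision inputs, so fine-tuning a model whose weights were loaded in float32 trips the check. The snippet below is a toy stand-in for that guard (the real check lives inside the flash-attn extension, not in this Python form); the usual remedy is loading or casting the model to bf16/fp16, e.g. `torch_dtype=torch.bfloat16` in `from_pretrained`:

```python
# Illustrative mirror of FlashAttention's input-dtype rule: only
# half-precision tensors are accepted. Hypothetical helper, shown to
# make the failure mode concrete.
SUPPORTED_DTYPES = {"float16", "bfloat16"}

def check_flash_attn_dtype(dtype: str) -> None:
    """Raise the same style of error FlashAttention does for fp32 inputs."""
    if dtype not in SUPPORTED_DTYPES:
        raise RuntimeError(
            "FlashAttention only support fp16 and bf16 data type"
        )

check_flash_attn_dtype("bfloat16")    # passes: bf16 is supported
# check_flash_attn_dtype("float32")  # would raise the RuntimeError above
```

Mixed-precision trainers that keep master weights in fp32 but autocast the forward pass to bf16 also avoid the error, since only the tensors reaching the attention kernel must be half precision.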
GGUF version?
#9 · 4 replies · opened 3 months ago by shtirlic
No Triton for Windows
#4 · 2 replies · opened 3 months ago by fernandomir