How can we access this model through a Hugging Face Inference Endpoint instead of downloading it locally? I am on a MacBook, so I cannot run the model locally because of the flash_attn dependency.