jeiku/Aura-NeMo-12B
#2 opened by Jebadiah
No description provided.
Hi @Jebadiah,
Thanks for visiting.
Could you tell me more about what's on your mind?
I can't tell what you're looking for with this thread.
Hey again @Jebadiah,
If you're looking to have https://huggingface.co/jeiku/Aura-NeMo-12B made available for inference here, unfortunately we aren't able to do that at this time.
Two blockers:
- Our inference stack operates on model cards that are full models. While a LoRA is a convenient (and space-efficient) way to specify a model, the Featherless model execution pipeline can't use those (yet); a merged full-weight card works around this (see the sketch below).
- The base model is a Q4 quant. That's efficient for fine-tuning, but our inference stack runs all models at FP8, and we don't currently support lower quants.
If you find a card that overcomes these two limitations, please let us know!
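For what it's worth, a merged full-weight card can usually be produced by folding the LoRA into an unquantized copy of the base model with peft. The sketch below is a minimal example, and it assumes the adapter was trained against `mistralai/Mistral-Nemo-Base-2407` (the full-precision NeMo 12B base); the output repo name is just a placeholder.

```python
# Minimal sketch: merge the LoRA into an unquantized base so the result is a
# standalone full-weight model card. The base repo name below is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "mistralai/Mistral-Nemo-Base-2407"  # assumed full-precision base (not the Q4 quant)
ADAPTER = "jeiku/Aura-NeMo-12B"            # the LoRA adapter repo

# Load the base in bf16 so the merged weights stay at full precision.
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(BASE)

# Attach the adapter, then fold its low-rank deltas into the base weights.
merged = PeftModel.from_pretrained(base, ADAPTER).merge_and_unload()

# Save locally, or push as a new full-model repo (placeholder name).
merged.save_pretrained("Aura-NeMo-12B-merged")
tokenizer.save_pretrained("Aura-NeMo-12B-merged")
# merged.push_to_hub("your-username/Aura-NeMo-12B-merged")
```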
wxgeorge changed discussion status to closed