H100 TransformerEngine
#14
by SinanAkkoyun - opened
Thank you so much for your awesome work!
Please implement faster H100 FP8 inference for this model!
When do you suspect this will be done? Thank you!
Hi @SinanAkkoyun , we are working on it but it will likely be ~months away. H100s are only just entering the market and there's a lot of performance tuning to do!
abhi-mosaic changed discussion status to closed
@abhi-mosaic LambdaLabs supplies "infinite" H100s now! When do you think the TE implementation will be available? Can I somehow help?
SinanAkkoyun changed discussion status to open
The model should work as-is on H100s with BF16.
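For reference, a minimal sketch of BF16 loading with `transformers` (the repo id `mosaicml/mpt-7b` is an assumption for illustration; substitute whichever checkpoint this thread refers to):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed checkpoint name for illustration; substitute the actual repo id.
name = "mosaicml/mpt-7b"

tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name,
    torch_dtype=torch.bfloat16,  # BF16 runs natively on H100 tensor cores
    trust_remote_code=True,      # MPT models use custom modeling code
).to("cuda")

inputs = tokenizer("Hello", return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0]))
```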
FP8 support is gonna be a bit trickier but we are working on it: https://github.com/mosaicml/llm-foundry/pull/271
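The linked PR tracks the actual integration; purely as a rough illustration of what TransformerEngine's FP8 autocast looks like on a standalone layer (not this model's code, and not what the PR ships), a sketch:

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Delayed-scaling FP8 recipe; HYBRID uses E4M3 for forward, E5M2 for backward.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

# A standalone TE layer just for illustration; a real integration
# (see the PR above) swaps in TE modules inside the model itself.
layer = te.Linear(768, 768, bias=True).to("cuda")
x = torch.randn(16, 768, device="cuda")

# GEMMs inside this context run in FP8 on Hopper GPUs.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)
print(y.shape)
```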
sam-mosaic changed discussion status to closed