Article Unbelievable! Run 70B LLM Inference on a Single 4GB GPU with This NEW Technique By lyogavin • Nov 30, 2023 • 23
T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation Paper • 2407.14505 • Published Jul 19 • 25
COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training Paper • 2410.19313 • Published 30 days ago • 18 • 5
PUMA: Empowering Unified MLLM with Multi-granular Visual Generation Paper • 2410.13861 • Published Oct 17 • 53
SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration Paper • 2410.02367 • Published Oct 3 • 47