Submitted by akhaliq 27 Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory · 4 authors
Submitted by akhaliq 21 Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning · 8 authors
Submitted by akhaliq 18 Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding · 45 authors
Submitted by akhaliq 13 Understanding the performance gap between online and offline alignment algorithms · 11 authors
Submitted by akhaliq 11 Compositional Text-to-Image Generation with Dense Blob Representations · 6 authors
Submitted by akhaliq 10 No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding · 5 authors
Submitted by akhaliq 8 SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models · 14 authors