LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published 9 days ago • 94
Implicit Neural Representations with Fourier Kolmogorov-Arnold Networks Paper • 2409.09323 • Published Sep 14 • 5
EvTexture: Event-driven Texture Enhancement for Video Super-Resolution Paper • 2406.13457 • Published Jun 19 • 16
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published May 16 • 126
SVGDreamer: Text Guided Vector Graphics Generation with Diffusion Model Article • By xingxm • Apr 19 • 7
SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound Paper • 2406.06612 • Published Jun 6 • 14