- LinFusion: 1 GPU, 1 Minute, 16K Image
  Paper • 2409.02097 • Published • 31
- Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion
  Paper • 2409.11406 • Published • 25
- Diffusion Models Are Real-Time Game Engines
  Paper • 2408.14837 • Published • 121
- Segment Anything with Multiple Modalities
  Paper • 2408.09085 • Published • 21
Collections including paper arxiv:2408.10198
- MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model
  Paper • 2408.10198 • Published • 32
- SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement
  Paper • 2408.00653 • Published • 27
- CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner
  Paper • 2405.14979 • Published • 15
- MeshLRM: Large Reconstruction Model for High-Quality Mesh
  Paper • 2404.12385 • Published • 26
- Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning
  Paper • 2407.15762 • Published • 8
- HuggingFaceTB/SmolLM-135M
  Text Generation • Updated • 54.9k • 169
- MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model
  Paper • 2408.10198 • Published • 32
- fishaudio/fish-speech-1.4
  Text-to-Speech • Updated • 5.1k • 403
- GECO: Generative Image-to-3D within a SECOnd
  Paper • 2405.20327 • Published • 9
- Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion
  Paper • 2406.03184 • Published • 18
- NPGA: Neural Parametric Gaussian Avatars
  Paper • 2405.19331 • Published • 10
- Unified Text-to-Image Generation and Retrieval
  Paper • 2406.05814 • Published • 10
- CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner
  Paper • 2405.14979 • Published • 15
- PLA4D: Pixel-Level Alignments for Text-to-4D Gaussian Splatting
  Paper • 2405.19957 • Published • 9
- GECO: Generative Image-to-3D within a SECOnd
  Paper • 2405.20327 • Published • 9
- gsplat: An Open-Source Library for Gaussian Splatting
  Paper • 2409.06765 • Published • 11
- MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
  Paper • 2311.17049 • Published
- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
  Paper • 2405.04434 • Published • 13
- A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision
  Paper • 2303.17376 • Published
- Sigmoid Loss for Language Image Pre-Training
  Paper • 2303.15343 • Published • 4
- TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion
  Paper • 2401.09416 • Published • 9
- SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild
  Paper • 2401.10171 • Published • 12
- DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model
  Paper • 2311.09217 • Published • 21
- GALA: Generating Animatable Layered Assets from a Single Scan
  Paper • 2401.12979 • Published • 6
- MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model
  Paper • 2408.10198 • Published • 32
- SpaRP: Fast 3D Object Reconstruction and Pose Estimation from Sparse Views
  Paper • 2408.10195 • Published • 12
- One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion
  Paper • 2311.07885 • Published • 39
- Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model
  Paper • 2310.15110 • Published • 2