Sheldoooon (SheldonXu)

upvoted a paper 7 days ago

Seeing Faces in Things: A Model and Dataset for Pareidolia

Paper • 2409.16143 • Published 8 days ago • 15

upvoted a paper 9 days ago

MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors

Paper • 2409.15273 • Published 9 days ago • 9

upvoted 2 papers 15 days ago

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Paper • 2409.11355 • Published 15 days ago • 26

A Diffusion Approach to Radiance Field Relighting using Multi-Illumination Synthesis

Paper • 2409.08947 • Published 19 days ago • 11

upvoted a paper 20 days ago

Instant Facial Gaussians Translator for Relightable and Interactable Facial Rendering

Paper • 2409.07441 • Published 21 days ago • 9

upvoted 5 papers about 1 month ago

ReconX: Reconstruct Any Scene from Sparse Views with Video Diffusion Model

Paper • 2408.16767 • Published Aug 29 • 29

GenCA: A Text-conditioned Generative Model for Realistic and Drivable Codec Avatars

Paper • 2408.13674 • Published Aug 24 • 17

upvoted 3 papers about 2 months ago

GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI

Paper • 2408.03361 • Published Aug 6 • 85

Sketch2Scene: Automatic Generation of Interactive 3D Game Scenes from User's Casual Sketches

Paper • 2408.04567 • Published Aug 8 • 23

An Object is Worth 64x64 Pixels: Generating 3D Object via Image Diffusion

Paper • 2408.03178 • Published Aug 6 • 36

upvoted a paper 2 months ago

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

Paper • 2407.20183 • Published Jul 29 • 37

upvoted 7 papers 3 months ago

Animate3D: Animating Any 3D Model with Multi-view Video Diffusion

Paper • 2407.11398 • Published Jul 16 • 8

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output

Paper • 2407.03320 • Published Jul 3 • 92

FoleyCrafter: Bring Silent Videos to Life with Lifelike and Synchronized Sounds

Paper • 2407.01494 • Published Jul 1 • 13

Consistency Flow Matching: Defining Straight Flows with Velocity Consistency

Paper • 2407.02398 • Published Jul 2 • 14

The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing

Paper • 2406.10601 • Published Jun 15 • 65

MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding

Paper • 2406.14515 • Published Jun 20 • 32

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

Paper • 2406.10163 • Published Jun 14 • 32

upvoted 7 papers 4 months ago

SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound

Paper • 2406.06612 • Published Jun 6 • 14

Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis

Paper • 2406.06216 • Published Jun 10 • 18

IllumiNeRF: 3D Relighting without Inverse Rendering

Paper • 2406.06527 • Published Jun 10 • 8

Phased Consistency Model

Paper • 2405.18407 • Published May 28 • 46

Part123: Part-aware 3D Reconstruction from a Single-view Image

Paper • 2405.16888 • Published May 27 • 10

CraftsMan: High-fidelity Mesh Generation with 3D Native Generation and Interactive Geometry Refiner

Paper • 2405.14979 • Published May 23 • 15

Grounded 3D-LLM with Referent Tokens

Paper • 2405.10370 • Published May 16 • 9

upvoted 3 papers 5 months ago

MaPa: Text-driven Photorealistic Material Painting for 3D Shapes

Paper • 2404.17569 • Published Apr 26 • 12

Interactive3D: Create What You Want by Interactive 3D Generation

Paper • 2404.16510 • Published Apr 25 • 18

MeshLRM: Large Reconstruction Model for High-Quality Mesh

Paper • 2404.12385 • Published Apr 18 • 25

upvoted 5 papers 6 months ago

Magic-Boost: Boost 3D Generation with Mutli-View Conditioned Diffusion

Paper • 2404.06429 • Published Apr 9 • 6

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

Paper • 2404.06512 • Published Apr 9 • 29

InternLM2 Technical Report

Paper • 2403.17297 • Published Mar 26 • 28

Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians

Paper • 2403.17898 • Published Mar 26 • 14

FlashFace: Human Image Personalization with High-fidelity Identity Preservation

Paper • 2403.17008 • Published Mar 25 • 18

upvoted 4 papers 7 months ago

SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion

Paper • 2403.12008 • Published Mar 18 • 19

LightIt: Illumination Modeling and Control for Diffusion Models

Paper • 2403.10615 • Published Mar 15 • 16

DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation

Paper • 2402.11929 • Published Feb 19 • 9

FlashTex: Fast Relightable Mesh Texturing with LightControlNet

Paper • 2402.13251 • Published Feb 20 • 13

upvoted 2 papers 8 months ago

BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation

Paper • 2401.17053 • Published Jan 30 • 30

Make-A-Shape: a Ten-Million-scale 3D Shape Model

Paper • 2401.11067 • Published Jan 20 • 15

upvoted 6 papers 9 months ago

TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion

Paper • 2401.09416 • Published Jan 17 • 9

URHand: Universal Relightable Hands

Paper • 2401.05334 • Published Jan 10 • 21

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation

Paper • 2401.04092 • Published Jan 8 • 20

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Paper • 2401.01885 • Published Jan 3 • 27

DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision

Paper • 2312.16256 • Published Dec 26, 2023 • 15

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

Paper • 2312.14238 • Published Dec 21, 2023 • 14

upvoted 7 papers 10 months ago

Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models

Paper • 2312.13913 • Published Dec 21, 2023 • 22

SpecNeRF: Gaussian Directional Encoding for Specular Reflections

Paper • 2312.13102 • Published Dec 20, 2023 • 5

DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing

Paper • 2312.07409 • Published Dec 12, 2023 • 22

Relightable Gaussian Codec Avatars

Paper • 2312.03704 • Published Dec 6, 2023 • 29

Alchemist: Parametric Control of Material Properties with Diffusion Models

Paper • 2312.02970 • Published Dec 5, 2023 • 7

GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis

Paper • 2312.02155 • Published Dec 4, 2023 • 12

Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering

Paper • 2312.00109 • Published Nov 30, 2023 • 9

upvoted 5 papers 11 months ago

Make Pixels Dance: High-Dynamic Video Generation

Paper • 2311.10982 • Published Nov 18, 2023 • 68

DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model

Paper • 2311.09217 • Published Nov 15, 2023 • 21

Drivable 3D Gaussian Avatars

Paper • 2311.08581 • Published Nov 14, 2023 • 46

Music ControlNet: Multiple Time-varying Controls for Music Generation

Paper • 2311.07069 • Published Nov 13, 2023 • 43

LRM: Large Reconstruction Model for Single Image to 3D

Paper • 2311.04400 • Published Nov 8, 2023 • 47
