Collections including paper arxiv:2410.24211

DELTA: Dense Efficient Long-range 3D Tracking for any video

Paper • 2410.24211 • Published 28 days ago • 8

Interesting Papers

These papers are interesting (to me)

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Paper • 2410.02740 • Published Oct 3 • 52
From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Paper • 2410.01215 • Published Oct 2 • 30
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25 • 103
EuroLLM: Multilingual Language Models for Europe

Paper • 2409.16235 • Published Sep 24 • 24

SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1 • 108
Surgical SAM 2: Real-time Segment Anything in Surgical Video by Efficient Frame Pruning

Paper • 2408.07931 • Published Aug 15 • 19
DELTA: Dense Efficient Long-range 3D Tracking for any video

Paper • 2410.24211 • Published 28 days ago • 8

Video-Gen Trajectory

Tora: Trajectory-oriented Diffusion Transformer for Video Generation

Paper • 2407.21705 • Published Jul 31 • 25
TrackGo: A Flexible and Efficient Method for Controllable Video Generation

Paper • 2408.11475 • Published Aug 21 • 17
TVG: A Training-free Transition Video Generation Method with Diffusion Models

Paper • 2408.13413 • Published Aug 24 • 13
PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

Paper • 2409.18964 • Published Sep 27 • 25

LocalMamba: Visual State Space Model with Windowed Selective Scan

Paper • 2403.09338 • Published Mar 14 • 7
GiT: Towards Generalist Vision Transformer through Universal Language Interface

Paper • 2403.09394 • Published Mar 14 • 25
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Paper • 2402.19479 • Published Feb 29 • 32
Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

Paper • 2405.10300 • Published May 16 • 26
