arxiv:2309.03903

Tracking Anything with Decoupled Video Segmentation

Published on Sep 7, 2023
· Submitted by akhaliq on Sep 8, 2023

Abstract

Training data for video segmentation are expensive to annotate. This impedes extensions of end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary settings. To 'track anything' without training on video data for every individual task, we develop a decoupled video segmentation approach (DEVA), composed of task-specific image-level segmentation and class/task-agnostic bi-directional temporal propagation. Due to this design, we only need an image-level model for the target task (which is cheaper to train) and a universal temporal propagation model which is trained once and generalizes across tasks. To effectively combine these two modules, we use bi-directional propagation for (semi-)online fusion of segmentation hypotheses from different frames to generate a coherent segmentation. We show that this decoupled formulation compares favorably to end-to-end approaches in several data-scarce tasks including large-vocabulary video panoptic segmentation, open-world video segmentation, referring video segmentation, and unsupervised video object segmentation. Code is available at: https://hkchengrex.github.io/Tracking-Anything-with-DEVA

Community

Sigh. Just managed to get the damn thing to work locally.

After installing groundingsam, make sure you can import it before doing anything else. If you get a "name _C undefined" error on import, the nvcc in your environment is either missing or not working correctly.

Paper author

Hi, thanks for letting us know. We noticed that GroundingDINO might fail silently during installation and added a one-line test on our readme to help with debugging.
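
For anyone hitting the same problem, here is a minimal sketch of an import check along these lines. It is not the one-line test from the readme. The module name `groundingdino` and the helper `can_import` are assumptions for illustration, so substitute whatever your install actually provides. It only catches failures that surface at import time.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if module_name imports without raising, False otherwise."""
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        # Covers ImportError as well as errors raised while the module
        # initializes (for example a NameError from a missing compiled extension).
        print(f"Import of {module_name!r} failed: {type(exc).__name__}: {exc}")
        return False
    return True


if __name__ == "__main__":
    # "groundingdino" is an assumed module name, not confirmed by this page.
    if can_import("groundingdino"):
        print("Import succeeded.")
    else:
        print("Import failed; check that nvcc is installed and on your PATH.")
```

Running this right after installation, before trying any of the demos, makes a silent build failure visible early.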
