Imagine360: Immersive 360 Video Generation from Perspective Anchor
Abstract
360° videos offer a hyper-immersive experience, allowing viewers to explore a dynamic scene from the full 360 degrees. To enable more user-friendly and personalized content creation in the 360° video format, we seek to lift standard perspective videos into 360° equirectangular videos. To this end, we introduce Imagine360, the first perspective-to-360° video generation framework that creates high-quality 360° videos with rich and diverse motion patterns from perspective video anchors. Imagine360 learns fine-grained spherical visual and motion patterns from limited 360° video data through several key designs. 1) First, we adopt a dual-branch design with a perspective and a panorama video denoising branch, providing local and global constraints for 360° video generation, with the motion module and spatial LoRA layers fine-tuned on extended web 360° videos. 2) Additionally, an antipodal mask is devised to capture long-range motion dependencies, enhancing the reversed camera motion between antipodal pixels across hemispheres. 3) To handle diverse perspective video inputs, we propose elevation-aware designs that adapt to the varying video masking caused by changing elevations across frames. Extensive experiments show that Imagine360 achieves superior graphics quality and motion coherence among state-of-the-art 360° video generation methods. We believe Imagine360 holds promise for advancing personalized, immersive 360° video creation.
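The antipodal mask builds on a fixed geometric correspondence of the equirectangular grid: a pixel at longitude θ and latitude φ has its antipode at (θ + π, −φ), so on the pixel grid the row flips vertically and the column shifts by half the width. The sketch below illustrates this correspondence and a mask that restricts attention to antipodal pairs; it is an illustrative assumption rather than the paper's exact mask construction, and `antipodal_indices` / `antipodal_attention_mask` are hypothetical names.

```python
import numpy as np

def antipodal_indices(h: int, w: int):
    """Return, for each row/column of an (h, w) equirectangular grid,
    the row/column of the pixel's antipode on the sphere.

    With pixel centers sampled symmetrically in latitude, negating the
    latitude flips the row (r -> h - 1 - r); adding pi to the longitude
    shifts the column by half the width (assumes w is even).
    """
    rows = np.arange(h)
    cols = np.arange(w)
    anti_rows = h - 1 - rows          # latitude  phi   -> -phi
    anti_cols = (cols + w // 2) % w   # longitude theta -> theta + pi
    return anti_rows, anti_cols

def antipodal_attention_mask(h: int, w: int) -> np.ndarray:
    """Boolean (h*w, h*w) mask where entry [i, j] is True iff token j
    is the antipode of token i (tokens in row-major order)."""
    anti_rows, anti_cols = antipodal_indices(h, w)
    src = np.arange(h * w)
    dst = anti_rows[src // w] * w + anti_cols[src % w]
    mask = np.zeros((h * w, h * w), dtype=bool)
    mask[src, dst] = True
    return mask

# Example: on a 4x8 grid, the antipode of pixel (0, 0) is (3, 4) --
# the opposite latitude, half a revolution away in longitude.
m = antipodal_attention_mask(4, 8)
assert m[0, 3 * 8 + 4]
```

Such a mask could be combined with a standard attention mask to let each token additionally attend to its antipode, which is one plausible way to model the reversed camera motion between hemispheres that the abstract describes.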
Community
Imagine360 lifts standard perspective videos into 360° videos with rich and structured motion, enabling viewers to experience dynamic scenes from the full 360 degrees.
- Project page: https://ys-imtech.github.io/projects/Imagine360
- Arxiv: https://arxiv.org/abs/2412.03552
- Github: https://github.com/YS-IMTech/Imagine360
- Video: https://youtu.be/gRGo4B41GXY
Related papers, recommended by the Semantic Scholar API:
- DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation (2024)
- Trajectory Attention for Fine-grained Video Motion Control (2024)
- AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers (2024)
- Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation (2024)
- StereoCrafter-Zero: Zero-Shot Stereo Video Generation with Noisy Restart (2024)
- Video2BEV: Transforming Drone Videos to BEVs for Video-based Geo-localization (2024)
- ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning (2024)