Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation
Abstract
Global visual geolocation predicts where an image was captured on Earth. Since images vary in how precisely they can be localized, this task inherently involves a significant degree of ambiguity. However, existing approaches are deterministic and ignore this ambiguity. In this paper, we aim to close the gap between traditional geolocation and modern generative methods. We propose the first generative geolocation approach based on diffusion and Riemannian flow matching, where the denoising process operates directly on the Earth's surface. Our model achieves state-of-the-art performance on three visual geolocation benchmarks: OpenStreetView-5M, YFCC-100M, and iNat21. In addition, we introduce the task of probabilistic visual geolocation, where the model predicts a probability distribution over all possible locations instead of a single point. We introduce new metrics and baselines for this task, demonstrating the advantages of our diffusion-based approach. Code and models will be made available.
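The abstract's central idea, denoising directly on the Earth's surface rather than in flat Euclidean coordinates, can be made concrete with a short sketch. The following is a minimal, hypothetical illustration of conditional Riemannian flow matching on the unit sphere S^2 (standing in for the Earth): locations are unit vectors, the noising path is the geodesic between a uniform sample and the true location, and the regression target is the geodesic velocity field. The model interface `model(x_t, t, img_emb)`, the helper names, and all training details are assumptions made for illustration; they are not taken from the paper.

```python
# A minimal sketch of Riemannian flow matching on S^2 for geolocation.
# All names (model signature, helpers) are hypothetical, not the paper's code.
import torch

def latlon_to_xyz(lat, lon):
    """Convert degrees latitude/longitude to unit vectors in R^3."""
    lat, lon = torch.deg2rad(lat), torch.deg2rad(lon)
    return torch.stack([torch.cos(lat) * torch.cos(lon),
                        torch.cos(lat) * torch.sin(lon),
                        torch.sin(lat)], dim=-1)

def exp_map(x, v, eps=1e-8):
    """Exponential map on the sphere: move from x along tangent vector v."""
    n = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.cos(n) * x + torch.sin(n) * v / n

def log_map(x, y, eps=1e-8):
    """Logarithm map on the sphere: tangent vector at x pointing toward y."""
    cos_theta = (x * y).sum(dim=-1, keepdim=True).clamp(-1 + eps, 1 - eps)
    theta = torch.acos(cos_theta)
    u = y - cos_theta * x  # component of y orthogonal to x
    return theta * u / u.norm(dim=-1, keepdim=True).clamp_min(eps)

def flow_matching_loss(model, img_emb, x1):
    """Conditional flow-matching loss along the geodesic from noise to x1.
    x1: ground-truth locations as unit vectors, shape (B, 3)."""
    B = x1.shape[0]
    # "Noise" endpoint: a point drawn uniformly on the sphere.
    x0 = torch.nn.functional.normalize(torch.randn(B, 3), dim=-1)
    t = torch.rand(B, 1)
    # Geodesic interpolant x_t = exp_{x0}(t * log_{x0}(x1)).
    x_t = exp_map(x0, t * log_map(x0, x1))
    # Target conditional vector field: u_t = log_{x_t}(x1) / (1 - t).
    u_t = log_map(x_t, x1) / (1 - t).clamp_min(1e-4)
    v_pred = model(x_t, t, img_emb)  # predicted tangent vector at x_t
    # Project the prediction onto the tangent space at x_t before regressing.
    v_pred = v_pred - (v_pred * x_t).sum(-1, keepdim=True) * x_t
    return ((v_pred - u_t) ** 2).sum(-1).mean()

@torch.no_grad()
def sample_locations(model, img_emb, n_steps=80):
    """Geodesic Euler integration of the learned field from t=0 to t=1."""
    B = img_emb.shape[0]
    x = torch.nn.functional.normalize(torch.randn(B, 3), dim=-1)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = torch.full((B, 1), i * dt)
        v = model(x, t, img_emb)
        v = v - (v * x).sum(-1, keepdim=True) * x  # tangent projection
        x = exp_map(x, dt * v)  # each step stays exactly on the sphere
    return x  # unit vectors; convert back to lat/lon as needed
```

The choice of `n_steps=80` echoes the title's 80 timesteps. Under these assumptions, running `sample_locations` many times per image would yield an empirical distribution over locations, which is one plausible way to realize the probabilistic visual geolocation task the abstract describes, as opposed to a single deterministic point estimate.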