LiheYoung committed
Commit 9ebc835
1 Parent(s): 74e9abb

Update README.md

Files changed (1)
  1. README.md +13 -136
README.md CHANGED
@@ -1,136 +1,13 @@
- <div align="center">
- <h1>Depth Anything V2</h1>
-
- [**Lihe Yang**](https://liheyoung.github.io/)<sup>1</sup> · [**Bingyi Kang**](https://bingykang.github.io/)<sup>2&dagger;</sup> · [**Zilong Huang**](http://speedinghzl.github.io/)<sup>2</sup>
- <br>
- [**Zhen Zhao**](http://zhaozhen.me/) · [**Xiaogang Xu**](https://xiaogang00.github.io/) · [**Jiashi Feng**](https://sites.google.com/site/jshfeng/)<sup>2</sup> · [**Hengshuang Zhao**](https://hszhao.github.io/)<sup>1*</sup>
-
- <sup>1</sup>HKU&emsp;&emsp;&emsp;<sup>2</sup>TikTok
- <br>
- &dagger;project lead&emsp;*corresponding author
-
- <a href="https://arxiv.org/abs/2406.09414"><img src='https://img.shields.io/badge/arXiv-Depth Anything V2-red' alt='Paper PDF'></a>
- <a href='https://depth-anything-v2.github.io'><img src='https://img.shields.io/badge/Project_Page-Depth Anything V2-green' alt='Project Page'></a>
- <a href='https://huggingface.co/spaces/depth-anything/Depth-Anything-V2'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue'></a>
- <a href='https://huggingface.co/datasets/depth-anything/DA-2K'><img src='https://img.shields.io/badge/Benchmark-DA--2K-yellow' alt='Benchmark'></a>
- </div>
-
- This work presents Depth Anything V2. It significantly outperforms [V1](https://github.com/LiheYoung/Depth-Anything) in fine-grained details and robustness. Compared with SD-based models, it enjoys faster inference speed, fewer parameters, and higher depth accuracy.
-
- ![teaser](assets/teaser.png)
-
- ## News
-
- - **2024-06-14:** Paper, project page, code, models, demo, and benchmark are all released.
-
-
- ## Pre-trained Models
-
- We provide **four models** of varying scales for robust relative depth estimation:
-
- | Model | Params | Checkpoint |
- |:-|-:|:-:|
- | Depth-Anything-V2-Small | 24.8M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Small/resolve/main/depth_anything_v2_vits.pth?download=true) |
- | Depth-Anything-V2-Base | 97.5M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Base/resolve/main/depth_anything_v2_vitb.pth?download=true) |
- | Depth-Anything-V2-Large | 335.3M | [Download](https://huggingface.co/depth-anything/Depth-Anything-V2-Large/resolve/main/depth_anything_v2_vitl.pth?download=true) |
- | Depth-Anything-V2-Giant | 1.3B | Coming soon |
-
-
- ### Code snippet to use our models
- ```python
- import cv2
- import torch
-
- from depth_anything_v2.dpt import DepthAnythingV2
-
- # take depth-anything-v2-large as an example
- model = DepthAnythingV2(encoder='vitl', features=256, out_channels=[256, 512, 1024, 1024])
- model.load_state_dict(torch.load('checkpoints/depth_anything_v2_vitl.pth', map_location='cpu'))
- model.eval()
-
- raw_img = cv2.imread('your/image/path')
- depth = model.infer_image(raw_img) # HxW raw depth map
- ```
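
As a follow-up to the snippet above, here is a minimal sketch (not part of the original README) of one way to turn the raw depth map into a viewable image; the output filename and the per-image min-max normalization are arbitrary choices:

```python
import cv2
import numpy as np

# Continues from the snippet above: `depth` is the HxW float array
# returned by model.infer_image(). The raw (relative) depth values have
# no fixed range, so rescale each image to 0-255 before saving.
depth_vis = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8) * 255.0
cv2.imwrite('depth_vis.png', depth_vis.astype(np.uint8))
```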
-
- ## Usage
-
- ### Installation
-
- ```bash
- git clone https://github.com/DepthAnything/Depth-Anything-V2
- cd Depth-Anything-V2
- pip install -r requirements.txt
- ```
-
- ### Running
-
- ```bash
- python run.py --encoder <vits | vitb | vitl | vitg> --img-path <path> --outdir <outdir> [--input-size <size>] [--pred-only] [--grayscale]
- ```
- Options:
- - `--img-path`: You can either 1) point it to a directory containing all the images of interest, 2) point it to a single image, or 3) point it to a text file listing image paths.
- - `--input-size` (optional): By default, we use input size `518` for model inference. **You can increase the size for even more fine-grained results.**
- - `--pred-only` (optional): Only save the predicted depth map, without the raw image.
- - `--grayscale` (optional): Save the grayscale depth map, without applying the color palette.
-
- For example:
- ```bash
- python run.py --encoder vitl --img-path assets/examples --outdir depth_vis
- ```
-
- **If you want to use Depth Anything V2 on videos:**
-
- ```bash
- python run_video.py --encoder vitl --video-path assets/examples_video --outdir video_depth_vis
- ```
-
- *Please note that our larger model has better temporal consistency on videos.*
-
-
- ### Gradio demo
-
- To use our gradio demo locally:
-
- ```bash
- python app.py
- ```
-
- You can also try our [online demo](https://huggingface.co/spaces/Depth-Anything/Depth-Anything-V2).
-
- **Note:** Compared to V1, we have made a minor modification to the DINOv2-DPT architecture (originating from this [issue](https://github.com/LiheYoung/Depth-Anything/issues/81)). In V1, we *unintentionally* used features from the last four layers of DINOv2 for decoding. In V2, we use [intermediate features](https://github.com/DepthAnything/Depth-Anything-V2/blob/2cbc36a8ce2cec41d38ee51153f112e87c8e42d8/depth_anything_v2/dpt.py#L164-L169) instead. Although this modification did not improve details or accuracy, we decided to follow this common practice.
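
To make the note above concrete, here is a small standalone sketch of what "intermediate features" means in this context. It uses the public DINOv2 hub model rather than the repository's own wrapper, and the layer indices below are illustrative assumptions, not necessarily the exact ones used in the released code:

```python
import torch

# Load a plain DINOv2 ViT-L/14 backbone from torch.hub (assumes internet access).
backbone = torch.hub.load('facebookresearch/dinov2', 'dinov2_vitl14')
backbone.eval()

# Input side must be a multiple of the 14-pixel patch size; 518 = 37 * 14.
x = torch.randn(1, 3, 518, 518)

with torch.no_grad():
    # Take features from four intermediate transformer blocks (indices are
    # illustrative) instead of only the last layers, which is the change the
    # note describes for the V2 DPT decoder.
    feats = backbone.get_intermediate_layers(x, n=[4, 11, 17, 23], return_class_token=True)

for patch_tokens, cls_token in feats:
    # patch_tokens: (1, 37*37, 1024) per-patch features; cls_token: (1, 1024)
    print(patch_tokens.shape, cls_token.shape)
```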
-
-
-
- ## Fine-tuned to Metric Depth Estimation
-
- Please refer to [metric depth estimation](./metric_depth).
-
-
- ## DA-2K Evaluation Benchmark
-
- Please refer to [DA-2K benchmark](./DA-2K.md).
-
- ## LICENSE
-
- Depth-Anything-V2-Small model is under the Apache-2.0 license. Depth-Anything-V2-Base/Large/Giant models are under the CC-BY-NC-4.0 license.
-
-
- ## Citation
-
- If you find this project useful, please consider citing:
-
- ```bibtex
- @article{depth_anything_v2,
-   title={Depth Anything V2},
-   author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Zhao, Zhen and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
-   journal={arXiv:2406.09414},
-   year={2024}
- }
-
- @inproceedings{depth_anything_v1,
-   title={Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data},
-   author={Yang, Lihe and Kang, Bingyi and Huang, Zilong and Xu, Xiaogang and Feng, Jiashi and Zhao, Hengshuang},
-   booktitle={CVPR},
-   year={2024}
- }
- ```
 
+ ---
+ title: Depth Anything V2
+ emoji: 🌖
+ colorFrom: red
+ colorTo: indigo
+ sdk: gradio
+ sdk_version: 4.36.0
+ app_file: app.py
+ pinned: false
+ license: apache-2.0
+ ---
+
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
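
The added front matter declares a Gradio Space whose entry point is `app.py`. For illustration only, a minimal, hypothetical `app.py` consistent with that configuration might look like the sketch below; the real demo wires in Depth Anything V2, while here a grayscale conversion stands in for the depth predictor:

```python
import gradio as gr
import numpy as np


def predict_depth(image: np.ndarray) -> np.ndarray:
    # Placeholder "depth": per-pixel intensity rescaled to 0-255.
    gray = image.mean(axis=2)
    return ((gray - gray.min()) / (gray.max() - gray.min() + 1e-8) * 255).astype(np.uint8)


demo = gr.Interface(
    fn=predict_depth,
    inputs=gr.Image(type="numpy", label="Input image"),
    outputs=gr.Image(type="numpy", label="Depth map"),
    title="Depth Anything V2",
)

if __name__ == "__main__":
    demo.launch()
```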