zhengrongzhang committed 1cff332 (parent: b5a92cd): init model

Files added:
- README.md +116 -0
- coco.py +226 -0
- demo_utils.py +224 -0
- eval_onnx.py +444 -0
- infer_onnx.py +151 -0
- requirements.txt +9 -0
- yolox-s-int8.onnx +3 -0
README.md ADDED @@ -0,0 +1,116 @@
---
license: apache-2.0
tags:
- RyzenAI
- object-detection
- vision
- YOLO
- anchor-free
- pytorch
datasets:
- coco
metrics:
- mAP
---

# YOLOX-small model trained on COCO

YOLOX-small is the small variant of the YOLOX model, trained on COCO object detection (118k annotated images) at resolution 640x640. It was introduced in the paper [YOLOX: Exceeding YOLO Series in 2021](https://arxiv.org/abs/2107.08430) by Zheng Ge et al. and first released in [this repository](https://github.com/Megvii-BaseDetection/YOLOX).

We developed a modified version that can be deployed with [AMD Ryzen AI](https://ryzenai.docs.amd.com).


## Model description

Building on the YOLO family of detectors, YOLOX adopts an anchor-free head and combines other advanced detection techniques, including a decoupled head and the leading label assignment strategy SimOTA, to achieve state-of-the-art results across a wide range of model scales. The series of models was developed by Megvii Inc. and won 1st place in the Streaming Perception Challenge (Workshop on Autonomous Driving at CVPR 2021).


## Intended uses & limitations

You can use the raw model for object detection. See the [model hub](https://huggingface.co/models?search=amd/yolox) to look for all available YOLOX models.


## How to use

### Installation

Follow [Ryzen AI Installation](https://ryzenai.docs.amd.com/en/latest/inst.html) to prepare the environment for Ryzen AI.
Then run the following script to install the prerequisites for this model (`onnxruntime` is commented out in `requirements.txt` because it is expected to come from the Ryzen AI installation):
```sh
pip install -r requirements.txt
```
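
As a quick sanity check of the environment, you can list the execution providers visible to onnxruntime. This is a minimal sketch and assumes the Ryzen AI build of onnxruntime is already installed:
```python
# Environment sanity check: on a correctly configured Ryzen AI setup the list
# printed below should include "VitisAIExecutionProvider".
import onnxruntime as ort

print(ort.get_available_providers())
```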

### Data Preparation (optional: for accuracy evaluation)

The MS COCO 2017 dataset contains 118,287 training images and 5,000 validation images.

Download the validation set of the COCO dataset ([val2017.zip](http://images.cocodataset.org/zips/val2017.zip) and [annotations_trainval2017.zip](http://images.cocodataset.org/annotations/annotations_trainval2017.zip)).
Then unzip the files and move them into the following directory layout (or create soft links):

```plain
└── data
    └── COCO
        ├── annotations
        |   ├── instances_val2017.json
        |   └── ...
        └── val2017
            ├── 000000000139.jpg
            ├── 000000000285.jpg
            └── ...
```
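
To double-check the layout, a small sketch using `pycocotools` (already listed in `requirements.txt`) can load the validation annotations and resolve one image path; the paths below assume the `data/COCO` tree shown above:
```python
# Sanity-check the data/COCO layout expected by eval_onnx.py.
import os
from pycocotools.coco import COCO

data_dir = "data/COCO"
coco = COCO(os.path.join(data_dir, "annotations", "instances_val2017.json"))
print("images:", len(coco.getImgIds()))       # expected: 5000
print("categories:", len(coco.getCatIds()))   # expected: 80

# Resolve the path of the first validation image to confirm val2017/ is in place.
first_img = coco.loadImgs(coco.getImgIds()[0])[0]
print(os.path.join(data_dir, "val2017", first_img["file_name"]))
```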


### Test & Evaluation

- Code snippet from [`infer_onnx.py`](infer_onnx.py) showing how to use the model:
```python
args = make_parser().parse_args()
input_shape = tuple(map(int, args.input_shape.split(',')))
origin_img = cv2.imread(args.image_path)
img, ratio = preprocess(origin_img, input_shape)
if args.ipu:
    providers = ["VitisAIExecutionProvider"]
    provider_options = [{"config_file": args.provider_config}]
else:
    providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
    provider_options = None
session = ort.InferenceSession(args.model, providers=providers, provider_options=provider_options)
ort_inputs = {session.get_inputs()[0].name: img[None, :, :, :]}
outputs = session.run(None, ort_inputs)
dets = postprocess(outputs, input_shape, ratio)
if dets is not None:
    final_boxes, final_scores, final_cls_inds = dets[:, :4], dets[:, 4], dets[:, 5]
    origin_img = vis(origin_img, final_boxes, final_scores, final_cls_inds,
                     conf=args.score_thr, class_names=COCO_CLASSES)
mkdir(args.output_dir)
output_path = os.path.join(args.output_dir, os.path.basename(args.image_path))
cv2.imwrite(output_path, origin_img)
```

- Run inference for a single image:
```sh
python infer_onnx.py -m yolox-s-int8.onnx -i Path\To\Your\Image --ipu --provider_config Path\To\vaip_config.json
```
*Note: __vaip_config.json__ is provided in the Ryzen AI setup package (refer to [Installation](#installation)). To run on CPU/GPU instead, omit the `--ipu` and `--provider_config` flags.*

- Test the accuracy of the quantized model:
```sh
python eval_onnx.py -m yolox-s-int8.onnx --ipu --provider_config Path\To\vaip_config.json
```
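
The same evaluation can be driven from Python with the helpers defined in [`eval_onnx.py`](eval_onnx.py). This is a minimal sketch mirroring its `__main__` block; the `vaip_config.json` path is an assumption and should point to the file shipped with the Ryzen AI setup:
```python
# Sketch: programmatic accuracy evaluation using the helpers from eval_onnx.py.
import onnxruntime as ort
from eval_onnx import COCOEvaluator, get_eval_loader

session = ort.InferenceSession(
    "yolox-s-int8.onnx",
    providers=["VitisAIExecutionProvider"],
    provider_options=[{"config_file": "vaip_config.json"}],  # path from the Ryzen AI setup
)
val_loader = get_eval_loader(batch_size=1, data_dir="data/COCO")
evaluator = COCOEvaluator(dataloader=val_loader, img_size=(640, 640),
                          confthre=0.01, nmsthre=0.65, num_classes=80)
*_, summary = evaluator.evaluate(session)   # summary includes the COCO AP table
print(summary)
```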

### Performance

|Metric | Accuracy on IPU|
| :----: | :----: |
|AP\@0.50:0.95|0.370|


If you find YOLOX useful, please cite the paper:

```bibtex
@article{yolox2021,
  title={YOLOX: Exceeding YOLO Series in 2021},
  author={Ge, Zheng and Liu, Songtao and Wang, Feng and Li, Zeming and Sun, Jian},
  journal={arXiv preprint arXiv:2107.08430},
  year={2021}
}
```
coco.py ADDED @@ -0,0 +1,226 @@
#!/usr/bin/env python3
# -*- coding:utf-8 -*-

import os
import cv2
import numpy as np
from loguru import logger
from functools import wraps
from pycocotools.coco import COCO
from torch.utils.data.dataset import Dataset as torchDataset

COCO_CLASSES = (
    'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant',
    'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra',
    'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite',
    'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork',
    'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut',
    'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
    'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors',
    'teddy bear', 'hair drier', 'toothbrush')


def remove_useless_info(coco):
    """
    Remove useless info in coco dataset. COCO object is modified inplace.
    This function is mainly used for saving memory (saves about 30% mem).
    """
    if isinstance(coco, COCO):
        dataset = coco.dataset
        dataset.pop("info", None)
        dataset.pop("licenses", None)
        for img in dataset["images"]:
            img.pop("license", None)
            img.pop("coco_url", None)
            img.pop("date_captured", None)
            img.pop("flickr_url", None)
        if "annotations" in coco.dataset:
            for anno in coco.dataset["annotations"]:
                anno.pop("segmentation", None)


class Dataset(torchDataset):
    """ This class is a subclass of the base :class:`torch.utils.data.Dataset`,
    that enables on-the-fly resizing of the ``input_dim``.

    Args:
        input_dimension (tuple): (width, height) tuple with the default dimensions of the network
    """

    def __init__(self, input_dimension, mosaic=True):
        super().__init__()
        self.__input_dim = input_dimension[:2]
        self.enable_mosaic = mosaic

    @property
    def input_dim(self):
        """
        Dimension that can be used by transforms to set the correct image size, etc.
        This allows transforms to have a single source of truth
        for the input dimension of the network.

        Return:
            list: Tuple containing the current width, height
        """
        if hasattr(self, "_input_dim"):
            return self._input_dim
        return self.__input_dim

    @staticmethod
    def mosaic_getitem(getitem_fn):
        """
        Decorator method that needs to be used around the ``__getitem__`` method. |br|
        This decorator enables the closing of mosaic augmentation.

        Example:
            >>> class CustomSet(ln.data.Dataset):
            ...     def __len__(self):
            ...         return 10
            ...     @ln.data.Dataset.mosaic_getitem
            ...     def __getitem__(self, index):
            ...         return self.enable_mosaic
        """

        @wraps(getitem_fn)
        def wrapper(self, index):
            if not isinstance(index, int):
                self.enable_mosaic = index[0]
                index = index[1]
            ret_val = getitem_fn(self, index)
            return ret_val

        return wrapper


class COCODataset(Dataset):
    """
    COCO dataset class.
    """

    def __init__(
        self,
        data_dir='data/COCO',
        json_file="instances_train2017.json",
        name="train2017",
        img_size=(416, 416),
        preproc=None
    ):
        """
        COCO dataset initialization. Annotation data are read into memory by COCO API.
        Args:
            data_dir (str): dataset root directory
            json_file (str): COCO json file name
            name (str): COCO data name (e.g. 'train2017' or 'val2017')
            img_size (tuple(int)): target image size after pre-processing
            preproc: data augmentation strategy
        """
        super().__init__(img_size)
        self.data_dir = data_dir
        self.json_file = json_file
        self.coco = COCO(os.path.join(self.data_dir, "annotations", self.json_file))
        remove_useless_info(self.coco)
        self.ids = self.coco.getImgIds()
        self.class_ids = sorted(self.coco.getCatIds())
        self.cats = self.coco.loadCats(self.coco.getCatIds())
        self._classes = tuple([c["name"] for c in self.cats])
        self.imgs = None
        self.name = name
        self.img_size = img_size
        self.preproc = preproc
        self.annotations = self._load_coco_annotations()

    def __len__(self):
        return len(self.ids)

    def __del__(self):
        del self.imgs

    def _load_coco_annotations(self):
        return [self.load_anno_from_ids(_ids) for _ids in self.ids]

    def load_anno_from_ids(self, id_):
        im_ann = self.coco.loadImgs(id_)[0]
        width = im_ann["width"]
        height = im_ann["height"]
        anno_ids = self.coco.getAnnIds(imgIds=[int(id_)], iscrowd=False)
        annotations = self.coco.loadAnns(anno_ids)
        objs = []
        for obj in annotations:
            x1 = np.max((0, obj["bbox"][0]))
            y1 = np.max((0, obj["bbox"][1]))
            x2 = np.min((width, x1 + np.max((0, obj["bbox"][2]))))
            y2 = np.min((height, y1 + np.max((0, obj["bbox"][3]))))
            if obj["area"] > 0 and x2 >= x1 and y2 >= y1:
                obj["clean_bbox"] = [x1, y1, x2, y2]
                objs.append(obj)
        num_objs = len(objs)
        res = np.zeros((num_objs, 5))
        for ix, obj in enumerate(objs):
            cls = self.class_ids.index(obj["category_id"])
            res[ix, 0:4] = obj["clean_bbox"]
            res[ix, 4] = cls
        r = min(self.img_size[0] / height, self.img_size[1] / width)
        res[:, :4] *= r
        img_info = (height, width)
        resized_info = (int(height * r), int(width * r))
        file_name = (
            im_ann["file_name"]
            if "file_name" in im_ann
            else "{:012}".format(id_) + ".jpg"
        )
        return res, img_info, resized_info, file_name

    def load_anno(self, index):
        return self.annotations[index][0]

    def load_resized_img(self, index):
        img = self.load_image(index)
        r = min(self.img_size[0] / img.shape[0], self.img_size[1] / img.shape[1])
        resized_img = cv2.resize(
            img,
            (int(img.shape[1] * r), int(img.shape[0] * r)),
            interpolation=cv2.INTER_LINEAR,
        ).astype(np.uint8)
        return resized_img

    def load_image(self, index):
        file_name = self.annotations[index][3]
        img_file = os.path.join(self.data_dir, self.name, file_name)
        img = cv2.imread(img_file)
        assert img is not None, f"file named {img_file} not found"
        return img

    def pull_item(self, index):
        id_ = self.ids[index]
        res, img_info, resized_info, _ = self.annotations[index]
        if self.imgs is not None:
            pad_img = self.imgs[index]
            img = pad_img[: resized_info[0], : resized_info[1], :].copy()
        else:
            img = self.load_resized_img(index)
        return img, res.copy(), img_info, np.array([id_])

    @Dataset.mosaic_getitem
    def __getitem__(self, index):
        """
        One image / label pair for the given index is picked up and pre-processed.

        Args:
            index (int): data index

        Returns:
            img (numpy.ndarray): pre-processed image
            target (torch.Tensor): pre-processed label data.
                The shape is :math:`[max_labels, 5]`.
                Each label consists of [class, xc, yc, w, h]:
                    class (float): class index.
                    xc, yc (float): center of bbox whose values range from 0 to 1.
                    w, h (float): size of bbox whose values range from 0 to 1.
            img_info : tuple of h, w.
                h, w (int): original shape of the image
            img_id (int): same as the input index. Used for evaluation.
        """
        img, target, img_info, img_id = self.pull_item(index)
        if self.preproc is not None:
            img, target = self.preproc(img, target, self.input_dim)
        return img, target, img_info, img_id
demo_utils.py ADDED @@ -0,0 +1,224 @@
#!/usr/bin/env python3
# -*- coding:utf-8 -*-

import os
import cv2
import numpy as np


def mkdir(path):
    if not os.path.exists(path):
        os.makedirs(path)


def nms(boxes, scores, nms_thr):
    """Single class NMS implemented in Numpy."""
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)
        inds = np.where(ovr <= nms_thr)[0]
        order = order[inds + 1]
    return keep


def multiclass_nms(boxes, scores, nms_thr, score_thr, class_agnostic=True):
    """Multiclass NMS implemented in Numpy."""
    if class_agnostic:
        nms_method = multiclass_nms_class_agnostic
    else:
        nms_method = multiclass_nms_class_aware
    return nms_method(boxes, scores, nms_thr, score_thr)


def multiclass_nms_class_aware(boxes, scores, nms_thr, score_thr):
    """Multiclass NMS implemented in Numpy. Class-aware version."""
    final_dets = []
    num_classes = scores.shape[1]
    for cls_ind in range(num_classes):
        cls_scores = scores[:, cls_ind]
        valid_score_mask = cls_scores > score_thr
        if valid_score_mask.sum() == 0:
            continue
        else:
            valid_scores = cls_scores[valid_score_mask]
            valid_boxes = boxes[valid_score_mask]
            keep = nms(valid_boxes, valid_scores, nms_thr)
            if len(keep) > 0:
                cls_inds = np.ones((len(keep), 1)) * cls_ind
                dets = np.concatenate(
                    [valid_boxes[keep], valid_scores[keep, None], cls_inds], 1
                )
                final_dets.append(dets)
    if len(final_dets) == 0:
        return None
    return np.concatenate(final_dets, 0)


def multiclass_nms_class_agnostic(boxes, scores, nms_thr, score_thr):
    """Multiclass NMS implemented in Numpy. Class-agnostic version."""
    cls_inds = scores.argmax(1)
    cls_scores = scores[np.arange(len(cls_inds)), cls_inds]
    valid_score_mask = cls_scores > score_thr
    if valid_score_mask.sum() == 0:
        return None
    valid_scores = cls_scores[valid_score_mask]
    valid_boxes = boxes[valid_score_mask]
    valid_cls_inds = cls_inds[valid_score_mask]
    keep = nms(valid_boxes, valid_scores, nms_thr)
    dets = None  # return None when NMS keeps no boxes (avoids an unbound local)
    if keep:
        dets = np.concatenate(
            [valid_boxes[keep], valid_scores[keep, None], valid_cls_inds[keep, None]], 1
        )
    return dets


def demo_postprocess(outputs, img_size, p6=False):
    grids = []
    expanded_strides = []
    if not p6:
        strides = [8, 16, 32]
    else:
        strides = [8, 16, 32, 64]
    hsizes = [img_size[0] // stride for stride in strides]
    wsizes = [img_size[1] // stride for stride in strides]
    for hsize, wsize, stride in zip(hsizes, wsizes, strides):
        xv, yv = np.meshgrid(np.arange(wsize), np.arange(hsize))
        grid = np.stack((xv, yv), 2).reshape(1, -1, 2)
        grids.append(grid)
        shape = grid.shape[:2]
        expanded_strides.append(np.full((*shape, 1), stride))
    grids = np.concatenate(grids, 1)
    expanded_strides = np.concatenate(expanded_strides, 1)
    outputs[..., :2] = (outputs[..., :2] + grids) * expanded_strides
    outputs[..., 2:4] = np.exp(outputs[..., 2:4]) * expanded_strides
    return outputs


def vis(img, boxes, scores, cls_ids, conf=0.5, class_names=None):
    for i in range(len(boxes)):
        box = boxes[i]
        cls_id = int(cls_ids[i])
        score = scores[i]
        if score < conf:
            continue
        x0 = int(box[0])
        y0 = int(box[1])
        x1 = int(box[2])
        y1 = int(box[3])
        color = (_COLORS[cls_id] * 255).astype(np.uint8).tolist()
        text = '{}:{:.1f}%'.format(class_names[cls_id], score * 100)
        txt_color = (0, 0, 0) if np.mean(_COLORS[cls_id]) > 0.5 else (255, 255, 255)
        font = cv2.FONT_HERSHEY_SIMPLEX
        txt_size = cv2.getTextSize(text, font, 0.4, 1)[0]
        cv2.rectangle(img, (x0, y0), (x1, y1), color, 2)
        txt_bk_color = (_COLORS[cls_id] * 255 * 0.7).astype(np.uint8).tolist()
        cv2.rectangle(
            img,
            (x0, y0 + 1),
            (x0 + txt_size[0] + 1, y0 + int(1.5 * txt_size[1])),
            txt_bk_color,
            -1
        )
        cv2.putText(img, text, (x0, y0 + txt_size[1]), font, 0.4, txt_color, thickness=1)
    return img


_COLORS = np.array(
    [
        0.000, 0.447, 0.741,
        0.850, 0.325, 0.098,
        0.929, 0.694, 0.125,
        0.494, 0.184, 0.556,
        0.466, 0.674, 0.188,
        0.301, 0.745, 0.933,
        0.635, 0.078, 0.184,
        0.300, 0.300, 0.300,
        0.600, 0.600, 0.600,
        1.000, 0.000, 0.000,
        1.000, 0.500, 0.000,
        0.749, 0.749, 0.000,
        0.000, 1.000, 0.000,
        0.000, 0.000, 1.000,
        0.667, 0.000, 1.000,
        0.333, 0.333, 0.000,
        0.333, 0.667, 0.000,
        0.333, 1.000, 0.000,
        0.667, 0.333, 0.000,
        0.667, 0.667, 0.000,
        0.667, 1.000, 0.000,
        1.000, 0.333, 0.000,
        1.000, 0.667, 0.000,
        1.000, 1.000, 0.000,
        0.000, 0.333, 0.500,
        0.000, 0.667, 0.500,
        0.000, 1.000, 0.500,
        0.333, 0.000, 0.500,
        0.333, 0.333, 0.500,
        0.333, 0.667, 0.500,
        0.333, 1.000, 0.500,
        0.667, 0.000, 0.500,
        0.667, 0.333, 0.500,
        0.667, 0.667, 0.500,
        0.667, 1.000, 0.500,
        1.000, 0.000, 0.500,
        1.000, 0.333, 0.500,
        1.000, 0.667, 0.500,
        1.000, 1.000, 0.500,
        0.000, 0.333, 1.000,
        0.000, 0.667, 1.000,
        0.000, 1.000, 1.000,
        0.333, 0.000, 1.000,
        0.333, 0.333, 1.000,
        0.333, 0.667, 1.000,
        0.333, 1.000, 1.000,
        0.667, 0.000, 1.000,
        0.667, 0.333, 1.000,
        0.667, 0.667, 1.000,
        0.667, 1.000, 1.000,
        1.000, 0.000, 1.000,
        1.000, 0.333, 1.000,
        1.000, 0.667, 1.000,
        0.333, 0.000, 0.000,
        0.500, 0.000, 0.000,
        0.667, 0.000, 0.000,
        0.833, 0.000, 0.000,
        1.000, 0.000, 0.000,
        0.000, 0.167, 0.000,
        0.000, 0.333, 0.000,
        0.000, 0.500, 0.000,
        0.000, 0.667, 0.000,
        0.000, 0.833, 0.000,
        0.000, 1.000, 0.000,
        0.000, 0.000, 0.167,
        0.000, 0.000, 0.333,
        0.000, 0.000, 0.500,
        0.000, 0.000, 0.667,
        0.000, 0.000, 0.833,
        0.000, 0.000, 1.000,
        0.000, 0.000, 0.000,
        0.143, 0.143, 0.143,
        0.286, 0.286, 0.286,
        0.429, 0.429, 0.429,
        0.571, 0.571, 0.571,
        0.714, 0.714, 0.714,
        0.857, 0.857, 0.857,
        0.000, 0.447, 0.741,
        0.314, 0.717, 0.741,
        0.50, 0.5, 0
    ]
).astype(np.float32).reshape(-1, 3)
eval_onnx.py ADDED @@ -0,0 +1,444 @@
#!/usr/bin/env python3
# -*- coding:utf-8 -*-

import io
import sys
import cv2
import json
import time
import pathlib
import argparse
import tempfile
import itertools
import contextlib
import torch
import torchvision
import numpy as np
import onnxruntime as ort
from tqdm import tqdm
from loguru import logger
from tabulate import tabulate
from collections import defaultdict
from pycocotools.cocoeval import COCOeval

CURRENT_DIR = pathlib.Path(__file__).parent
sys.path.append(str(CURRENT_DIR))

from coco import COCO_CLASSES


class COCOEvaluator:
    """
    COCO AP Evaluation class. All the data in the val2017 dataset are processed
    and evaluated by COCO API.
    """

    def __init__(
        self,
        dataloader,
        img_size: int,
        confthre: float,
        nmsthre: float,
        num_classes: int,
        testdev: bool = False,
        per_class_AP: bool = False,
        per_class_AR: bool = False,
    ):
        """
        Args:
            dataloader (Dataloader): evaluate dataloader.
            img_size: image size after preprocess. Images are resized
                to squares whose shape is (img_size, img_size).
            confthre: confidence threshold ranging from 0 to 1, which
                is defined in the config file.
            nmsthre: IoU threshold of non-max suppression ranging from 0 to 1.
            num_classes: number of all classes of interest.
            testdev: whether to run on the test-dev set of COCO.
            per_class_AP: show per-class AP during evaluation or not. Defaults to False.
            per_class_AR: show per-class AR during evaluation or not. Defaults to False.
        """
        self.dataloader = dataloader
        self.img_size = img_size
        self.confthre = confthre
        self.nmsthre = nmsthre
        self.num_classes = num_classes
        self.testdev = testdev
        self.per_class_AP = per_class_AP
        self.per_class_AR = per_class_AR

    def evaluate(self, ort_sess, return_outputs=False):
        """
        COCO average precision (AP) evaluation. Iterate inference on the test dataset
        and evaluate the results by COCO API.

        NOTE: This function will change training mode to False, please save states if needed.

        Args:
            ort_sess (onnxruntime.InferenceSession): onnxruntime session to evaluate.
            return_outputs (bool): flag indicating whether to return image-wise results or not

        Returns:
            eval_results (tuple): summary of metrics for evaluation
            output_data (defaultdict): image-wise result
        """
        data_list = []
        output_data = defaultdict()
        inference_time = 0
        nms_time = 0
        n_samples = max(len(self.dataloader) - 1, 1)
        input_name = ort_sess.get_inputs()[0].name
        for cur_iter, (imgs, _, info_imgs, ids) in enumerate(tqdm(self.dataloader)):
            # with torch.no_grad():
            # skip the last iters since batchsize might be not enough for batch inference
            is_time_record = cur_iter < len(self.dataloader) - 1
            if is_time_record:
                start = time.time()
            outputs = ort_sess.run(None, {input_name: imgs.numpy()})
            outputs = [torch.Tensor(out) for out in outputs]
            outputs = head_postprocess(outputs)
            if is_time_record:
                infer_end = time.time()
                inference_time += infer_end - start
            outputs = postprocess(outputs, self.num_classes, self.confthre, self.nmsthre)
            if is_time_record:
                nms_end = time.time()
                nms_time += nms_end - infer_end
            data_list_elem, image_wise_data = self.convert_to_coco_format(
                outputs, info_imgs, ids, return_outputs=True)
            data_list.extend(data_list_elem)
            output_data.update(image_wise_data)
        statistics = [inference_time, nms_time, n_samples]
        eval_results = self.evaluate_prediction(data_list, statistics)
        if return_outputs:
            return eval_results, output_data
        return eval_results

    def convert_to_coco_format(self, outputs, info_imgs, ids, return_outputs=False):
        data_list = []
        image_wise_data = defaultdict(dict)
        for (output, img_h, img_w, img_id) in zip(
            outputs, info_imgs[0], info_imgs[1], ids
        ):
            if output is None:
                continue
            output = output.cpu()
            bboxes = output[:, 0:4]
            # preprocessing: resize
            scale = min(
                self.img_size[0] / float(img_h), self.img_size[1] / float(img_w)
            )
            bboxes /= scale
            cls = output[:, 6]
            scores = output[:, 4] * output[:, 5]
            image_wise_data.update({
                int(img_id): {
                    "bboxes": [box.numpy().tolist() for box in bboxes],
                    "scores": [score.numpy().item() for score in scores],
                    "categories": [
                        self.dataloader.dataset.class_ids[int(cls[ind])]
                        for ind in range(bboxes.shape[0])
                    ],
                }
            })
            bboxes = xyxy2xywh(bboxes)
            for ind in range(bboxes.shape[0]):
                label = self.dataloader.dataset.class_ids[int(cls[ind])]
                pred_data = {
                    "image_id": int(img_id),
                    "category_id": label,
                    "bbox": bboxes[ind].numpy().tolist(),
                    "score": scores[ind].numpy().item(),
                    "segmentation": [],
                }  # COCO json format
                data_list.append(pred_data)
        if return_outputs:
            return data_list, image_wise_data
        return data_list

    def evaluate_prediction(self, data_dict, statistics):
        # if not is_main_process():
        #     return 0, 0, None
        logger.info("Evaluate in main process...")
        annType = ["segm", "bbox", "keypoints"]
        inference_time = statistics[0]
        nms_time = statistics[1]
        n_samples = statistics[2]
        a_infer_time = 1000 * inference_time / (n_samples * self.dataloader.batch_size)
        a_nms_time = 1000 * nms_time / (n_samples * self.dataloader.batch_size)
        time_info = ", ".join(
            [
                "Average {} time: {:.2f} ms".format(k, v)
                for k, v in zip(
                    ["forward", "NMS", "inference"],
                    [a_infer_time, a_nms_time, (a_infer_time + a_nms_time)],
                )
            ]
        )
        info = time_info + "\n"
        # Evaluate the Dt (detection) json comparing with the ground truth
        if len(data_dict) > 0:
            cocoGt = self.dataloader.dataset.coco
            if self.testdev:
                json.dump(data_dict, open("./yolox_testdev_2017.json", "w"))
                cocoDt = cocoGt.loadRes("./yolox_testdev_2017.json")
            else:
                _, tmp = tempfile.mkstemp()
                json.dump(data_dict, open(tmp, "w"))
                cocoDt = cocoGt.loadRes(tmp)
            logger.info("Use standard COCOeval.")
            cocoEval = COCOeval(cocoGt, cocoDt, annType[1])
            cocoEval.evaluate()
            cocoEval.accumulate()
            redirect_string = io.StringIO()
            with contextlib.redirect_stdout(redirect_string):
                cocoEval.summarize()
            info += redirect_string.getvalue()
            cat_ids = list(cocoGt.cats.keys())
            cat_names = [cocoGt.cats[catId]['name'] for catId in sorted(cat_ids)]
            if self.per_class_AP:
                AP_table = per_class_AP_table(cocoEval, class_names=cat_names)
                info += "per class AP:\n" + AP_table + "\n"
            if self.per_class_AR:
                AR_table = per_class_AR_table(cocoEval, class_names=cat_names)
                info += "per class AR:\n" + AR_table + "\n"
            return cocoEval.stats[0], cocoEval.stats[1], info
        else:
            return 0, 0, info


class ValTransform:
    """
    Defines the transformations that should be applied to the test image
    for input into the network
    """

    def __init__(self, swap=(2, 0, 1), legacy=False):
        self.swap = swap
        self.legacy = legacy

    # assume input is cv2 img for now
    def __call__(self, img, res, input_size):
        img, _ = preproc(img, input_size, self.swap)
        if self.legacy:
            img = img[::-1, :, :].copy()
            img /= 255.0
            img -= np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
            img /= np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)
        return img, np.zeros((1, 5))


def preproc(img, input_size, swap=(2, 0, 1)):
    """Preprocess function for preparing input for the network"""
    if len(img.shape) == 3:
        padded_img = np.ones((input_size[0], input_size[1], 3), dtype=np.uint8) * 114
    else:
        padded_img = np.ones(input_size, dtype=np.uint8) * 114
    r = min(input_size[0] / img.shape[0], input_size[1] / img.shape[1])
    resized_img = cv2.resize(
        img,
        (int(img.shape[1] * r), int(img.shape[0] * r)),
        interpolation=cv2.INTER_LINEAR,
    ).astype(np.uint8)
    padded_img[: int(img.shape[0] * r), : int(img.shape[1] * r)] = resized_img
    padded_img = padded_img.transpose(swap)
    padded_img = np.ascontiguousarray(padded_img, dtype=np.float32)
    return padded_img, r


def postprocess(prediction, num_classes, conf_thre=0.7, nms_thre=0.45, class_agnostic=False):
    """Post-processing part after the prediction heads with NMS"""
    box_corner = prediction.new(prediction.shape)
    box_corner[:, :, 0] = prediction[:, :, 0] - prediction[:, :, 2] / 2
    box_corner[:, :, 1] = prediction[:, :, 1] - prediction[:, :, 3] / 2
    box_corner[:, :, 2] = prediction[:, :, 0] + prediction[:, :, 2] / 2
    box_corner[:, :, 3] = prediction[:, :, 1] + prediction[:, :, 3] / 2
    prediction[:, :, :4] = box_corner[:, :, :4]
    output = [None for _ in range(len(prediction))]
    for i, image_pred in enumerate(prediction):
        # If none are remaining => process next image
        if not image_pred.size(0):
            continue
        # Get score and class with the highest confidence
        class_conf, class_pred = torch.max(image_pred[:, 5: 5 + num_classes], 1, keepdim=True)
        conf_mask = (image_pred[:, 4] * class_conf.squeeze() >= conf_thre).squeeze()
        # Detections ordered as (x1, y1, x2, y2, obj_conf, class_conf, class_pred)
        detections = torch.cat((image_pred[:, :5], class_conf, class_pred.float()), 1)
        detections = detections[conf_mask]
        if not detections.size(0):
            continue
        if class_agnostic:
            nms_out_index = torchvision.ops.nms(
                detections[:, :4],
                detections[:, 4] * detections[:, 5],
                nms_thre,
            )
        else:
            nms_out_index = torchvision.ops.batched_nms(
                detections[:, :4],
                detections[:, 4] * detections[:, 5],
                detections[:, 6],
                nms_thre,
            )
        detections = detections[nms_out_index]
        if output[i] is None:
            output[i] = detections
        else:
            output[i] = torch.cat((output[i], detections))
    return output


def head_postprocess(outputs, strides=[8, 16, 32]):
    """Decode outputs from predictions of the detection heads"""
    hw = [x.shape[-2:] for x in outputs]
    # [batch, n_anchors_all, 85]
    outputs = torch.cat([x.flatten(start_dim=2) for x in outputs], dim=2).permute(0, 2, 1)
    outputs[..., 4:] = outputs[..., 4:].sigmoid()
    return decode_outputs(outputs, outputs[0].type(), hw, strides)


def decode_outputs(outputs, dtype, ori_hw, ori_strides):
    grids = []
    strides = []
    for (hsize, wsize), stride in zip(ori_hw, ori_strides):
        yv, xv = meshgrid([torch.arange(hsize), torch.arange(wsize)])
        grid = torch.stack((xv, yv), 2).view(1, -1, 2)
        grids.append(grid)
        shape = grid.shape[:2]
        strides.append(torch.full((*shape, 1), stride))
    grids = torch.cat(grids, dim=1).type(dtype)
    strides = torch.cat(strides, dim=1).type(dtype)
    outputs[..., :2] = (outputs[..., :2] + grids) * strides
    outputs[..., 2:4] = torch.exp(outputs[..., 2:4]) * strides
    return outputs


def xyxy2xywh(bboxes):
    bboxes[:, 2] = bboxes[:, 2] - bboxes[:, 0]
    bboxes[:, 3] = bboxes[:, 3] - bboxes[:, 1]
    return bboxes


def meshgrid(*tensors):
    _TORCH_VER = [int(x) for x in torch.__version__.split(".")[:2]]
    if _TORCH_VER >= [1, 10]:
        return torch.meshgrid(*tensors, indexing="ij")
    else:
        return torch.meshgrid(*tensors)


def per_class_AR_table(coco_eval, class_names=COCO_CLASSES, headers=["class", "AR"], colums=6):
    """Format the recall of each class"""
    per_class_AR = {}
    recalls = coco_eval.eval["recall"]
    # dimension of recalls: [TxKxAxM]
    # recall has dims (iou, cls, area range, max dets)
    assert len(class_names) == recalls.shape[1]
    for idx, name in enumerate(class_names):
        recall = recalls[:, idx, 0, -1]
        recall = recall[recall > -1]
        ar = np.mean(recall) if recall.size else float("nan")
        per_class_AR[name] = float(ar * 100)
    num_cols = min(colums, len(per_class_AR) * len(headers))
    result_pair = [x for pair in per_class_AR.items() for x in pair]
    row_pair = itertools.zip_longest(*[result_pair[i::num_cols] for i in range(num_cols)])
    table_headers = headers * (num_cols // len(headers))
    table = tabulate(
        row_pair, tablefmt="pipe", floatfmt=".3f", headers=table_headers, numalign="left",
    )
    return table


def per_class_AP_table(coco_eval, class_names=COCO_CLASSES, headers=["class", "AP"], colums=6):
    """Format the precision of each class"""
    per_class_AP = {}
    precisions = coco_eval.eval["precision"]
    # dimension of precisions: [TxRxKxAxM]
    # precision has dims (iou, recall, cls, area range, max dets)
    assert len(class_names) == precisions.shape[2]
    for idx, name in enumerate(class_names):
        # area range index 0: all area ranges
        # max dets index -1: typically 100 per image
        precision = precisions[:, :, idx, 0, -1]
        precision = precision[precision > -1]
        ap = np.mean(precision) if precision.size else float("nan")
        per_class_AP[name] = float(ap * 100)
    num_cols = min(colums, len(per_class_AP) * len(headers))
    result_pair = [x for pair in per_class_AP.items() for x in pair]
    row_pair = itertools.zip_longest(*[result_pair[i::num_cols] for i in range(num_cols)])
    table_headers = headers * (num_cols // len(headers))
    table = tabulate(
        row_pair, tablefmt="pipe", floatfmt=".3f", headers=table_headers, numalign="left",
    )
    return table


def get_eval_loader(batch_size, test_size=(640, 640), data_dir='data/COCO', data_num_workers=0, testdev=False, legacy=False):
    from coco import COCODataset
    valdataset = COCODataset(
        data_dir=data_dir,
        json_file='instances_val2017.json' if not testdev else 'instances_test2017.json',
        name="val2017" if not testdev else "test2017",
        img_size=test_size,
        preproc=ValTransform(legacy=legacy),
    )
    sampler = torch.utils.data.SequentialSampler(valdataset)
    dataloader_kwargs = {
        "num_workers": data_num_workers,
        "pin_memory": True,
        "sampler": sampler,
        "batch_size": batch_size
    }
    val_loader = torch.utils.data.DataLoader(valdataset, **dataloader_kwargs)
    return val_loader


def make_parser():
    parser = argparse.ArgumentParser("onnxruntime inference sample")
    parser.add_argument(
        "-m",
        "--model",
        type=str,
        default="yolox-s-int8.onnx",
        help="Input your onnx model.",
    )
    parser.add_argument(
        "-b",
        "--batch_size",
        type=int,
        default=1,
        help="Batch size for inference.",
    )
    parser.add_argument(
        "--input_shape",
        type=str,
        default="640,640",
        help="Specify an input shape for inference.",
    )
    parser.add_argument(
        "--ipu",
        action="store_true",
        help="Use IPU for inference.",
    )
    parser.add_argument(
        "--provider_config",
        type=str,
        default="vaip_config.json",
        help="Path of the config file for setting provider_options.",
    )
    return parser


if __name__ == '__main__':
    args = make_parser().parse_args()
    input_shape = tuple(map(int, args.input_shape.split(',')))
    if args.ipu:
        providers = ["VitisAIExecutionProvider"]
        provider_options = [{"config_file": args.provider_config}]
    else:
        providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
        provider_options = None
    session = ort.InferenceSession(args.model, providers=providers, provider_options=provider_options)
    val_loader = get_eval_loader(args.batch_size)
    evaluator = COCOEvaluator(dataloader=val_loader, img_size=input_shape, confthre=0.01, nmsthre=0.65, num_classes=80, testdev=False)
    *_, summary = evaluator.evaluate(session)
    logger.info("\n" + summary)
infer_onnx.py ADDED @@ -0,0 +1,151 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import os
import sys
import cv2
import pathlib
import argparse
import numpy as np
import onnxruntime as ort

CURRENT_DIR = pathlib.Path(__file__).parent
sys.path.append(str(CURRENT_DIR))

from coco import COCO_CLASSES
from demo_utils import mkdir, multiclass_nms, demo_postprocess, vis


def make_parser():
    parser = argparse.ArgumentParser("onnxruntime inference sample")
    parser.add_argument(
        "-m",
        "--model",
        type=str,
        default="yolox-s-int8.onnx",
        help="Input your onnx model.",
    )
    parser.add_argument(
        "-i",
        "--image_path",
        type=str,
        default='test_image.png',
        help="Path to your input image.",
    )
    parser.add_argument(
        "-o",
        "--output_dir",
        type=str,
        default='demo_output',
        help="Path to your output directory.",
    )
    parser.add_argument(
        "-s",
        "--score_thr",
        type=float,
        default=0.3,
        help="Score threshold to filter the result.",
    )
    parser.add_argument(
        "--input_shape",
        type=str,
        default="640,640",
        help="Specify an input shape for inference.",
    )
    parser.add_argument(
        "--ipu",
        action="store_true",
        help="Use IPU for inference.",
    )
    parser.add_argument(
        "--provider_config",
        type=str,
        default="vaip_config.json",
        help="Path of the config file for setting provider_options.",
    )
    return parser


def preprocess(img, input_shape, swap=(2, 0, 1)):
    """
    Preprocessing part of YOLOX for scaling and padding an image as input to the network.

    Args:
        img (numpy.ndarray): H x W x C, image read with OpenCV
        input_shape (tuple(int)): input shape of the network for inference
        swap (tuple(int)): new order of axes to transpose the input image

    Returns:
        padded_img (numpy.ndarray): preprocessed image to be fed to the network
        ratio (float): ratio for scaling the image to the input shape
    """
    if len(img.shape) == 3:
        padded_img = np.ones((input_shape[0], input_shape[1], 3), dtype=np.uint8) * 114
    else:
        padded_img = np.ones(input_shape, dtype=np.uint8) * 114
    ratio = min(input_shape[0] / img.shape[0], input_shape[1] / img.shape[1])
    resized_img = cv2.resize(
        img,
        (int(img.shape[1] * ratio), int(img.shape[0] * ratio)),
        interpolation=cv2.INTER_LINEAR,
    ).astype(np.uint8)
    padded_img[: int(img.shape[0] * ratio), : int(img.shape[1] * ratio)] = resized_img
    padded_img = padded_img.transpose(swap)
    padded_img = np.ascontiguousarray(padded_img, dtype=np.float32)
    return padded_img, ratio


def postprocess(outputs, input_shape, ratio):
    """
    Post-processing part of YOLOX for generating final results from the outputs of the network.

    Args:
        outputs (tuple(numpy.ndarray)): outputs of the detection heads from the onnxruntime session
        input_shape (tuple(int)): input shape of the network for inference
        ratio (float): ratio for scaling the image to the input shape

    Returns:
        dets (numpy.ndarray): n x 6, dets[:, :4] -> boxes, dets[:, 4] -> scores, dets[:, 5] -> class indices
    """
    outputs = [out.reshape(*out.shape[:2], -1).transpose(0, 2, 1) for out in outputs]
    outputs = np.concatenate(outputs, axis=1)
    outputs[..., 4:] = sigmoid(outputs[..., 4:])
    predictions = demo_postprocess(outputs, input_shape, p6=False)[0]
    boxes = predictions[:, :4]
    scores = predictions[:, 4:5] * predictions[:, 5:]
    boxes_xyxy = np.ones_like(boxes)
    boxes_xyxy[:, 0] = boxes[:, 0] - boxes[:, 2] / 2.
    boxes_xyxy[:, 1] = boxes[:, 1] - boxes[:, 3] / 2.
    boxes_xyxy[:, 2] = boxes[:, 0] + boxes[:, 2] / 2.
    boxes_xyxy[:, 3] = boxes[:, 1] + boxes[:, 3] / 2.
    boxes_xyxy /= ratio
    dets = multiclass_nms(boxes_xyxy, scores, nms_thr=0.45, score_thr=0.1)
    return dets


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


if __name__ == '__main__':
    args = make_parser().parse_args()
    input_shape = tuple(map(int, args.input_shape.split(',')))
    origin_img = cv2.imread(args.image_path)
    img, ratio = preprocess(origin_img, input_shape)
    if args.ipu:
        providers = ["VitisAIExecutionProvider"]
        provider_options = [{"config_file": args.provider_config}]
    else:
        providers = ['CUDAExecutionProvider', 'CPUExecutionProvider']
        provider_options = None
    session = ort.InferenceSession(args.model, providers=providers, provider_options=provider_options)
    ort_inputs = {session.get_inputs()[0].name: img[None, :, :, :]}
    outputs = session.run(None, ort_inputs)
    dets = postprocess(outputs, input_shape, ratio)
    if dets is not None:
        final_boxes, final_scores, final_cls_inds = dets[:, :4], dets[:, 4], dets[:, 5]
        origin_img = vis(origin_img, final_boxes, final_scores, final_cls_inds,
                         conf=args.score_thr, class_names=COCO_CLASSES)
    mkdir(args.output_dir)
    output_path = os.path.join(args.output_dir, os.path.basename(args.image_path))
    cv2.imwrite(output_path, origin_img)
requirements.txt ADDED @@ -0,0 +1,9 @@
torch>=1.12.0
torchvision>=0.13.0
opencv_python
numpy
loguru
tqdm
tabulate
pycocotools>=2.0.2
# onnxruntime
yolox-s-int8.onnx ADDED (Git LFS pointer) @@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:87154c9d3bd7ce411b03e2ff7c124a6f2f8bf2b6191049d633d2332659fb0d41
size 35988727