Performance Enhancement of YOLOv3 by Adding Prediction Layers with Spatial Pyramid Pooling for Vehicle Detection
Kwang-Ju Kim, Pyong-Kun Kim, Yun-Su Chung, Doo-Hyun Choi
Abstract
In recent years, vision-based object detection methods using convolutional neural networks (CNNs) have been very successful. However, detection based on CNN features has the disadvantage that many feature maps must be generated to be robust against scale change and occlusion of the object. Moreover, simply increasing the number of feature maps does not improve performance. We propose a multi-scale vehicle detection method with spatial pyramid pooling that improves the conventional YOLOv3 algorithm so that it is robust to vehicle scale change and occlusion. The proposed method was evaluated on the UA-DETRAC benchmark and obtains state-of-the-art mAP, outperforming DPM, ACF, R-CNN, CompACT, NANO, SA-FRCNN, and Faster-RCNN2.
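To illustrate the spatial pyramid pooling idea mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes the variant in which a feature map is max-pooled at several window sizes with stride 1 and the results are concatenated with the input along the channel axis. The kernel sizes (5, 9, 13), the names `max_pool_same` and `spp_block`, and the nested-list feature map layout are all illustrative assumptions; the abstract does not specify them.

```python
from typing import List, Sequence

# Assumed layout for this sketch: feature_map[channel][row][col]
Channel = List[List[float]]
FeatureMap = List[Channel]


def max_pool_same(channel: Channel, kernel: int) -> Channel:
    """Stride-1 max pooling whose window is clipped at the borders,
    so the output has the same height and width as the input."""
    height, width = len(channel), len(channel[0])
    radius = kernel // 2
    pooled = []
    for r in range(height):
        r0, r1 = max(0, r - radius), min(height, r + radius + 1)
        row_out = []
        for c in range(width):
            c0, c1 = max(0, c - radius), min(width, c + radius + 1)
            row_out.append(
                max(channel[i][j] for i in range(r0, r1) for j in range(c0, c1))
            )
        pooled.append(row_out)
    return pooled


def spp_block(features: FeatureMap, kernels: Sequence[int] = (5, 9, 13)) -> FeatureMap:
    """Concatenate the input with its pooled versions along the channel axis.

    The kernel sizes are an assumption for illustration only.
    """
    out = [[row[:] for row in ch] for ch in features]  # copy of the input channels
    for k in kernels:
        out.extend(max_pool_same(ch, k) for ch in features)
    return out
```

With C input channels and K kernel sizes, the sketch returns C × (K + 1) channels at the original spatial resolution, so each location carries context from several receptive-field sizes. This is one plausible way such a block could help with scale variation.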
