Enhancement of SSD by concatenating feature maps for object detection
Jisoo Jeong, Hyojin Park and Nojun Kwak
Abstract
We propose an object detection method that improves the accuracy of the conventional SSD (Single Shot Multibox Detector), one of the top object detection algorithms in terms of both accuracy and speed. The performance of a deep network is known to improve as the number of feature maps increases. However, it is difficult to improve performance simply by raising the number of feature maps. In this paper, we propose and analyze how to use feature maps effectively to improve the performance of the conventional SSD. The improvement was obtained by changing the structure close to the classifier network, rather than by growing the layers close to the input data (e.g., by replacing VGGNet with ResNet). The proposed network is suitable for sharing the weights in the classifier networks, so that training can be faster, with better generalization power. On the Pascal VOC 2007 test set, trained with the VOC 2007 and VOC 2012 training sets, the proposed network with an input size of 300×300 achieved 78.5% mAP (mean average precision) at 35.0 FPS (frames per second), while the network with a 512×512 input achieved 80.8% mAP at 16.6 FPS on an Nvidia Titan X GPU. The proposed network shows state-of-the-art mAP, better than those of the conventional SSD, YOLO, Faster R-CNN and R-FCN.
Session
Posters
Files
Paper (PDF)
DOI
10.5244/C.31.76
https://dx.doi.org/10.5244/C.31.76
Citation
Jisoo Jeong, Hyojin Park and Nojun Kwak. Enhancement of SSD by concatenating feature maps for object detection. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 76.1-76.12. BMVA Press, September 2017.
Bibtex
@inproceedings{BMVC2017_76,
title={Enhancement of SSD by concatenating feature maps for object detection},
author={Jisoo Jeong and Hyojin Park and Nojun Kwak},
year={2017},
month={September},
pages={76.1--76.12},
articleno={76},
numpages={12},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Tae-Kyun Kim and Stefanos Zafeiriou and Gabriel Brostow and Krystian Mikolajczyk},
doi={10.5244/C.31.76},
isbn={1-901725-60-X},
url={https://dx.doi.org/10.5244/C.31.76}
}