Weakly-supervised Learning of Mid-level Features for Pedestrian Attribute Recognition and Localization
Yang Zhou, Kai Yu, Biao Leng, Zhang Zhang, Dangwei Li and Kaiqi Huang
Abstract
Most existing methods for pedestrian attribute recognition in video surveillance formulate the task as multi-label image classification, while attribute localization is usually disregarded because of low image quality and large variations in camera viewpoint and human pose. In this paper, we propose a weakly-supervised learning approach that performs multi-attribute classification and localization simultaneously, without requiring bounding-box annotations of attributes. First, a set of mid-level attribute features is discovered by a multi-scale attribute-aware module that receives the outputs of multiple inception layers in a deep Convolutional Neural Network (CNN), e.g., GoogLeNet, where a Flexible Spatial Pyramid Pooling (FSPP) operation is performed to acquire the activation maps of the attribute features. Subsequently, attribute labels are predicted by a fully-connected layer that performs regression between the response magnitudes in the activation maps and the image-level attribute annotations. Finally, the locations of pedestrian attributes are inferred by fusing the multiple activation maps, where the fusion weights are estimated as the correlation strengths between attributes and relevant mid-level features. To validate the proposed approach, extensive experiments are performed on the two currently largest pedestrian attribute datasets, i.e., the PETA dataset [4] and the RAP dataset [10]. Compared with other state-of-the-art methods, the approach achieves competitive performance on attribute classification. The additional capability of attribute localization is also evaluated.
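The pipeline described in the abstract (pool activation maps into response magnitudes, predict attributes with a fully-connected layer, then fuse the maps to localize an attribute) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes all activation maps have already been resized to a common grid, replaces FSPP with plain global max pooling, and uses the positive fully-connected weights of an attribute as a stand-in for the "correlation strengths" used as fusion weights. All function and variable names are hypothetical.

```python
import numpy as np


def pool_responses(activation_maps):
    """Global max pooling over each map (a simplification of FSPP).

    activation_maps: array of shape (K, H, W), one map per mid-level feature.
    Returns an array of shape (K,) holding one response magnitude per feature.
    """
    k = activation_maps.shape[0]
    return activation_maps.reshape(k, -1).max(axis=1)


def predict_attributes(responses, fc_weights, fc_bias):
    """Fully-connected layer plus sigmoid, giving one score per attribute.

    fc_weights: shape (A, K); fc_bias: shape (A,).
    """
    logits = fc_weights @ responses + fc_bias
    return 1.0 / (1.0 + np.exp(-logits))


def localize_attribute(activation_maps, fc_weights, attribute_index):
    """Fuse the activation maps into one localization map for an attribute.

    Assumption: fusion weights are the attribute's positive FC weights,
    so only positively correlated mid-level features contribute.
    """
    weights = np.maximum(fc_weights[attribute_index], 0.0)
    fused = np.tensordot(weights, activation_maps, axes=1)  # shape (H, W)
    peak = np.unravel_index(np.argmax(fused), fused.shape)  # (row, col)
    return fused, peak
```

In this sketch the same weight matrix serves both classification and localization, which is what lets image-level labels alone drive the localization; the paper's actual weighting scheme may differ.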
Session
Posters
Files
Paper (PDF)
Supplementary (PDF)
DOI
10.5244/C.31.69
https://dx.doi.org/10.5244/C.31.69
Citation
Yang Zhou, Kai Yu, Biao Leng, Zhang Zhang, Dangwei Li and Kaiqi Huang. Weakly-supervised Learning of Mid-level Features for Pedestrian Attribute Recognition and Localization. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 69.1-69.12. BMVA Press, September 2017.
Bibtex
@inproceedings{BMVC2017_69,
title={Weakly-supervised Learning of Mid-level Features for Pedestrian Attribute Recognition and Localization},
author={Yang Zhou and Kai Yu and Biao Leng and Zhang Zhang and Dangwei Li and Kaiqi Huang},
year={2017},
month={September},
pages={69.1-69.12},
articleno={69},
numpages={12},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Tae-Kyun Kim and Stefanos Zafeiriou and Gabriel Brostow and Krystian Mikolajczyk},
doi={10.5244/C.31.69},
isbn={1-901725-60-X},
url={https://dx.doi.org/10.5244/C.31.69}
}