Human Pose as Context for Object Detection

Abhilash Srikantha and Juergen Gall

Abstract

Detecting small objects in images is a challenging problem, particularly when they are occluded by hands or other body parts. Recently, joint modelling of human pose and objects has been proposed to improve both pose estimation and object detection. These approaches, however, focus on explicit interaction with an object and lack the flexibility to combine both modalities when interaction is not obvious. We therefore propose to use human pose as additional context information for object detection. To this end, we represent an object category by a tree model and train regression forests that localize parts of an object for each modality separately. Predictions of the two modalities are then combined to detect the bounding box of the object. We evaluate our approach on three challenging datasets, which vary in the amount of object interaction and the quality of automatically extracted human poses.

Session

Poster 2

Files

Extended Abstract (PDF, 5M)
Paper (PDF, 6M)

DOI

10.5244/C.29.101
https://dx.doi.org/10.5244/C.29.101

Citation

Abhilash Srikantha and Juergen Gall. Human Pose as Context for Object Detection. In Xianghua Xie, Mark W. Jones, and Gary K. L. Tam, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 101.1-101.11. BMVA Press, September 2015.

Bibtex

@inproceedings{BMVC2015_101,
	title={Human Pose as Context for Object Detection},
	author={Abhilash Srikantha and Juergen Gall},
	year={2015},
	month={September},
	pages={101.1--101.11},
	articleno={101},
	numpages={11},
	booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
	publisher={BMVA Press},
	editor={Xianghua Xie and Mark W. Jones and Gary K. L. Tam},
	doi={10.5244/C.29.101},
	isbn={1-901725-53-7},
	url={https://dx.doi.org/10.5244/C.29.101}
}