Upper Body Pose Estimation with Temporal Sequential Forests

James Charles, Tomas Pfister, Derek Magee, David Hogg and Andrew Zisserman

In Proceedings British Machine Vision Conference 2014
http://dx.doi.org/10.5244/C.28.54

Abstract

Our objective is to efficiently and accurately estimate human upper body pose in gesture videos. To this end, we build on the recent successful applications of random forest (RF) classifiers and regressors, and develop a pose estimation model with the following novelties: (i) the joints are estimated sequentially, taking account of the human kinematic chain. This means that we do not have to make the simplifying assumption of most previous RF methods, namely that the joints are estimated independently; (ii) by combining both classifiers (as a mixture of experts) and regressors, we show that the learning problem is tractable and that more context can be taken into account; and (iii) dense optical flow is used to align multiple expert joint position proposals from nearby frames, and thereby improve the robustness of the estimates. The resulting method is computationally efficient and can overcome a number of the errors (e.g. confusing left/right hands) made by RF pose estimators that infer joint locations independently. We show that we improve over the state of the art on upper body pose estimation for two public datasets: the BBC TV Signing dataset and the ChaLearn Gesture Recognition dataset.
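
As a rough illustration of novelty (iii), the sketch below moves joint position proposals from neighbouring frames into a target frame along a dense flow field and then combines them. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `warp_proposal` and `fuse_proposals`, the nearest-pixel flow lookup, and the median fusion rule are all hypothetical choices made for this example, and the flow fields are assumed to be precomputed arrays of shape (H, W, 2).

```python
import numpy as np


def warp_proposal(point, flow):
    """Move an (x, y) joint proposal along a dense flow field.

    Assumption: flow has shape (H, W, 2) and stores the (dx, dy)
    displacement carrying each pixel of a neighbouring frame into
    the target frame.
    """
    h, w = flow.shape[:2]
    # Look up the displacement at the nearest pixel, clamped to the image.
    x = int(np.clip(int(round(float(point[0]))), 0, w - 1))
    y = int(np.clip(int(round(float(point[1]))), 0, h - 1))
    dx, dy = flow[y, x]
    return np.array([point[0] + dx, point[1] + dy], dtype=float)


def fuse_proposals(proposals, flows):
    """Warp every neighbouring-frame proposal into the target frame and
    combine them. The per-coordinate median is an assumed fusion rule,
    chosen here only because it tolerates a single outlying proposal."""
    warped = [warp_proposal(p, f) for p, f in zip(proposals, flows)]
    return np.median(np.stack(warped), axis=0)


if __name__ == "__main__":
    h, w = 10, 10
    # Synthetic flows: previous frame shifts right, next frame shifts left.
    flow_prev = np.zeros((h, w, 2)); flow_prev[..., 0] = 2.0
    flow_curr = np.zeros((h, w, 2))
    flow_next = np.zeros((h, w, 2)); flow_next[..., 0] = -2.0
    proposals = [(3.0, 5.0), (5.0, 5.0), (9.0, 5.0)]
    print(fuse_proposals(proposals, [flow_prev, flow_curr, flow_next]))
```

With these synthetic inputs, two of the three warped proposals agree on the same location, so the median ignores the third, which is the kind of robustness gain the abstract attributes to combining proposals from nearby frames.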

Session

Poster Session

Files

Extended Abstract (PDF, 1 page, 573K)

Paper (PDF, 12 pages, 1.1M)

BibTeX File

Citation

James Charles, Tomas Pfister, Derek Magee, David Hogg, and Andrew Zisserman. Upper Body Pose Estimation with Temporal Sequential Forests. In Proceedings of the British Machine Vision Conference. BMVA Press, September 2014.

BibTeX

@inproceedings{BMVC.28.54,
	title = {Upper Body Pose Estimation with Temporal Sequential Forests},
	author = {Charles, James and Pfister, Tomas and Magee, Derek and Hogg, David and Zisserman, Andrew},
	year = {2014},
	booktitle = {Proceedings of the British Machine Vision Conference},
	publisher = {BMVA Press},
	editor = {Valstar, Michel and French, Andrew and Pridmore, Tony},
	doi = {10.5244/C.28.54},
	url = {http://dx.doi.org/10.5244/C.28.54}
}