Total Capture: 3D Human Pose Estimation Fusing Video and Inertial Sensors

Matthew Trumble, Andrew Gilbert, Charles Malleson, Adrian Hilton and John Collomosse

Abstract

We present an algorithm for fusing multi-viewpoint video (MVV) with inertial measurement unit (IMU) sensor data to accurately estimate 3D human pose. A 3D convolutional neural network learns a pose embedding from volumetric probabilistic visual hull (PVH) data derived from the MVV frames. We incorporate this model within a dual-stream network that integrates pose embeddings derived from MVV with a forward kinematic solve of the IMU data. A temporal model (LSTM) is incorporated within both streams prior to their fusion. Hybrid pose inference using these two complementary data sources is shown to resolve ambiguities within each sensor modality, yielding improved accuracy over prior methods. A further contribution of this work is a new hybrid MVV dataset (TotalCapture) comprising video, IMU and skeletal joint ground truth derived from a commercial motion capture system. The dataset is available online at http://cvssp.org/data/totalcapture/.
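
The forward kinematic solve mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each IMU supplies a global (world-frame) orientation for one bone, that joints are listed so every parent precedes its children, and that the three-joint arm, its bone lengths, and the function names (`rot_z`, `mat_vec`, `forward_kinematics`) are hypothetical choices for illustration only.

```python
import math


def rot_z(deg):
    """3x3 rotation matrix about the z axis (angle in degrees)."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]


def forward_kinematics(parents, offsets, global_rots, root_pos=(0.0, 0.0, 0.0)):
    """Accumulate joint positions from per-bone global orientations.

    parents[j]     : index of joint j's parent (-1 for the root, joint 0)
    offsets[j]     : rest-pose vector from the parent joint to joint j,
                     expressed in the parent bone's local frame
    global_rots[j] : world-frame orientation of the bone rooted at joint j
                     (assumed here to come from an IMU on that bone)
    """
    positions = [list(root_pos)]
    for j in range(1, len(parents)):
        p = parents[j]
        # Rotate the rest-pose offset into the world frame, then add it
        # to the already-computed parent position.
        step = mat_vec(global_rots[p], offsets[j])
        positions.append([positions[p][k] + step[k] for k in range(3)])
    return positions


# Hypothetical arm: shoulder -> elbow -> wrist, bone lengths in metres.
parents = [-1, 0, 1]
offsets = [[0.0, 0.0, 0.0], [0.3, 0.0, 0.0], [0.25, 0.0, 0.0]]
# Upper arm rotated 90 degrees about z, forearm unrotated in the world frame.
global_rots = [rot_z(90), rot_z(0), rot_z(0)]

joints = forward_kinematics(parents, offsets, global_rots)
```

In this toy configuration the elbow lies 0.3 m along the y axis and the wrist a further 0.25 m along the x axis. In the paper's pipeline, joint positions obtained this way form the IMU stream that is fused with the video-derived pose embedding.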

Session

Orals - Pose Estimation

Files

Paper (PDF)
Supplementary (PDF)

DOI

10.5244/C.31.14
https://dx.doi.org/10.5244/C.31.14

Citation

Matthew Trumble, Andrew Gilbert, Charles Malleson, Adrian Hilton and John Collomosse. Total Capture: 3D Human Pose Estimation Fusing Video and Inertial Sensors. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 14.1-14.13. BMVA Press, September 2017.

Bibtex

            @inproceedings{BMVC2017_14,
                title={Total Capture: 3D Human Pose Estimation Fusing Video and Inertial Sensors},
                author={Matthew Trumble and Andrew Gilbert and Charles Malleson and Adrian Hilton and John Collomosse},
                year={2017},
                month={September},
                pages={14.1--14.13},
                articleno={14},
                numpages={13},
                booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
                publisher={BMVA Press},
                editor={Tae-Kyun Kim and Stefanos Zafeiriou and Gabriel Brostow and Krystian Mikolajczyk},
                doi={10.5244/C.31.14},
                isbn={1-901725-60-X},
                url={https://dx.doi.org/10.5244/C.31.14}
            }