Detecting Parts for Action Localization

Nicolas Chesneau, Gregory Rogez, Karteek Alahari and Cordelia Schmid

Abstract

In this paper, we propose a new framework for action localization that tracks people in videos and extracts full-body human tubes, i.e., spatio-temporal regions localizing actions, even in the case of occlusions or truncations. This is achieved by training a novel human part detector that scores visible parts while regressing full-body bounding boxes. The core of our method is a convolutional neural network which learns part proposals specific to certain body parts. These are then combined to detect people robustly in each frame. Our tracking algorithm connects the image detections temporally to extract full-body human tubes.
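The last step, linking per-frame detections into a spatio-temporal tube, can be illustrated with a minimal sketch. The greedy strategy, the IoU overlap criterion, and the `iou_thresh` parameter below are illustrative assumptions for exposition, not the paper's exact tracking algorithm.

```python
# Hedged sketch: greedily link per-frame person detections into one
# spatio-temporal tube. Each detection is ((x1, y1, x2, y2), score).
# The IoU threshold and greedy linking rule are assumptions, not the
# paper's exact method.

def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def link_tube(detections_per_frame, iou_thresh=0.3):
    """Extend a tube frame by frame: start from the highest-scoring
    detection, then at each frame pick the detection that best combines
    its own score with overlap against the previous tube box."""
    tube = []
    for frame_dets in detections_per_frame:
        if not frame_dets:
            continue  # no detection this frame; skip
        if not tube:
            tube.append(max(frame_dets, key=lambda d: d[1]))
            continue
        prev_box = tube[-1][0]
        candidates = [d for d in frame_dets
                      if iou(prev_box, d[0]) >= iou_thresh]
        if candidates:
            tube.append(max(candidates,
                            key=lambda d: d[1] + iou(prev_box, d[0])))
    return tube
```

In this toy setting the overlap constraint keeps the tube on the same person even when a different, higher-scoring detection appears elsewhere in a later frame.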

Session

Posters

Files

Paper (PDF)

DOI

10.5244/C.31.51
https://dx.doi.org/10.5244/C.31.51

Citation

Nicolas Chesneau, Gregory Rogez, Karteek Alahari and Cordelia Schmid. Detecting Parts for Action Localization. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 51.1-51.12. BMVA Press, September 2017.

Bibtex

            @inproceedings{BMVC2017_51,
                title={Detecting Parts for Action Localization},
                author={Nicolas Chesneau and Gregory Rogez and Karteek Alahari and Cordelia Schmid},
                year={2017},
                month={September},
                pages={51.1--51.12},
                articleno={51},
                numpages={12},
                booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
                publisher={BMVA Press},
                editor={Tae-Kyun Kim and Stefanos Zafeiriou and Gabriel Brostow and Krystian Mikolajczyk},
                doi={10.5244/C.31.51},
                isbn={1-901725-60-X},
                url={https://dx.doi.org/10.5244/C.31.51}
            }