Spatio-Temporal Action Detection with Cascade Proposal and Location Anticipation

Zhenheng Yang, Jiyang Gao and Ram Nevatia

Abstract

In this work, we address the problem of spatio-temporal action detection in temporally untrimmed videos. It is a challenging task, and accurately localizing human actions in both time and space is important for analyzing large-scale video data. To tackle this problem, we propose a cascade proposal and location anticipation (CPLA) model for frame-level action detection. Our model has two salient points: (1) a cascade region proposal network (casRPN) is adopted for action proposal generation and shows better localization accuracy than a single region proposal network (RPN); (2) the spatio-temporal consistency of actions is exploited via a location anticipation network (LAN), so frame-level action detection is not conducted independently for each frame. Frame-level detections are then linked by solving a linking score maximization problem, and temporally trimmed into spatio-temporal action tubes.
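
The linking step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the linking score, so the sketch assumes a commonly used form: the sum of per-frame detection confidences plus the IoU overlap between boxes in consecutive frames, maximized with Viterbi-style dynamic programming. The names `iou`, `link_detections` and `overlap_weight` are hypothetical and are not taken from the paper, and temporal trimming is not covered.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames: List[List[Detection]],
                    overlap_weight: float = 1.0) -> List[int]:
    """Pick one detection index per frame so that the assumed linking score
    (sum of confidences + overlap_weight * IoU between consecutive boxes)
    is maximal. This scoring form is an assumption, not the paper's."""
    if not frames or any(len(f) == 0 for f in frames):
        raise ValueError("every frame needs at least one detection")

    # best[j]: best accumulated score of any path ending at detection j
    best = [score for _, score in frames[0]]
    back: List[List[int]] = []  # back-pointers for frames 1..T-1

    for t in range(1, len(frames)):
        new_best, pointers = [], []
        for box, score in frames[t]:
            candidates = [
                best[i] + overlap_weight * iou(prev_box, box)
                for i, (prev_box, _) in enumerate(frames[t - 1])
            ]
            i_star = max(range(len(candidates)), key=candidates.__getitem__)
            new_best.append(candidates[i_star] + score)
            pointers.append(i_star)
        best = new_best
        back.append(pointers)

    # Trace the highest-scoring path back from the last frame.
    j = max(range(len(best)), key=best.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path
```

Calling `link_detections(frames)` on a list of per-frame detections returns one detection index per frame; the selected boxes form a single candidate action tube under the assumed score.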

Session

Orals - Action Recognition

Files

Paper (PDF)
Supplementary (PDF)

DOI

10.5244/C.31.95
https://dx.doi.org/10.5244/C.31.95

Citation

Zhenheng Yang, Jiyang Gao and Ram Nevatia. Spatio-Temporal Action Detection with Cascade Proposal and Location Anticipation. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 95.1-95.12. BMVA Press, September 2017.

Bibtex

            @inproceedings{BMVC2017_95,
                title={Spatio-Temporal Action Detection with Cascade Proposal and Location Anticipation},
                author={Zhenheng Yang and Jiyang Gao and Ram Nevatia},
                year={2017},
                month={September},
                pages={95.1-95.12},
                articleno={95},
                numpages={12},
                booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
                publisher={BMVA Press},
                editor={Tae-Kyun Kim and Stefanos Zafeiriou and Gabriel Brostow and Krystian Mikolajczyk},
                doi={10.5244/C.31.95},
                isbn={1-901725-60-X},
                url={https://dx.doi.org/10.5244/C.31.95}
            }