Real-Time Temporal Action Localization in Untrimmed Videos by Sub-Action Discovery
Rui Hou, Rahul Sukthankar and Mubarak Shah
Abstract
This paper presents a computationally efficient approach for temporal action detection in untrimmed videos that outperforms state-of-the-art methods by a large margin. We exploit the temporal structure of actions by modeling an action as a sequence of sub-actions. A novel and fully automatic sub-action discovery algorithm is proposed, where the number of sub-actions for each action as well as their types are automatically determined from the training videos. We find that the discovered sub-actions are semantically meaningful. To localize an action, an objective function combining appearance, duration and temporal structure of sub-actions is optimized as a shortest path problem in a network flow formulation. A significant benefit of the proposed approach is that it enables real-time action localization (40 fps) in untrimmed videos.
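The abstract does not spell out the objective, so the following is only an illustrative sketch of how localization over ordered sub-actions can be posed as a shortest path in a layered directed acyclic graph. It is not the authors' implementation. Everything named here is an assumption made for the example: the function `localize`, the per-frame score matrix `scores`, the expected durations `mean_dur`, the quadratic duration penalty weighted by `dur_weight`, and the `bias` that turns high appearance scores into negative edge costs. Node (k, t) means "the first k sub-actions end at frame t"; the fixed layer order encodes the temporal structure.

```python
def localize(scores, mean_dur, dur_weight=0.1, bias=0.5):
    """Return (cost, boundaries) of the cheapest ordered segmentation.

    scores[k][t]: appearance score of sub-action k at frame t (hypothetical input).
    mean_dur[k]:  expected length of sub-action k in frames (hypothetical input).
    boundaries:   K+1 frame indices; sub-action k covers
                  [boundaries[k], boundaries[k + 1]).
    Assumes the video has at least as many frames as sub-actions.
    """
    K, T = len(scores), len(scores[0])

    # Prefix sums of per-frame cost (bias - score), so the appearance cost
    # of any segment is a constant-time difference.
    prefix = []
    for row in scores:
        acc = [0.0]
        for s in row:
            acc.append(acc[-1] + (bias - s))
        prefix.append(acc)

    INF = float("inf")
    # dist[k][t]: cost of the cheapest path from the source to node (k, t).
    dist = [[INF] * (T + 1) for _ in range(K + 1)]
    back = [[None] * (T + 1) for _ in range(K + 1)]
    dist[0] = [0.0] * (T + 1)  # the action may start at any frame for free

    # Edges only go from layer k to layer k + 1, so visiting layers in order
    # is a topological order and negative edge costs are safe.
    for k in range(K):
        for t0 in range(T):
            if dist[k][t0] == INF:
                continue
            for t1 in range(t0 + 1, T + 1):
                appearance = prefix[k][t1] - prefix[k][t0]
                duration = dur_weight * (t1 - t0 - mean_dur[k]) ** 2
                cand = dist[k][t0] + appearance + duration
                if cand < dist[k + 1][t1]:
                    dist[k + 1][t1] = cand
                    back[k + 1][t1] = t0

    # The action may end at any frame: pick the cheapest node in the last layer.
    end = min(range(T + 1), key=lambda t: dist[K][t])
    bounds = [end]
    for k in range(K, 0, -1):
        bounds.append(back[k][bounds[-1]])
    bounds.reverse()
    return dist[K][end], bounds


if __name__ == "__main__":
    toy_scores = [
        [0, 0, 1, 1, 1, 0, 0, 0],  # sub-action 0 is active in frames 2-4
        [0, 0, 0, 0, 0, 1, 1, 0],  # sub-action 1 is active in frames 5-6
    ]
    print(localize(toy_scores, mean_dur=[3, 2]))
```

This sketch enumerates every segment length and therefore runs in O(K·T²) time. Capping the maximum segment length would bring it closer to linear in the number of frames, which is the kind of cost needed for real-time use.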
Session
Orals - Action Recognition
Files
Paper (PDF)
Supplementary (PDF)
DOI
10.5244/C.31.91
https://dx.doi.org/10.5244/C.31.91
Citation
Rui Hou, Rahul Sukthankar and Mubarak Shah. Real-Time Temporal Action Localization in Untrimmed Videos by Sub-Action Discovery. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 91.1-91.12. BMVA Press, September 2017.
Bibtex
@inproceedings{BMVC2017_91,
title={Real-Time Temporal Action Localization in Untrimmed Videos by Sub-Action Discovery},
author={Rui Hou and Rahul Sukthankar and Mubarak Shah},
year={2017},
month={September},
pages={91.1--91.12},
articleno={91},
numpages={12},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Tae-Kyun Kim and Stefanos Zafeiriou and Gabriel Brostow and Krystian Mikolajczyk},
doi={10.5244/C.31.91},
isbn={1-901725-60-X},
url={https://dx.doi.org/10.5244/C.31.91}
}