Sparse Deep Feature Representation for Object Detection from Wearable Cameras
Quanfu Fan and Richard Chen
Abstract
We propose a novel sparse feature representation for the Faster R-CNN framework
and apply it to object detection from wearable cameras. Two main ideas, sparse convolution and sparse ROI pooling, are developed to reduce both model complexity and
computational cost. Sparse convolution approximates a full kernel by skipping weights
in the kernel, while sparse ROI pooling reduces feature dimensionality at the
ROI pooling layer by skipping odd-indexed or even-indexed features. We demonstrate
the effectiveness of our approach on two challenging body camera datasets including
realistic police-generated clips.
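The two ideas in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names, the alternating (every-other-weight) skipping pattern, and the tensor shapes are all assumptions for demonstration.

```python
import numpy as np

def sparse_kernel(kernel, keep="even"):
    """Approximate a full conv kernel by skipping weights: zero out every
    other weight in the flattened kernel. (The exact skipping pattern is
    an assumption; the paper's pattern may differ.)"""
    mask = np.zeros(kernel.size)
    mask[(0 if keep == "even" else 1)::2] = 1.0
    return kernel * mask.reshape(kernel.shape)

def sparse_roi_pool(pooled, keep="even"):
    """Halve the pooled ROI feature dimension by keeping only
    even-indexed or odd-indexed features after flattening."""
    flat = pooled.reshape(pooled.shape[0], -1)      # (num_rois, features)
    return flat[:, (0 if keep == "even" else 1)::2]

kernel = np.ones((3, 3))
print(np.count_nonzero(sparse_kernel(kernel)))      # 5 of 9 weights kept

pooled = np.random.rand(4, 256, 7, 7)               # hypothetical: 4 ROIs, 256 channels, 7x7 bins
print(sparse_roi_pool(pooled).shape)                # (4, 6272)
```

Dropping every other feature halves the dimensionality of the ROI representation, which is where the claimed savings in model complexity and computation come from.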
Session
Posters
Files
Paper (PDF)
Supplementary (PDF)
DOI
10.5244/C.31.163
https://dx.doi.org/10.5244/C.31.163
Citation
Quanfu Fan and Richard Chen. Sparse Deep Feature Representation for Object Detection from Wearable Cameras. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 163.1-163.12. BMVA Press, September 2017.
Bibtex
@inproceedings{BMVC2017_163,
title={Sparse Deep Feature Representation for Object Detection from Wearable Cameras},
author={Quanfu Fan and Richard Chen},
year={2017},
month={September},
pages={163.1-163.12},
articleno={163},
numpages={12},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Tae-Kyun Kim, Stefanos Zafeiriou, Gabriel Brostow and Krystian Mikolajczyk},
doi={10.5244/C.31.163},
isbn={1-901725-60-X},
url={https://dx.doi.org/10.5244/C.31.163}
}