Object-Extent Pooling for Weakly Supervised Single-Shot Localization

Amogh Gudi, Nicolai van Rosmalen, Marco Loog and Jan van Gemert

Abstract

When detailed training annotations are scarce, the ability to perform object localization in real time with weak supervision is very valuable. However, the computational cost of generating and evaluating region proposals is high. We adapt the concept of Class Activation Maps (CAM) [28] into the very first weakly supervised ‘single-shot’ detector that does not require the use of region proposals. To facilitate this, we propose a novel global pooling technique called Spatial Pyramid Averaged Max (SPAM) pooling for training this CAM-based network for object extent localization with only weak image-level supervision. We show that this global pooling layer possesses a near-ideal flow of gradients for extent localization and offers a good trade-off between the extremes of max and average pooling. Our approach requires only a single network pass and uses a fast backprojection technique, completely omitting any region proposal steps. To the best of our knowledge, this is the first approach to do so.
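The trade-off between max and average pooling mentioned in the abstract can be illustrated with a small sketch. This is only one plausible reading of the name "Spatial Pyramid Averaged Max", not the definition given in the paper: it assumes that each pyramid level splits the activation map into a g x g grid, max-pools every cell, averages the cell maxima, and that the per-level results are then averaged. The function names and the choice of levels (1, 2, 4) are hypothetical.

```python
def global_max_pool(fmap):
    """Largest activation in a 2D map (list of equal-length rows)."""
    return max(max(row) for row in fmap)


def global_avg_pool(fmap):
    """Mean activation over a 2D map."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def pyramid_averaged_max_pool(fmap, levels=(1, 2, 4)):
    """Assumed sketch: average of per-level means of grid-cell maxima.

    Level g splits the map into a g x g grid. With g = 1 this is global
    max pooling; when g equals the map size, each cell is a single pixel
    and the level reduces to global average pooling.
    """
    height, width = len(fmap), len(fmap[0])
    level_means = []
    for g in levels:
        if g > height or g > width:
            raise ValueError("grid size exceeds the map size")
        cell_maxima = []
        for i in range(g):
            r0, r1 = i * height // g, (i + 1) * height // g
            for j in range(g):
                c0, c1 = j * width // g, (j + 1) * width // g
                cell_maxima.append(
                    max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1))
                )
        level_means.append(sum(cell_maxima) / len(cell_maxima))
    return sum(level_means) / len(level_means)


# A 4 x 4 activation map with a single active location.
activation_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
pooled_max = global_max_pool(activation_map)
pooled_avg = global_avg_pool(activation_map)
pooled_pyramid = pyramid_averaged_max_pool(activation_map)
```

Under these assumptions the pooled value always lies between the global average and the global maximum, because every cell maximum is at least the cell mean and at most the global maximum.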

Session

Posters

Files

Paper (PDF)
Supplementary (PDF)

DOI

10.5244/C.31.36
https://dx.doi.org/10.5244/C.31.36

Citation

Amogh Gudi, Nicolai van Rosmalen, Marco Loog and Jan van Gemert. Object-Extent Pooling for Weakly Supervised Single-Shot Localization. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 36.1-36.12. BMVA Press, September 2017.

Bibtex

            @inproceedings{BMVC2017_36,
                title={Object-Extent Pooling for Weakly Supervised Single-Shot Localization},
                author={Amogh Gudi and Nicolai van Rosmalen and Marco Loog and Jan van Gemert},
                year={2017},
                month={September},
                pages={36.1-36.12},
                articleno={36},
                numpages={12},
                booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
                publisher={BMVA Press},
                editor={Tae-Kyun Kim and Stefanos Zafeiriou and Gabriel Brostow and Krystian Mikolajczyk},
                doi={10.5244/C.31.36},
                isbn={1-901725-60-X},
                url={https://dx.doi.org/10.5244/C.31.36}
            }