AutoScaler: Scale-Attention Networks for Visual Correspondence

Shenlong Wang, Linjie Luo, Ning Zhang and Jia Li

Abstract

Finding visual correspondence between local features is key to many computer vision problems. While defining features with larger contextual scales usually implies greater discriminativeness, it could also lead to less spatial accuracy of the features. We propose AutoScaler, a scale-attention network to explicitly optimize this trade-off in visual correspondence tasks. Our architecture consists of a weight-sharing feature network to compute multi-scale feature maps and an attention network to combine them optimally in the scale space. This allows our network to have adaptive sizes of equivalent receptive field over different scales of the input. The entire network can be trained end-to-end in a Siamese framework for visual correspondence tasks. Using the latest off-the-shelf architecture for the feature network, our method achieves competitive results compared to state-of-the-art methods on challenging optical flow and semantic matching benchmarks, including Sintel, KITTI and CUB-2011. We also show that our attention network alone can be applied to existing hand-crafted feature descriptors (e.g., Daisy) and improve their performance on visual correspondence tasks.
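
The scale-space combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the attention weights are softmax-normalized per pixel and that the per-scale feature maps have already been resampled to a common resolution, neither of which the abstract specifies. The names `softmax` and `fuse_scales` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scales(features_per_scale, attention_logits):
    """Blend one pixel's per-scale feature vectors using attention weights.

    features_per_scale: list of S feature vectors (each a list of D floats),
        one per scale, assumed to come from the same weight-sharing network.
    attention_logits: list of S raw scores for this pixel, standing in for
        the output of an attention network (assumption for illustration).
    Returns the fused D-dimensional feature and the S attention weights.
    """
    if len(features_per_scale) != len(attention_logits):
        raise ValueError("need exactly one attention logit per scale")
    weights = softmax(attention_logits)
    dim = len(features_per_scale[0])
    fused = [0.0] * dim
    for weight, feature in zip(weights, features_per_scale):
        for d in range(dim):
            fused[d] += weight * feature[d]
    return fused, weights


# Two scales with equal logits: the fused feature is the plain average.
fused, weights = fuse_scales([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

When one scale's logit dominates, the fused feature approaches that scale's feature, which is how a per-pixel weighting of this kind can favor a small or large context depending on image content.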

Session

Orals - Matching

Files

Paper (PDF)

DOI

10.5244/C.31.185
https://dx.doi.org/10.5244/C.31.185

Citation

Shenlong Wang, Linjie Luo, Ning Zhang and Jia Li. AutoScaler: Scale-Attention Networks for Visual Correspondence. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 185.1-185.13. BMVA Press, September 2017.

Bibtex

            @inproceedings{BMVC2017_185,
                title={AutoScaler: Scale-Attention Networks for Visual Correspondence},
                author={Shenlong Wang and Linjie Luo and Ning Zhang and Jia Li},
                year={2017},
                month={September},
                pages={185.1--185.13},
                articleno={185},
                numpages={13},
                booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
                publisher={BMVA Press},
                editor={Tae-Kyun Kim and Stefanos Zafeiriou and Gabriel Brostow and Krystian Mikolajczyk},
                doi={10.5244/C.31.185},
                isbn={1-901725-60-X},
                url={https://dx.doi.org/10.5244/C.31.185}
            }