Deep Fishing: Gradient Features from Deep Nets

Albert Gordo, Adrien Gaidon and Florent Perronnin

Abstract

Convolutional Networks (ConvNets) have recently improved image recognition performance thanks to end-to-end learning of deep feed-forward models from raw pixels. Deep learning is a marked departure from the previous state of the art, the Fisher Vector (FV), which relied on gradient-based encoding of local hand-crafted features. In this paper, we discuss a novel connection between these two approaches. First, we show that one can derive gradient representations from ConvNets in a similar fashion to the FV. Second, we show that this gradient representation actually corresponds to a structured matrix that allows for efficient similarity computation. We experimentally study the benefits of transferring this representation over the outputs of ConvNet layers, and find consistent improvements on the Pascal VOC 2007 and 2012 datasets.
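The structured-matrix claim can be illustrated with a minimal sketch. It assumes a single fully connected layer y = Wx with no bias, which may differ from the exact formulation in the paper. Under that assumption, the gradient of a loss with respect to W is the outer product of the backpropagated signal and the layer input, so the inner product of two such gradients factors into two small dot products. The function names and the dimensions 512 and 64 are hypothetical and chosen for illustration only.

```python
import numpy as np


def fc_weight_gradient(x, delta):
    """Gradient of a loss w.r.t. W for y = W @ x, given delta = dLoss/dy.

    Assumption: a plain fully connected layer, so the gradient is the
    rank-1 matrix delta x^T of shape (len(delta), len(x)).
    """
    return np.outer(delta, x)


def gradient_similarity_naive(x1, d1, x2, d2):
    """Frobenius inner product computed on the explicit gradient matrices."""
    g1 = fc_weight_gradient(x1, d1)
    g2 = fc_weight_gradient(x2, d2)
    return float(np.sum(g1 * g2))


def gradient_similarity_fast(x1, d1, x2, d2):
    """Same value without forming the matrices:
    <d1 x1^T, d2 x2^T> = (x1 . x2) * (d1 . d2).
    """
    return float(np.dot(x1, x2) * np.dot(d1, d2))


# Hypothetical sizes: 512-dimensional layer input, 64-dimensional output.
rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(512), rng.standard_normal(512)
d1, d2 = rng.standard_normal(64), rng.standard_normal(64)

naive = gradient_similarity_naive(x1, d1, x2, d2)
fast = gradient_similarity_fast(x1, d1, x2, d2)
```

In this sketch the naive version touches 512 × 64 matrix entries per image pair, while the factored version needs only two dot products over 512 and 64 values, and the two results agree up to floating-point error.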

Session

Poster 2

Files

Extended Abstract (PDF, 1736K)
Paper (PDF, 1915K)

DOI

10.5244/C.29.111
https://dx.doi.org/10.5244/C.29.111

Citation

Albert Gordo, Adrien Gaidon and Florent Perronnin. Deep Fishing: Gradient Features from Deep Nets. In Xianghua Xie, Mark W. Jones, and Gary K. L. Tam, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 111.1-111.12. BMVA Press, September 2015.

Bibtex

@inproceedings{BMVC2015_111,
	title={Deep Fishing: Gradient Features from Deep Nets},
	author={Albert Gordo and Adrien Gaidon and Florent Perronnin},
	year={2015},
	month={September},
	pages={111.1-111.12},
	articleno={111},
	numpages={12},
	booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
	publisher={BMVA Press},
	editor={Xianghua Xie and Mark W. Jones and Gary K. L. Tam},
	doi={10.5244/C.29.111},
	isbn={1-901725-53-7},
	url={https://dx.doi.org/10.5244/C.29.111}
}