Learning Discriminative Visual N-grams from Mid-level Image Features

Raj Kumar Gupta, Megha Pandey and Alex YS Chia

Abstract

The task of image classification is one of the key problems in computer vision, and has inspired a variety of image representations. In this paper, we propose a method to learn discriminative combinations of mid-level visual elements that capture their spatial configurations and co-occurrence relationships. We term such combinations visual n-grams. Our method is capable of learning combinations with different numbers of elements. Experiments conducted on multiple datasets demonstrate the effectiveness of our approach, which achieves high image classification accuracy. Further, by fusing our features with global image features, we outperform state-of-the-art results.
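To make the notion of a visual n-gram concrete, the toy sketch below illustrates the general idea of pairing co-occurring mid-level element detections with a coarse spatial relation and ranking the resulting combinations by a simple class-contrast score. This is not the paper's algorithm; the detection format, the spatial quantization, and the scoring heuristic are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy detections: each image is a list of
# (element_id, x, y) tuples for fired mid-level elements.

def spatial_relation(a, b):
    """Coarsely quantize the displacement from detection a to b."""
    dx, dy = b[1] - a[1], b[2] - a[2]
    return ("right" if dx >= 0 else "left",
            "below" if dy >= 0 else "above")

def visual_ngrams(detections, n=2):
    """Enumerate n-grams: element-id tuples plus pairwise spatial relations."""
    grams = []
    for combo in combinations(sorted(detections), n):
        ids = tuple(d[0] for d in combo)
        rels = tuple(spatial_relation(a, b) for a, b in zip(combo, combo[1:]))
        grams.append((ids, rels))
    return grams

def discriminative_ngrams(images_by_class, n=2, top_k=3):
    """Rank n-grams by how strongly their counts differ across classes."""
    counts = {c: Counter(g for img in imgs for g in visual_ngrams(img, n))
              for c, imgs in images_by_class.items()}
    all_grams = set().union(*counts.values())
    # Simple contrast score: spread of per-class occurrence counts.
    scores = {g: max(cnt[g] for cnt in counts.values()) -
                 min(cnt[g] for cnt in counts.values())
              for g in all_grams}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

Varying `n` yields combinations with different numbers of elements, matching the abstract's claim that the representation is not restricted to pairs.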

Session

Poster 1

Files

Extended Abstract (PDF, 298K)
Paper (PDF, 4M)

DOI

10.5244/C.29.47
https://dx.doi.org/10.5244/C.29.47

Citation

Raj Kumar Gupta, Megha Pandey and Alex YS Chia. Learning Discriminative Visual N-grams from Mid-level Image Features. In Xianghua Xie, Mark W. Jones, and Gary K. L. Tam, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 47.1-47.12. BMVA Press, September 2015.

Bibtex

@inproceedings{BMVC2015_47,
	title={Learning Discriminative Visual N-grams from Mid-level Image Features},
	author={Raj Kumar Gupta and Megha Pandey and Alex YS Chia},
	year={2015},
	month={September},
	pages={47.1--47.12},
	articleno={47},
	numpages={12},
	booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
	publisher={BMVA Press},
	editor={Xianghua Xie and Mark W. Jones and Gary K. L. Tam},
	doi={10.5244/C.29.47},
	isbn={1-901725-53-7},
	url={https://dx.doi.org/10.5244/C.29.47}
}