Attribute Embedding with Visual-Semantic Ambiguity Removal for Zero-shot Learning

Yang Long, Li Liu and Ling Shao

Abstract

Conventional zero-shot learning (ZSL) methods recognise an unseen instance by projecting its visual features to a semantic space that is shared by both seen and unseen categories. However, we observe that such a one-way paradigm suffers from the visual-semantic ambiguity problem. Namely, the semantic concepts (e.g. attributes) cannot explicitly correspond to visual patterns, and vice versa. Such a problem can lead to a huge variance in the visual features for each attribute. In this paper, we investigate how to remove such semantic ambiguity based on the observed visual appearances. In particular, we propose (1) a novel latent attribute space to mitigate the gap between visual appearances and semantic expressions; (2) a dual-graph regularised embedding algorithm called Visual-Semantic Ambiguity Removal (VSAR) that can simultaneously extract the shared components between visual and semantic information and mutually align the data distribution based on the intrinsic local structures of both spaces; (3) a new zero-shot recognition framework that can deal with both instance-level and category-level ZSL tasks. We validate our method on two popular zero-shot learning datasets, AwA and aPY. Extensive experiments demonstrate that our proposed approach significantly outperforms the state-of-the-art methods.

Session

Posters 1

Files

Extended Abstract (PDF, 4M)

Paper (PDF, 7M)

DOI

10.5244/C.30.40
https://dx.doi.org/10.5244/C.30.40

Citation

Yang Long, Li Liu and Ling Shao. Attribute Embedding with Visual-Semantic Ambiguity Removal for Zero-shot Learning. In Richard C. Wilson, Edwin R. Hancock and William A. P. Smith, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 40.1-40.11. BMVA Press, September 2016.

Bibtex

        @inproceedings{BMVC2016_40,
        	title={Attribute Embedding with Visual-Semantic Ambiguity Removal for Zero-shot Learning},
        	author={Yang Long and Li Liu and Ling Shao},
        	year={2016},
        	month={September},
        	pages={40.1--40.11},
        	articleno={40},
        	numpages={11},
        	booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
        	publisher={BMVA Press},
        	editor={Richard C. Wilson and Edwin R. Hancock and William A. P. Smith},
        	doi={10.5244/C.30.40},
        	isbn={1-901725-59-6},
        	url={https://dx.doi.org/10.5244/C.30.40}
        }