Multiple Instance Visual-Semantic Embedding

Zhou Ren, Hailin Jin, Zhe Lin, Chen Fang and Alan Yuille

Abstract

Visual-semantic embedding models have recently been proposed and shown to be effective for image classification and zero-shot learning. The key idea is that by directly learning a mapping from images into a semantic label space, the algorithm can generalize to a large number of unseen labels. However, existing approaches are limited to single-label embedding; handling images with multiple labels remains an open problem, mainly due to the complex underlying correspondence between an image and its labels. In this work, we present a novel Multiple Instance Visual-Semantic Embedding (MIVSE) model for multi-label images. Instead of embedding a whole image into the semantic space, our model characterizes the subregion-to-label correspondence, discovering semantically meaningful image subregions and mapping them to their corresponding labels.
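The subregion-to-label idea can be illustrated with a small sketch. This is not the paper's formulation: the abstract does not specify the scoring function, so the code assumes that each label is scored by its best-matching subregion under cosine similarity, a common multiple-instance choice. The function names, the toy 2-D embeddings, and the 0.9 threshold are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def score_labels(region_embeddings, label_embeddings):
    """Assumed multiple-instance scoring: a label's score is the similarity
    of its single best-matching subregion. Returns {label: (score, region_index)}."""
    results = {}
    for label, label_vec in label_embeddings.items():
        sims = [cosine(region, label_vec) for region in region_embeddings]
        best = max(range(len(sims)), key=sims.__getitem__)
        results[label] = (sims[best], best)
    return results

# Toy data: three subregion embeddings and three label embeddings (hypothetical).
regions = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
labels = {"dog": [0.9, 0.1], "grass": [0.1, 0.9], "car": [-1.0, -1.0]}

scores = score_labels(regions, labels)
# Keep labels whose best subregion is similar enough (threshold is an assumption).
predicted = sorted(name for name, (score, _) in scores.items() if score > 0.9)
```

In this toy setup, "dog" and "grass" are each matched to a different subregion, while "car" has no well-matching subregion and is rejected. A whole-image embedding would have to place one vector near both "dog" and "grass" at once, which is the difficulty the abstract describes.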

Session

Orals - Scene Understanding

Files

Paper (PDF)

Supplementary (PDF)

DOI

10.5244/C.31.89
https://dx.doi.org/10.5244/C.31.89

Citation

Zhou Ren, Hailin Jin, Zhe Lin, Chen Fang and Alan Yuille. Multiple Instance Visual-Semantic Embedding. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 89.1-89.12. BMVA Press, September 2017.

Bibtex

@inproceedings{BMVC2017_89,
    title={Multiple Instance Visual-Semantic Embedding},
    author={Zhou Ren and Hailin Jin and Zhe Lin and Chen Fang and Alan Yuille},
    year={2017},
    month={September},
    pages={89.1--89.12},
    articleno={89},
    numpages={12},
    booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
    publisher={BMVA Press},
    editor={Tae-Kyun Kim and Stefanos Zafeiriou and Gabriel Brostow and Krystian Mikolajczyk},
    doi={10.5244/C.31.89},
    isbn={1-901725-60-X},
    url={https://dx.doi.org/10.5244/C.31.89}
}