SilNet : Single- and Multi-View Reconstruction by Learning from Silhouettes
Olivia Wiles and Andrew Zisserman
Abstract
The objective of this paper is 3D shape understanding from single and multiple images. To this end, we introduce a new deep-learning architecture and loss function, SilNet, that can handle multiple views in an order-agnostic manner. The architecture is fully
convolutional, and for training we use a proxy task of silhouette prediction, rather than
directly learning a mapping from 2D images to 3D shape as has been the target in most
recent work.
We demonstrate that the SilNet architecture generalises over the number of views – for example, SilNet trained on 2 views can be used with 3 or 4 views at test time – and that performance improves as more views are added.
We introduce two new synthetic datasets: a blobby object dataset useful for pre-training, and a challenging and realistic sculpture dataset; and we demonstrate on these
datasets that SilNet has indeed learnt 3D shape.
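The order-agnostic property means the network must fuse per-view features with a function that is invariant to the order (and, ideally, the number) of input views. As a minimal sketch of this idea – not the paper's actual implementation – the example below uses element-wise max-pooling over per-view feature vectors, one common choice for such symmetric fusion; the names `encode_view` and `fuse_views`, and the random-projection "encoder", are illustrative stand-ins:

```python
import numpy as np

def encode_view(image):
    # Stand-in encoder: a fixed random projection of the flattened image.
    # (A real system would use a convolutional encoder here.)
    flat = image.reshape(-1)
    proj = np.random.default_rng(42).standard_normal((64, flat.size))
    return proj @ flat

def fuse_views(images):
    # Element-wise max over per-view features: the result is identical
    # under any permutation of the views, and the same function accepts
    # 1, 2, or more views without architectural changes.
    feats = np.stack([encode_view(im) for im in images])
    return feats.max(axis=0)

rng = np.random.default_rng(0)
views = [rng.standard_normal((32, 32)) for _ in range(3)]

fused = fuse_views(views)
fused_reversed = fuse_views(views[::-1])
assert np.allclose(fused, fused_reversed)  # order of views does not matter
```

Because the pooled feature has a fixed size regardless of the view count, the same downstream decoder can be applied whether the network is given one view or several, which is what allows training on 2 views and testing on 3 or 4.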
Session
Spotlights
Files
Paper (PDF)
Supplementary (PDF)
Video (MP4)
DOI
10.5244/C.31.99
https://dx.doi.org/10.5244/C.31.99
Citation
Olivia Wiles and Andrew Zisserman. SilNet : Single- and Multi-View Reconstruction by Learning from Silhouettes. In T.K. Kim, S. Zafeiriou, G. Brostow and K. Mikolajczyk, editors, Proceedings of the British Machine Vision Conference (BMVC), pages 99.1-99.13. BMVA Press, September 2017.
Bibtex
@inproceedings{BMVC2017_99,
title={SilNet : Single- and Multi-View Reconstruction by Learning from Silhouettes},
author={Olivia Wiles and Andrew Zisserman},
year={2017},
month={September},
pages={99.1-99.13},
articleno={99},
numpages={13},
booktitle={Proceedings of the British Machine Vision Conference (BMVC)},
publisher={BMVA Press},
editor={Tae-Kyun Kim and Stefanos Zafeiriou and Gabriel Brostow and Krystian Mikolajczyk},
doi={10.5244/C.31.99},
isbn={1-901725-60-X},
url={https://dx.doi.org/10.5244/C.31.99}
}