BMVA 
The British Machine Vision Association and Society for Pattern Recognition 

BibTeX entry

@PHDTHESIS{200608Alessio_Del_Bue,
  AUTHOR={Alessio Del Bue},
  TITLE={Deformable 3-D Modelling from Uncalibrated Video Sequences},
  SCHOOL={University of London},
  MONTH=Aug,
  YEAR=2006,
  URL={http://www.bmva.org/theses/2006/2006-delbue.pdf},
}

Abstract

The rigidity of a scene observed by a camera is often the fundamental assumption used to infer 3-D information automatically from the images taken by that camera. However, a video sequence of a natural scene often contains objects that modify their topology (for instance, a smiling face or a beating heart), thus violating the rigidity assumption necessary to reconstruct the 3-D structure of the object. In this thesis, we address the challenging problem of recovering the 3-D model of a deforming object and the motion of the camera observing it purely from image sequences, when nothing is known in advance about the observed object, the internal parameters of the camera or its motion. Previous solutions to this non-rigid structure-from-motion problem have either provided approximate solutions using linear approaches to a problem that is intrinsically non-linear or required strong assumptions about the nature of the 3-D deformations. In this thesis, we propose a non-linear framework based on bundle adjustment to estimate model and camera parameters. We then upgrade the proposed framework to deal with the case of a stereo camera setup. We show that when the deforming object is not performing a significant overall rigid motion, a monocular approach leads to poor reconstructions, and only by fusing the information from both cameras can the correct 3-D shape be extracted. However, the problem of 3-D reconstruction of deformable objects is still fundamentally ambiguous: given a specific camera motion, different non-rigid shapes can be found that fit the observed 2-D image data. In order to reduce this effect, we introduce shape priors based on the observation that often not all the points on a deforming object are moving non-rigidly but some tend to lie on rigid parts of the structure. First, we propose motion segmentation algorithms to divide the scene automatically into the rigid and non-rigid point sets. Secondly, we use this information to provide priors on the degree of deformability of each point. Crucially, all the above methods only work under the assumption of orthographic viewing conditions. Perhaps the most valuable contribution of this thesis is to provide a new algorithm to obtain metric reconstructions of deformable objects observed by a perspective camera.