BibTeX entry
@PHDTHESIS{200803Pawan_Mudigonda,
AUTHOR={Pawan Mudigonda},
TITLE={Combinatorial and Convex Optimization for Probabilistic
Models in Computer Vision},
SCHOOL={Oxford Brookes University},
MONTH=mar,
YEAR=2008,
URL={http://www.bmva.org/theses/2008/2008-mudigonda.pdf},
}
Abstract
This thesis investigates the role of optimization in two areas of Computer Science: Computer Vision and Machine Learning. Specifically, we consider two well-known problems in Computer Vision, namely motion segmentation and object category specific image segmentation, and a fundamental problem in Machine Learning, known as maximum a posteriori (MAP) estimation of discrete probabilistic models.

To address the problem of motion segmentation, we propose a novel probabilistic model which is suitable for this application. Our model accounts for the effects of occlusion, lighting changes, and motion blur. The segmentation of a given video is obtained by performing efficient inference on the probabilistic model in two stages: (i) in the first stage, an initial estimate of the model is obtained using a novel coarse-to-fine technique which reduces the time and memory required by the sum-product belief propagation algorithm; and (ii) in the second stage, the initial estimate is refined using the observation that the energy of the model can be efficiently reduced using the α-β-swap and α-expansion algorithms.

For object category specific image segmentation, we extend the probabilistic model used in previous approaches. Specifically, we combine it with an object category model which provides top-down information about the shape of the object. Given an image, its segmentation is determined using two new algorithmic contributions: (i) we propose efficient methods for obtaining samples of the object category models of our choice by matching them to the given image; and (ii) we make the (not obvious) observation that these samples can be quickly marginalized within the EM framework using a single st-mincut operation. We compare our method with state-of-the-art approaches and demonstrate significant improvement.

Next, we present a theoretical analysis of previously proposed algorithms for MAP estimation which are based on convex relaxations.
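The single st-mincut operation mentioned above rests on a standard equivalence: minimizing a binary submodular pairwise energy is the same as finding a minimum s-t cut in a suitably constructed graph. The sketch below is a minimal, self-contained illustration of that construction on a toy 3-pixel chain with made-up energies; it is not the thesis's actual segmentation model, and it uses a simple Edmonds-Karp max-flow rather than the specialized solvers used in practice.

```python
from collections import deque

# Toy binary pairwise energy (illustrative numbers, not from the thesis).
# unary[i][l]: cost of giving pixel i label l; edges: (i, j, Potts weight).
unary = [[0.0, 2.0], [3.0, 1.0], [1.0, 4.0]]
edges = [(0, 1, 1.5), (1, 2, 1.5)]

def energy(labels):
    e = sum(unary[i][l] for i, l in enumerate(labels))
    return e + sum(w for i, j, w in edges if labels[i] != labels[j])

def maxflow(cap, s, t):
    """Edmonds-Karp max-flow on a dense capacity matrix; returns (flow, side)."""
    n = len(cap)
    flow = 0.0
    while True:
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q and parent[t] == -1:  # BFS for a shortest augmenting path
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:
            break  # no augmenting path left: flow is maximal
        bottleneck, v = float("inf"), t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:  # push flow along the path, update residuals
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck
    # Source side of the min cut = nodes still reachable in the residual graph.
    side = [False] * n
    side[s] = True
    q = deque([s])
    while q:
        u = q.popleft()
        for v in range(n):
            if not side[v] and cap[u][v] > 1e-12:
                side[v] = True
                q.append(v)
    return flow, side

n = len(unary)
s, t = n, n + 1  # auxiliary source and sink nodes
cap = [[0.0] * (n + 2) for _ in range(n + 2)]
for i, (c0, c1) in enumerate(unary):
    cap[s][i] = c1  # cutting s->i puts pixel i on the sink side (label 1)
    cap[i][t] = c0  # cutting i->t puts pixel i on the source side (label 0)
for i, j, w in edges:
    cap[i][j] = cap[j][i] = w  # cut exactly once when neighbours disagree

flow, side = maxflow(cap, s, t)
labels = [0 if side[i] else 1 for i in range(n)]
# The min-cut value equals the minimum energy: energy(labels) == flow.
```

By construction, every s-t cut's capacity equals the energy of the corresponding labelling, so the minimum cut yields the global MAP labelling for this binary model.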
In particular, we show that a widely used linear programming (LP) relaxation strictly dominates (i.e., provides a better approximation than) some recently proposed quadratic programming (QP) and second-order cone programming (SOCP) relaxations. We generalize this result to show that, despite the flexibility in the form of objective function and constraints offered by QP and SOCP, the LP relaxation dominates a large class of QP and SOCP relaxations. As a consequence of our analysis, we obtain two new SOCP relaxations which dominate the previous approaches.

Finally, we consider the problem of efficiently solving the new SOCP relaxations. To this end, we build upon the tree reweighted message passing framework. We propose convergent algorithms which iteratively optimize the Lagrangian dual of the SOCP relaxations. These algorithms allow us to empirically verify our theoretical analysis using both synthetic and real data.
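The tree reweighted framework mentioned above decomposes a loopy model into tree-structured subproblems, on which MAP estimation is exact and cheap. As a hedged, self-contained illustration of that tree building block (with made-up energies, not numbers from the thesis), the following sketch computes the exact MAP labelling of a 3-node chain by min-sum dynamic programming:

```python
# Exact MAP on a chain via min-sum dynamic programming (Viterbi-style).
# All energies below are illustrative; a Potts penalty couples neighbours.
L = 3  # number of labels per node
unary = [
    [1.0, 0.5, 2.0],
    [0.0, 1.5, 1.0],
    [2.0, 0.2, 0.3],
]
pairwise = [[0.0 if a == b else 1.0 for b in range(L)] for a in range(L)]

def chain_map(unary, pairwise):
    n = len(unary)
    # Forward pass: cost[i][l] = best energy of nodes 0..i ending in label l.
    cost = [unary[0][:]]
    back = []
    for i in range(1, n):
        row, ptr = [], []
        for l in range(L):
            best = min(range(L), key=lambda k: cost[-1][k] + pairwise[k][l])
            row.append(cost[-1][best] + pairwise[best][l] + unary[i][l])
            ptr.append(best)
        cost.append(row)
        back.append(ptr)
    # Backward pass: follow the backpointers to recover the argmin labelling.
    l = min(range(L), key=lambda k: cost[-1][k])
    labels = [l]
    for ptr in reversed(back):
        l = ptr[l]
        labels.append(l)
    return list(reversed(labels)), min(cost[-1])

labels, value = chain_map(unary, pairwise)
```

On a tree, two such sweeps give the exact optimum; tree reweighted message passing combines many such tree computations to bound and optimize the Lagrangian dual of relaxations of the loopy problem.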