BMVA 
The British Machine Vision Association and Society for Pattern Recognition 

BibTeX entry

@PHDTHESIS{201207Patrick_Ott,
  AUTHOR={Patrick Ott},
  TITLE={Segmentation Features, Visibility Modeling and Shared Parts
    for Object Detection},
  SCHOOL={University of Leeds},
  MONTH=Jul,
  YEAR=2012,
  URL={http://www.bmva.org/theses/2012/2012-ott.pdf},
}

Abstract

This thesis investigates the problem of object localization in still images and is separated into three individual parts. The first part proposes a new set of feature descriptors, motivated by the problem of pedestrian detection. Sliding window classifiers, notably those using the Histogram of Oriented Gradients (HOG) features proposed by Dalal & Triggs, are the state of the art for this task, and we base our method on this approach. We propose a novel feature extraction scheme which computes implicit ‘soft segmentations’ of image regions into foreground/background. The method yields stronger object/background edges than the gray-scale gradient alone, suppresses textural and shading variations, and captures local coherence of object appearance. The main contributions of this part are: (i) incorporation of segmentation cues into object detection; (ii) integration with classifier learning, as opposed to a post-processing filter; and (iii) high computational efficiency.
The second part of the thesis considers deformable part-based models (DPM) as proposed by Felzenszwalb et al. These models have demonstrated state-of-the-art results in object localization and offer a high degree of learnt invariance by utilizing viewpoint-dependent mixture components and movable parts in each mixture component. One might hope to increase the accuracy of the DPM by increasing the number of mixture components and parts to give a more faithful model, but limited training data prevents this from being effective. We propose an extension to the DPM which allows object part models to be shared among multiple mixture components as well as object classes. This results in more compact models and allows training examples to be shared by multiple components, ameliorating the effect of a limited-size training set. We (i) reformulate the DPM to incorporate part sharing, and (ii) propose a novel energy function allowing for coupled training of mixture components and object classes.
An ‘elephant in the room’ for most current methods is the lack of explicit modeling of partial visibility due to occlusion by other objects or truncation by the image boundary. In the third part of this thesis, we propose a method which explicitly models partial visibility by treating it as a latent variable. As a second contribution, we propose a novel non-maximum suppression scheme which takes into account partial visibility of objects while, in contrast to other methods, providing a globally optimal solution. Our method gives more detailed scene interpretations, in that we are able to identify the visible parts of an object. We evaluate all methods on the PASCAL VOC 2010 dataset. In addition, we report state-of-the-art results on the INRIAPerson pedestrian detection dataset for the first part, considerably exceeding those of the original HOG detector.
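
For context, the conventional greedy non-maximum suppression that detectors commonly apply can be sketched as below. This is the standard greedy baseline, not the globally optimal, visibility-aware scheme proposed in the thesis. The function names, the corner-based box format and the 0.5 overlap threshold are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,
) -> List[int]:
    """Return indices of kept detections, highest score first.

    A detection is kept only if it does not overlap any already-kept,
    higher-scoring detection by more than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))  # prints [0, 2]
```

Because each decision is made locally in score order, a greedy procedure like this can discard a partially occluded object that overlaps a stronger detection, which is the kind of failure a visibility-aware formulation is meant to address.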