BMVA 
The British Machine Vision Association and Society for Pattern Recognition 

BibTeX entry

@PhdThesis{200611Georg_Klein,
  author = {Georg Klein},
  title  = {Visual Tracking for Augmented Reality},
  school = {University of Cambridge},
  month  = nov,
  year   = {2006},
  url    = {http://www.bmva.org/theses/2006/2006-klein.pdf},
}

Abstract

In Augmented Reality applications, the real environment is annotated or enhanced with computer-generated graphics. These graphics must be exactly registered to real objects in the scene, and this requires AR systems to track a user's viewpoint. This thesis shows that visual tracking with inexpensive cameras (such as those now often built into mobile computing devices) can be sufficiently robust and accurate for AR applications. Visual tracking has previously been applied to AR, but only with artificial markers placed in the scene; this is undesirable, and this thesis shows that it is no longer necessary. To address the demanding tracking needs of AR, two specific AR formats are considered. Firstly, for a head-mounted display, a markerless tracker which is robust to rapid head motions is presented. This robustness is achieved by combining visual measurements with those of head-worn inertial sensors. A novel sensor fusion approach allows not only pose prediction, but also enables the tracking of video with unprecedented levels of motion blur. Secondly, the tablet PC is proposed as a user-friendly AR medium. For this device, tracking combines inside-out edge tracking with outside-in tracking of tablet-mounted LEDs. Through the external fusion of these complementary sensors, accurate and robust tracking is achieved within a modest computing budget. This allows further visual analysis of the occlusion boundaries between real and virtual objects and a marked improvement in the quality of augmentations. Finally, this thesis shows that not only can tracking be made resilient to motion blur, it can benefit from it. By exploiting the directional nature of motion blur, camera rotations can be extracted from individual blurred frames. The extreme efficiency of the proposed method makes it a viable drop-in replacement for inertial sensors.