Next: 2D affine reconstruction Up: Vision for Longitudinal Vehicle Previous: 2 Visual Tracking

3 Fixation and Scene Reconstruction

In [15] a fixation technique was described that allows a single ``fixation point'' to be chosen from a cluster of tracked features, in a way that is robust to losing track of individual features while allowing the same object point to be fixated over time. The fixation point was used to drive the (motorized) camera so as to maintain the view direction at a desired location on the object. The other important idea in [15] is that reconstructing the shape of a tracked object can be built into the fixation process in a cooperative way, rather than being a separate, higher-level process. The motion and shape of the object were recovered up to a 2D (or 3D) affine transformation from each set of two (or three) consecutive images, using the measurement-matrix factorization algorithm described in [18]. The motion parameters were then used to transfer a chosen fixation point from the previous image(s) to the latest one. This algorithm was extended to stereo cameras in [5].
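The core operation above, transferring a fixation point between images using recovered affine motion, can be sketched as follows. This is a minimal illustration, not the factorization algorithm of [18]: it simply fits a 2D affine motion (A, t) to matched feature positions by least squares and applies the same motion to the fixation point. The function names and the NumPy formulation are our own.

```python
import numpy as np

def fit_affine(prev_pts, next_pts):
    """Least-squares fit of the 2D affine motion x' = A x + t from matched points.

    prev_pts, next_pts: (n, 2) arrays of corresponding feature positions.
    Requires at least three non-collinear matches.
    """
    n = prev_pts.shape[0]
    # Design matrix [x y 1], so that X @ params = next_pts with params = [A^T; t^T].
    X = np.hstack([prev_pts, np.ones((n, 1))])
    params, *_ = np.linalg.lstsq(X, next_pts, rcond=None)
    A = params[:2].T          # top 2x2 block (transposed) is the linear part
    t = params[2]             # last row is the translation
    return A, t

def transfer_fixation(fix_pt, prev_pts, next_pts):
    """Map the fixation point into the new image using the fitted motion."""
    A, t = fit_affine(prev_pts, next_pts)
    return A @ np.asarray(fix_pt, dtype=float) + t
```

Because the fixation point is carried by the motion fitted to the whole feature cluster, losing or regaining any individual feature between frames does not, by itself, move the fixated object point.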

While we have a simpler problem, with a fixed camera and a 2D scene representation, similar principles apply. The ``fixation point'' here refers to the point chosen as representative of the vehicle for the purpose of estimating its position, and hence its range. 2D fixation transfer may be performed directly from one image to another, since a 2D coordinate frame can be set up from a single image. We improve on the algorithm of Reid & Murray, in whose version the 3D structure computed at each time step is discarded. We have argued in [12] that maintaining scene structure explicitly within a reconstruction algorithm stabilises the computation of motion over time, so a logical extension of the fixation-point transfer algorithm is to update the structure of the tracked object recursively, and to employ the improved motion estimates to perform fixation-point transfer. This is the method we have implemented. The reconstruction technique detailed below may be seen as a version of the Variable State Dimension Filter (VSDF) algorithm [12], specialized to 2D affine scene reconstruction. The VSDF is a general algorithm for visual reconstruction that deals naturally with fragmentary data and combines data from multiple images in a statistically near-optimal manner.

In comparison with the previous system [5] we have improved the robustness of the feature tracker to the extent that outliers are much less likely to occur, because the set of feature matches is forced to be globally consistent [19, 10]. By integrating the VSDF reconstruction algorithm we now have a stable method of transferring the fixation point over a large sequence of images, and the fixation point transfer mechanism is greatly simplified.

In the remainder of the paper, the theory of 2D affine reconstruction and transfer is briefly outlined in Sections 4 and 5. The robust stereo feature matcher and the temporal update scheme are described in Section 6. Finally, off-line results on a stereo sequence of 760 images are presented in Section 7.




Adrian F Clark
Thu Jul 10 21:18:54 BST 1997