The current work may be considered a pilot study for a final longitudinal control system that will combine both laser radar and vision sensors, enabling throttle and brake control to maintain a fixed distance to a lead car. Steering control will be driven by a combination of lane tracking and magnetometers detecting magnetic ``nails'' in the road [17]. In our system, vision can potentially provide higher-bandwidth (30/60 Hz) output than is available from the laser radar system [8]. In this paper we compare the outputs of the two sensors, and consider some fundamental questions concerning the use of vision in this context.
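As an indication of how such range measurements might drive throttle and brake, the following is a minimal sketch of a distance-keeping control law operating at the vision frame rate. The gains, the desired range and the actuator convention are hypothetical illustration values, not those of our system.

\begin{verbatim}
# Minimal sketch of a distance-keeping (longitudinal) control law
# driven by range measurements at the vision frame rate.
# All constants below are hypothetical, not values from our system.
DESIRED_RANGE = 20.0     # desired distance to the lead car, metres (assumed)
KP, KD = 0.5, 0.2        # proportional and derivative gains (assumed)
DT = 1.0 / 30.0          # frame period for 30 Hz vision output

def longitudinal_command(range_now, range_prev):
    """Signed actuator command: positive -> throttle, negative -> brake."""
    error = range_now - DESIRED_RANGE             # too far ahead -> speed up
    closing_rate = (range_now - range_prev) / DT  # rate of range change
    return KP * error + KD * closing_rate
\end{verbatim}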
Our work draws together several recent developments in algorithms for visual tracking.
Our approach is to combine shape reconstruction (2D planar reconstruction rather than the usual 3D) from stereo/motion with motion estimation, using recently developed robust and efficient feature matching methods. While this may seem overblown when the goal is simply to measure range, we argue that quite sophisticated tools are required to track a vehicle effectively over an extended time, i.e. a period measured in minutes rather than frames.
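To indicate what one step of such a tracking loop involves, the sketch below fits a 2D affine transform to feature matches between consecutive frames, using RANSAC for robustness. It assumes the OpenCV library; the matching method and parameter choices are illustrative, not those of our implementation.

\begin{verbatim}
import cv2
import numpy as np

def track_planar_affine(prev_gray, curr_gray, prev_pts):
    """One step of robust planar affine tracking between frames.

    prev_pts: Nx1x2 float32 array of feature locations on the
    vehicle's rear in the previous frame.  Returns the fitted 2x3
    affine transform, its scale factor, and the inlier features.
    """
    # Match features frame-to-frame with pyramidal Lucas-Kanade flow.
    curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, prev_pts, None)
    ok = status.ravel() == 1
    # Fit a 2D affine transform robustly; RANSAC rejects mismatches
    # and background points that slipped into the tracked window.
    A, inliers = cv2.estimateAffine2D(prev_pts[ok], curr_pts[ok],
                                      method=cv2.RANSAC)
    # Under affine projection of a plane, the image of the car scales
    # as 1/range, so the scale of A propagates a range estimate.
    scale = np.sqrt(abs(np.linalg.det(A[:, :2])))
    return A, scale, curr_pts[ok][inliers.ravel() == 1]
\end{verbatim}

Given an initial range $Z_0$ (e.g. from the laser radar), the range at frame $t$ then follows as $Z_t = Z_{t-1}/s_t$, where $s_t$ is the recovered scale.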
We shall assume that the viewed vehicle is an affine projection of a planar object, and we shall not assume any knowledge of the camera calibration for the purpose of tracking the vehicle. Thus we have possibly the simplest available imaging model. The lead vehicle presents its rear end to the cameras with little change in orientation, which might otherwise induce the 3D cues that we are ignoring. The depth relief is also small, since only the rear of the car is visible. The smallest range we consider in our experiments is 10m, at which the rear of a car subtends only a small angle, justifying the assumption of parallel projection from scene to image. The results at the end of this paper will demonstrate that these minimalist assumptions are appropriate.
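To make the parallel-projection argument explicit, consider a standard perspective camera with focal length $f$ viewing a scene point $(X, \bar{Z}+\Delta Z)$, where $\bar{Z}$ is the mean range to the car and $\Delta Z$ the depth relief of its rear:
\[
x \;=\; \frac{fX}{\bar{Z}+\Delta Z}
  \;=\; \frac{fX}{\bar{Z}}\left(1-\frac{\Delta Z}{\bar{Z}}
        +O\!\left(\frac{\Delta Z^{2}}{\bar{Z}^{2}}\right)\right)
  \;\approx\; \frac{f}{\bar{Z}}\,X .
\]
With $\bar{Z}\ge 10$m and the small depth relief of the visible rear, the deviation from the affine (scaled orthographic) model is of order $\Delta Z/\bar{Z}$, i.e. at most a few percent.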