The current work may be considered a pilot study for a final longitudinal control system that will combine both laser radar and vision sensors, enabling throttle and brake control to maintain a fixed distance to a lead car. Steering control will be driven by a combination of lane tracking and magnetometers detecting magnetic ``nails'' in the road [17]. In our system, vision can potentially provide higher-bandwidth (30/60 Hz) output than is available from the laser radar system [8]. In this paper we compare the outputs of the two sensors, and consider some fundamental questions concerning the use of vision in this context.
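As an indication of how such range measurements might drive throttle and brake, the following is a minimal sketch of a distance-keeping control law operating at the vision frame rate. The gains, the desired range and the actuator convention are hypothetical illustration values, not those of our system.

\begin{verbatim}
# Minimal sketch of a distance-keeping (longitudinal) control law
# driven by range measurements at the vision frame rate.
# All constants below are hypothetical, not values from our system.
DESIRED_RANGE = 20.0     # desired distance to the lead car, metres (assumed)
KP, KD = 0.5, 0.2        # proportional and derivative gains (assumed)
DT = 1.0 / 30.0          # frame period for 30 Hz vision output

def longitudinal_command(range_now, range_prev):
    """Signed actuator command: positive -> throttle, negative -> brake."""
    error = range_now - DESIRED_RANGE             # too far ahead -> speed up
    closing_rate = (range_now - range_prev) / DT  # rate of range change
    return KP * error + KD * closing_rate
\end{verbatim}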
Our work draws together several recent developments in algorithms for visual tracking.
Our approach is to combine shape reconstruction (2D planar reconstruction rather than the usual 3D) from stereo/motion with motion estimation, using recently developed robust and efficient feature matching methods. While this may seem overblown when the goal is simply to measure range, we argue that quite sophisticated tools are required to track a vehicle effectively over an extended time, i.e. a period measured in minutes rather than frames.
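To indicate what one step of such a tracking loop involves, the sketch below fits a 2D affine transform to feature matches between consecutive frames, using RANSAC for robustness. It assumes the OpenCV library; the matching method and parameter choices are illustrative, not those of our implementation.

\begin{verbatim}
import cv2
import numpy as np

def track_planar_affine(prev_gray, curr_gray, prev_pts):
    """One step of robust planar affine tracking between frames.

    prev_pts: Nx1x2 float32 array of feature locations on the
    vehicle's rear in the previous frame.  Returns the fitted 2x3
    affine transform, its scale factor, and the inlier features.
    """
    # Match features frame-to-frame with pyramidal Lucas-Kanade flow.
    curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(
        prev_gray, curr_gray, prev_pts, None)
    ok = status.ravel() == 1
    # Fit a 2D affine transform robustly; RANSAC rejects mismatches
    # and background points that slipped into the tracked window.
    A, inliers = cv2.estimateAffine2D(prev_pts[ok], curr_pts[ok],
                                      method=cv2.RANSAC)
    # Under affine projection of a plane, the image of the car scales
    # as 1/range, so the scale of A propagates a range estimate.
    scale = np.sqrt(abs(np.linalg.det(A[:, :2])))
    return A, scale, curr_pts[ok][inliers.ravel() == 1]
\end{verbatim}

Given an initial range $Z_0$ (e.g. from the laser radar), the range at frame $t$ then follows as $Z_t = Z_{t-1}/s_t$, where $s_t$ is the recovered scale.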
We shall assume that the viewed vehicle is an affine projection of a planar object, and we shall not assume any knowledge of the camera calibration for the purpose of tracking the vehicle. Thus we have possibly the simplest available imaging model. The lead vehicle presents its rear end to the cameras with little change in orientation, which might otherwise induce the 3D cues that we are ignoring. The depth relief is also small, since only the rear of the car is visible. The smallest range we consider in our experiments is 10m, at which the rear of a car subtends only a small angle, justifying the assumption of parallel projection from scene to image. The results at the end of this paper will demonstrate that these minimalist assumptions are appropriate.
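To make the parallel-projection argument explicit, consider a standard perspective camera with focal length $f$ viewing a scene point $(X, \bar{Z}+\Delta Z)$, where $\bar{Z}$ is the mean range to the car and $\Delta Z$ the depth relief of its rear:
\[
x \;=\; \frac{fX}{\bar{Z}+\Delta Z}
  \;=\; \frac{fX}{\bar{Z}}\left(1-\frac{\Delta Z}{\bar{Z}}
        +O\!\left(\frac{\Delta Z^{2}}{\bar{Z}^{2}}\right)\right)
  \;\approx\; \frac{f}{\bar{Z}}\,X .
\]
With $\bar{Z}\ge 10$m and the small depth relief of the visible rear, the deviation from the affine (scaled orthographic) model is of order $\Delta Z/\bar{Z}$, i.e. at most a few percent.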