Next: 6 Conclusions Up: Spatio-Temporal Approaches to Computation Previous: 4 Feed Forward Estimation

5 Spatio-Temporal Motion Models

Previous motion estimation techniques have estimated motion using models that encode the motion between a single pair of consecutive images [1, 2, 4, 6]. The method presented here uses a spatio-temporal motion model with explicit temporal components extracted from several consecutive images (or several non-consecutive key frames; see figure 6). This formulation essentially models curved trajectories through time [5], which is not possible with the affine method or block matching alone. Such an approach means that, given one of the images in a sequence, all the others in the sequence that contributed to recovering the motion parameters may be reconstructed. Thus motion following curved trajectories may be modelled by a single set of motion parameters.

One possible model for displacement which includes temporal components is a spatio-temporal affine model, in which the displacement depends on the image position and on the time interval from the current image to some arbitrary time, and the coefficients of the model are the motion parameters.
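As an illustration only, the sketch below assumes one plausible form of such a model: the displacement is affine in image position, and each temporal order k contributes its own affine term weighted by the k-th power of the time interval `tau`. The parameter layout and the names `displacement`, `trajectory` and `PARAMS` are hypothetical and are not taken from the method described here.

```python
def displacement(params, x, y, tau):
    """Displacement (dx, dy) of image point (x, y) after time interval tau.

    Assumed (hypothetical) form: params holds one 6-tuple of affine
    coefficients per temporal order k = 1, 2, ..., and the total
    displacement is the sum over k of tau**k times the affine term.
    """
    dx = dy = 0.0
    for k, (a0, a1, a2, a3, a4, a5) in enumerate(params, start=1):
        weight = tau ** k
        dx += weight * (a0 + a1 * x + a2 * y)
        dy += weight * (a3 + a4 * x + a5 * y)
    return dx, dy


def trajectory(params, x, y, taus):
    """Positions of the point that starts at (x, y), one per time interval."""
    points = []
    for tau in taus:
        dx, dy = displacement(params, x, y, tau)
        points.append((x + dx, y + dy))
    return points


# Made-up parameters: constant velocity in x plus an acceleration in y.
PARAMS = [
    (2.0, 0.0, 0.0, 0.0, 0.0, 0.0),  # first-order (velocity-like) terms
    (0.0, 0.0, 0.0, 0.5, 0.0, 0.0),  # second-order (acceleration-like) terms
]

path = trajectory(PARAMS, 10.0, 20.0, [0, 1, 2, 3])
for point in path:
    print(point)
```

With these made-up values the point moves uniformly in x while its y coordinate grows quadratically, so a single parameter set traces a curved path, which a model fitted to one pair of frames cannot represent.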

A major advantage of the spatio-temporal motion model is its ability to accurately describe longer sequences of images with motion parameters generated from a few key frames. Unlike block matching, it can capture curved trajectories. This can be demonstrated by comparing the motion of two hand-segmented image features in the original and predicted frames of a sequence.

Shoulder and mouth feature points are located by hand in seven PAL-sized frames from the Cathy sequence (figures 5(b) and (d) respectively). The trajectories of these image features are plotted in figures 5(a) and (c). Notice how similar the trajectories in the predicted sequence are to those in the original sequence. Note also that the predicted frames have been generated from a single set of motion parameters for the entire seven-frame sequence.
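The comparison in figure 5 is visual. If a numerical summary were wanted, one simple option (an assumption, not a measure used in this work) is the mean Euclidean distance between corresponding feature positions in the original and predicted frames. The coordinates below are made up for illustration and are not measurements from the Cathy sequence.

```python
import math


def mean_trajectory_error(original, predicted):
    """Mean Euclidean distance between corresponding trajectory points."""
    if not original or len(original) != len(predicted):
        raise ValueError("trajectories must be non-empty and of equal length")
    total = 0.0
    for (ox, oy), (px, py) in zip(original, predicted):
        total += math.hypot(px - ox, py - oy)
    return total / len(original)


# Hypothetical hand-located feature positions, one (x, y) pair per frame.
original_track = [(100.0, 50.0), (103.0, 54.0), (106.0, 60.0)]
predicted_track = [(100.0, 50.0), (103.0, 50.0), (109.0, 64.0)]

error = mean_trajectory_error(original_track, predicted_track)
print(f"mean error: {error:.2f} pixels")
```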

   
Figure 5: Recovering Curved Motions

Applications to Film Effects and Video Compression

Since frame display rates vary among media such as film and video, transferring image sequences from one medium to another usually necessitates changing the number of frames by generating new images or removing existing frames. Some simplistic methods of frame interpolation include inserting copies of existing frames in the sequence, and exploiting the interlaced nature of some formats to combine alternate lines from two consecutive images. The former technique is prone to static moments, in which rapidly moving objects appear stationary for a brief instant, while in the latter, when visual motions are large, the two sets of interlaced lines may separate partially or completely. Using the spatio-temporal motion model described here, optical flow estimates of visual motions may be recovered which permit frames to be interpolated for any real value of the time interval. Other special effects made possible by frame interpolation include varispeeding (arbitrary speeding and slowing of action), motion blur, and motion keying (separation of foreground objects from the background).
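To make the frame-rate conversion concrete, the following sketch works out, for each output frame, which source frame precedes it and what fractional time interval separates the two. It covers only the timing arithmetic; synthesising the image at that interval would be done with the recovered motion and is not shown. The function name `interpolation_plan` and its arguments are hypothetical.

```python
import math
from fractions import Fraction


def interpolation_plan(n_source, source_fps, target_fps):
    """List (source_frame_index, tau) for every output frame.

    tau is the time interval, in units of source frames, from the
    preceding source frame to the output frame. Exact fractions avoid
    rounding drift over long sequences.
    """
    step = Fraction(source_fps, target_fps)  # source frames per output frame
    plan = []
    j = 0
    while j * step <= n_source - 1:
        position = j * step          # output frame time in source-frame units
        key = math.floor(position)   # preceding source frame
        plan.append((key, position - key))
        j += 1
    return plan


# Doubling the frame rate of a three-frame clip.
plan = interpolation_plan(3, 24, 48)
for key, tau in plan:
    print(key, tau)
```

Calling `interpolation_plan(n, 24, 25)` in the same way gives the intervals needed to move from 24 frames per second to 25, where almost every output frame falls at a non-integer position and so has to be interpolated rather than copied.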

It has been shown that the spatio-temporal approach generates more accurate motion estimates than block matching alone, enabling accurate image prediction. Additionally, many frames may be predicted from a single set of motion parameters, as illustrated in figure 6. Video compression methods such as MPEG and ITU-T H.261 transmit differences between the original images of a sequence and the images predicted from motion data (produced using block matching). If motion estimates are more accurate, less data needs to be transmitted, for two reasons: first, the difference images may be compressed by a greater amount, and second, as longer sequences of images can be reconstructed from one set of motion data, the number of original images that need to be transmitted is reduced. As greater transmission rates are possible, higher quality or larger images may be transmitted with no additional time overhead if motion estimates are generated offline, e.g. for video on demand.
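The link between motion accuracy and the size of the difference image can be seen in a toy example. The sketch below predicts one image row from another using a whole-pixel translation, a deliberately simple stand-in for a real motion model, and measures the residual by its sum of absolute differences. It does not reproduce the MPEG or H.261 coding pipelines, and the helper names are hypothetical.

```python
def shift_row(row, dx, fill=0):
    """Predict a row by translating its content dx pixels (positive = right)."""
    n = len(row)
    return [row[i - dx] if 0 <= i - dx < n else fill for i in range(n)]


def residual(original, predicted):
    """Difference image that would be transmitted alongside the motion data."""
    return [o - p for o, p in zip(original, predicted)]


def sad(values):
    """Sum of absolute differences: a rough proxy for residual size."""
    return sum(abs(v) for v in values)


reference = [0, 0, 9, 9, 0, 0, 0, 0]
current = [0, 0, 0, 0, 9, 9, 0, 0]  # the bright feature has moved 2 pixels right

good = residual(current, shift_row(reference, 2))  # accurate motion estimate
poor = residual(current, shift_row(reference, 1))  # motion underestimated

print("accurate estimate:", sad(good))
print("poor estimate:", sad(poor))
```

The accurate estimate leaves an all-zero residual, which compresses to almost nothing, whereas the poor estimate leaves two non-zero entries that would have to be coded and sent.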

   
Figure 6: Motion Computation and Prediction




Graeme Jones
Thu Jul 17 12:40:38 BST 1997