The need to combine pictures into panoramic mosaics has existed since the beginning of photography, as the camera's field of view is always smaller than the human field of view. In addition, large objects often cannot be captured in a single picture, as is the case in aerial photography. Using a wide field of view (fish-eye) lens can be a partial solution, but the images obtained with such a lens have substantial distortions, and capturing an entire scene with the limited resolution of a video camera compromises image quality. A more common solution is photo-mosaicing: aligning and pasting frames in a video sequence, which enables a more complete view. Digital photography enabled new implementations for mosaicing [13, 6, 15], which were first applied to aerial and satellite images, and later used for scene and object representation.
The simplest mosaics are created from a set of images whose mutual displacements are pure image-plane translations. This is approximately the case with some satellite images. Such translations can be computed either by manually pointing to corresponding points or by image correlation methods. Other simple mosaics are created by rotating the camera around its optical center using a special device, creating a panoramic image which represents the projection of the scene onto a cylinder [3, 10, 9, 11]. But the restriction to pure rotation about the optical center limits the applicability of this approach.
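As an illustration of the correlation approach to pure translations, the following minimal sketch searches exhaustively over small integer shifts and keeps the one with the highest normalized cross-correlation over the overlap. The function names `ncc` and `estimate_translation`, the `max_shift` parameter, and the synthetic test frames are assumptions made for this example; they are not taken from the cited methods.

```python
import math
import random


def ncc(p, q):
    """Normalized cross-correlation of two equal-length pixel lists."""
    mp = sum(p) / len(p)
    mq = sum(q) / len(q)
    num = sum((a - mp) * (b - mq) for a, b in zip(p, q))
    den = math.sqrt(sum((a - mp) ** 2 for a in p) * sum((b - mq) ** 2 for b in q))
    return num / den if den > 0 else 0.0


def estimate_translation(a, b, max_shift):
    """Return (dy, dx) such that b[y][x] best matches a[y + dy][x + dx].

    Hypothetical helper: brute-force search over integer shifts only.
    """
    h, w = len(a), len(a[0])
    best, best_score = (0, 0), float("-inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            pa, pb = [], []
            # Visit only pixels where both images are defined for this shift.
            for y in range(max(0, -dy), min(h, h - dy)):
                for x in range(max(0, -dx), min(w, w - dx)):
                    pa.append(a[y + dy][x + dx])
                    pb.append(b[y][x])
            score = ncc(pa, pb)
            if score > best_score:
                best, best_score = (dy, dx), score
    return best


# Synthetic example: two 12x12 crops of one random 16x16 scene.
rng = random.Random(0)
scene = [[rng.randrange(256) for _ in range(16)] for _ in range(16)]
frame_a = [row[2:14] for row in scene[2:14]]
frame_b = [row[0:12] for row in scene[3:15]]  # frame_b[y][x] == frame_a[y + 1][x - 2]
shift = estimate_translation(frame_a, frame_b, 3)
```

By construction the second crop is displaced by one row and minus two columns relative to the first, so the search should recover that shift. A practical implementation would typically use a coarse-to-fine or frequency-domain search rather than this brute-force loop.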
For more general camera motions, which may include both camera translations and small camera rotations, more general transformations for image alignment are used [1, 8, 13, 6]. In these cases images are aligned pairwise, using a parametric transformation such as an affine transformation or a planar-projective transformation. A reference frame is selected, and all images are aligned with this reference frame and combined to create the panoramic mosaic. Significant distortions are created when camera motion includes substantial pan or tilt.
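The alignment to a reference frame described above can be sketched by chaining the pairwise transformations. The sketch below assumes each pairwise transformation is given as a 3x3 matrix (affine or planar-projective) mapping coordinates of frame i+1 into frame i, and that frame 0 is the reference; the names `matmul3`, `chain_to_reference`, and `apply_transform` are hypothetical.

```python
def matmul3(p, q):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def chain_to_reference(pairwise):
    """Compose pairwise transforms into transforms onto frame 0.

    pairwise[i] is assumed to map frame i+1 coordinates into frame i.
    Entry i of the result maps frame i coordinates into frame 0.
    """
    to_ref = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]]
    for h in pairwise:
        to_ref.append(matmul3(to_ref[-1], h))
    return to_ref


def apply_transform(h, x, y):
    """Map the point (x, y) through a 3x3 transform in homogeneous coordinates."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return u / w, v / w


# Two pure translations used as stand-ins for estimated pairwise transforms.
pairwise = [
    [[1.0, 0.0, 10.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    [[1.0, 0.0, 5.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]],
]
to_ref = chain_to_reference(pairwise)
corner = apply_transform(to_ref[2], 0.0, 0.0)  # origin of frame 2 in frame 0
```

Because every frame is warped into the coordinate system of a single reference frame, frames far from the reference accumulate the composed transformation, which is one way to see why substantial pan or tilt leads to large distortions.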
To overcome most restrictions on mosaicing, a new mosaicing methodology is presented, in which images in a video sequence are transformed such that the optical flow between frames becomes parallel, so that they can be easily mosaiced. The mosaic generated this way includes almost all details observed by the moving camera, with each region taken from the image in which it was captured at the highest resolution.
A practical implementation of general mosaicing can be done by a process of collecting strips from image sequences satisfying the following conditions:
Under these conditions, generated mosaics have minimal distortions compared to the original images, as no global scaling is performed. The strip collection process also allows the introduction of a mechanism to overcome the effects of parallax by generating dense intermediate views.
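To make the idea of strip collection concrete, here is a minimal sketch for the simplest possible case. It assumes purely horizontal image motion with known, non-negative integer displacements between consecutive frames, and it takes a vertical strip next to the center column of each frame; these assumptions, and the name `paste_strips`, are choices made for this illustration and do not reproduce the general method. Strips are copied without any scaling.

```python
def paste_strips(frames, shifts):
    """Build a mosaic by pasting one vertical strip per frame.

    frames: list of images (lists of rows) of identical size.
    shifts: shifts[i] is the assumed integer displacement, in pixels,
            between frame i and frame i+1 (camera moving to the right).
    """
    width = len(frames[0][0])
    center = width // 2
    if len(shifts) != len(frames) - 1:
        raise ValueError("need one shift per consecutive frame pair")
    if any(s < 0 or s > center for s in shifts):
        raise ValueError("each shift must lie between 0 and the center column")

    # Start with the left half of the first frame.
    mosaic = [row[:center] for row in frames[0]]
    # From each later frame, append the strip just left of the center
    # column whose width equals the displacement from the previous frame.
    for frame, s in zip(frames[1:], shifts):
        for out_row, row in zip(mosaic, frame):
            out_row.extend(row[center - s:center])
    # Finish with the right half of the last frame.
    for out_row, row in zip(mosaic, frames[-1]):
        out_row.extend(row[center:])
    return mosaic


# Synthetic example: three 8-pixel-wide views of a 2x20 scene.
scene = [[100 * r + c for c in range(20)] for r in range(2)]
offsets = [0, 3, 5]
frames = [[row[o:o + 8] for row in scene] for o in offsets]
mosaic = paste_strips(frames, [3, 2])
```

In this synthetic case the strips tile the scene exactly, so the mosaic reproduces the first 13 columns of the scene. With real video, the displacements would have to be estimated, and the strips would have to be aligned and possibly warped before pasting.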
Adrian F Clark