The Cartesian differential invariants defined above are invariant to
position and orientation, but depend on the choice of a scale parameter.
In order to capture image structure at appropriate scales and to deal
with image magnification, it is essential to compute the invariants over
a range of scales.
We can construct a vector, v, that represents the local image structure around a given image point over a range of 2t+1 scales. The scales form a logarithmic series centred on a geometric mean scale, the base of the series determines which scales are sampled, and each element of v is the response of the jth invariant filter at a sampled scale s.
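The scale sampling described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `s0` (geometric mean scale) and `b` (base of the logarithmic series) are assumed here, the ordering of elements within the vector is an assumption, and each entry of `filters` is a hypothetical callable that returns the response of one invariant at a fixed image point for a given scale.

```python
import math
from typing import Callable, List, Sequence


def sample_scales(s0: float, b: float, t: int) -> List[float]:
    """Return the 2t+1 scales s0 * b**k for k = -t, ..., t.

    The scales are evenly spaced in log-scale, so their geometric mean is s0.
    """
    return [s0 * b ** k for k in range(-t, t + 1)]


def multiscale_vector(
    filters: Sequence[Callable[[float], float]],  # hypothetical invariant filters
    s0: float,
    b: float,
    t: int,
) -> List[float]:
    """Concatenate the response of every invariant filter at every sampled scale.

    Assumed layout: all invariants at the first scale, then all invariants
    at the second scale, and so on.
    """
    return [f(s) for s in sample_scales(s0, b, t) for f in filters]


# Toy usage with two stand-in "filters" that depend only on scale.
toy_filters = [lambda s: s, lambda s: s * s]
scales = sample_scales(2.0, 2.0, 1)
v = multiscale_vector(toy_filters, 2.0, 2.0, 1)
```

With t = 1 and two invariants the vector has (2t+1) x 2 = 6 elements. In a real system each filter would be evaluated on the image at the chosen point rather than on the scale alone.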
We can use the vector of invariants for a chosen point in one image to
locate similar points in a second image. For every point (x, y) in the
second image we generate several vectors, one at each of a range of base
scales. We then compare each in turn with the vector for the original
point. Two vectors are compared using a Mahalanobis distance, where S is
the covariance matrix of the distribution of vectors of invariants in
the first image. A similarity image can then be constructed by selecting
the similarity value of the best matching scale for each pixel (the
smaller the distance, the greater the similarity). This shows the
similarity with respect to spatial position. The final matches can then
be found by selecting a set of the lowest troughs in the similarity
image.

An example is shown in Figures 2, 3 and 5. In Figure 2 we attempt to
find points similar to point A from Figure 5. Figure 3 shows the
similarity image calculated during the search; the peaks of this image
are superimposed on the search image (bright areas are those most
similar to A). The size of each superimposed point indicates the scale
of the match. Note that the algorithm has detected both the correct
position and the correct scale for matches with the eyes.
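The comparison step for a single pixel can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' code: it uses the standard squared Mahalanobis form (a - b)^T S^-1 (a - b), which ranks candidates in the same order as the unsquared distance, and the function names and the `(base_scale, vector)` candidate layout are hypothetical. A small Gaussian elimination routine is included only to keep the example dependency-free.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]


def solve(S: Matrix, d: Vector) -> List[float]:
    """Solve S x = d by Gaussian elimination with partial pivoting."""
    n = len(d)
    # Augmented matrix [S | d], copied so the inputs are left untouched.
    A = [list(S[i]) + [d[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        acc = A[i][n] - sum(A[i][c] * x[c] for c in range(i + 1, n))
        x[i] = acc / A[i][i]
    return x


def mahalanobis_sq(a: Vector, b: Vector, S: Matrix) -> float:
    """Squared Mahalanobis distance (a - b)^T S^-1 (a - b)."""
    diff = [ai - bi for ai, bi in zip(a, b)]
    return sum(di * xi for di, xi in zip(diff, solve(S, diff)))


def best_scale_match(
    target: Vector,
    candidates: Sequence[Tuple[float, Vector]],  # (base scale, vector) per scale
    S: Matrix,
) -> Tuple[float, float]:
    """Return (smallest squared distance, base scale at which it occurs)."""
    return min((mahalanobis_sq(target, v, S), s0) for s0, v in candidates)


# Toy usage: one pixel with vectors generated at three base scales.
S = [[4.0, 0.0], [0.0, 1.0]]
pixel_candidates = [(1.0, [2.0, 0.0]), (2.0, [0.0, 0.5]), (4.0, [0.0, 3.0])]
distance_sq, scale = best_scale_match([0.0, 0.0], pixel_candidates, S)
```

Applying `best_scale_match` at every pixel of the second image gives one distance value per pixel, which is the similarity image; the lowest values mark the candidate matches, and the returned base scale gives the scale of each match.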
Figure: The similarity image corresponding to Figure 2.
Figure: The results of the search for point A in Figure 5.
A problem exists with the search algorithm due to the size of the scale sampling steps. In the space of all vectors of invariants, each pixel is associated with a scale string whose shape depends on how the pixel's differential structure changes with scale. Figure 4 shows an example of such a scale string. At present we sample this scale string on a logarithmic scale. We have shown that the distance between vectors of invariants at neighbouring scales is larger than the average separation between samples from different points. This can cause a mismatch if the target scale falls halfway between the sampled scales. The proposed solution is to interpolate between the sampled points in order to form a better approximation of the actual scale string, but this is beyond the scope of this paper.
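One way to picture the proposed interpolation is to join consecutive samples of the scale string with straight segments and measure the distance from the query vector to the nearest point on that polyline. The sketch below is only an illustration of that idea, not the method the authors have in mind: the piecewise-linear approximation, the use of Euclidean distance in place of the Mahalanobis distance, and all function names are assumptions.

```python
from typing import Sequence, Tuple

Vector = Sequence[float]


def point_to_segment_sq(q: Vector, p0: Vector, p1: Vector) -> Tuple[float, float]:
    """Return (squared distance, u), where p0 + u * (p1 - p0) with u in [0, 1]
    is the point of the segment closest to q."""
    seg = [b - a for a, b in zip(p0, p1)]
    rel = [x - a for a, x in zip(p0, q)]
    seg_len_sq = sum(c * c for c in seg)
    if seg_len_sq == 0.0:
        u = 0.0  # degenerate segment: both samples coincide
    else:
        u = sum(r * c for r, c in zip(rel, seg)) / seg_len_sq
        u = max(0.0, min(1.0, u))  # clamp to the segment
    closest = [a + u * c for a, c in zip(p0, seg)]
    return sum((x - c) ** 2 for x, c in zip(q, closest)), u


def distance_to_scale_string(
    q: Vector, samples: Sequence[Vector]
) -> Tuple[float, int, float]:
    """Return (squared distance, segment index i, position u) of the closest
    point on the polyline through vectors sampled at consecutive scales."""
    d2, u, i = min(
        point_to_segment_sq(q, samples[i], samples[i + 1]) + (i,)
        for i in range(len(samples) - 1)
    )
    return d2, i, u


# Toy usage: a query that falls between the first two sampled scales.
string_samples = [[0.0, 0.0], [2.0, 0.0], [2.0, 2.0]]
query = [1.0, 0.5]
d2, segment, u = distance_to_scale_string(query, string_samples)
nearest_sample_d2 = min(
    sum((x - s) ** 2 for x, s in zip(query, sample)) for sample in string_samples
)
```

In this toy case the squared distance to the interpolated string is 0.25, against 1.25 to the nearest sample, which is the kind of gap that can produce a mismatch when the target scale lies between two sampled scales.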
Figure 4: The two most significant modes of variation of the space of vectors of invariants, with the scale string of a single pixel highlighted.
Kevin Walker