Next: 3 Experiments Up: Fast Face Localisation and Previous: 1 Introduction

2 Optimised Robust Correlation

The objective of the optimised robust correlation is to find the global extremum in a multi-dimensional search space that corresponds to the best match between a pair of images. This search space is defined by the set of all valid geometric and photometric transformations. In our implementation of the proposed method the geometric transformations are translation, scaling and rotation . Given a point in the multi-dimensional search space, a combined score function is evaluated. This function and some of its properties are described in Section 2.1 . To find the global optimum of the score function a search technique based on random exponential perturbations is employed. This optimisation method, which was shown to be particularly suitable for the given search problem (see Section 3 ), is discussed in Section 2.2 . Finally in Section 2.3 , a quasi-random sampling technique used for estimating the value of the score function is outlined.

2.1 Score Function

Given a transformation t , a match score s is computed as a weighted sum of two scores, an area score and a grey level score :

where denotes a constant in the interval [0,1] . The area score is included to encourage large overlaps and is defined as

where and denote the sampling sets corresponding to the reference and test images, and , respectively. The derivation of this function is based on probabilistic arguments [ 8 ]. The grey level score, which measures the similarity between the intensity distributions, is defined as

where denotes the robust kernel, the intensity transformation, the projection function and the maximum response of the robust kernel. The kernel function used in the experiments reported in Section 3 is a simple quadratic function and is defined as

where and denote the compared grey levels and the cut-off distance. The latter defines the width of the kernel and grey level differences greater than this constant will not contribute to the final score. The intensity transformation implements a linear mapping between grey levels in the reference and test images. It is defined as

where g denotes a grey level. Pixels are projected from the reference image to the test image using an affine projection function where p denotes the pixel to be projected and the result of the projection. The horizontal and vertical coordinates of the projection are defined as

where denotes the centre of gravity computed from the sampling set.

2.2 Optimisation Method

The search technique we employ is based on random exponential perturbations (see Algorithm 1 ). In each iteration, the transformation between reference and test image is perturbed by adding a random vector drawn from an exponential distribution. The new transformation is accepted only if the score was increased.

The approach described above is similar to simulated annealing [ 10 ] at zero temperature. Successful applications of simulated annealing within the areas of object detection and recognition have been reported in [ 9 ] and [ 1 ].

2.3 Random Sampling

If the object under consideration can be adequately represented using only a fraction of the pixels then this is clearly advantageous since the execution time of the method is directly dependent on the number of samples used. In our application this is certainly the case since an image of a face contains a high degree of redundant information, and large regions can often be represented with only a few points.

A sampling technique commonly used in Monte Carlo integration is based on Sobol sequences [ 15 ]. A Sobol sequence is a quasi-random sequence of numbers maximally spread out over a given hyper-cube. The sequence is generated number-theoretically, rather than randomly, and successive points at any stage fill in the gaps in the previously generated distribution. The use of Sobol sequences leads to faster convergence compared to uniformly distributed random numbers since the fractional error of the approximation decreases as instead of [ 15 ], where N is the number of samples and d the dimensionality of the approximated function. In Figure 1 , three Sobol sequences are shown.

Figure 1: Two-dimensional Sobol sequences at three different sampling rates.

Combining the optimised robust correlation method with random sampling yields several benefits. The quasi-random nature of the process implies that the sampling points will not interfere with (and possibly cancel out) a specific frequency. In contrast, if sampling points are positioned on a grid, aliasing is much more likely. Furthermore, it is possible to continuously increase the sampling density -- until some convergence criterion is met -- and at the same time maintain approximately uniform density throughout the image. This is because the sampling points are avoiding the chance clustering that occurs with random points drawn from a uniform distribution. A direct consequence of this property is that Sobol sequences can be used for continuous multi-resolution matching. By varying the sampling density the image can be represented in different resolutions.

Next: 3 Experiments Up: Fast Face Localisation and Previous: 1 Introduction

Kenneth Jonsson
Mon Jul 14 14:56:41 BST 1997