In this section, experiments using Gaussian mixture models of identity
are described. Face image data were acquired and normalised in a fully
automated way by the face tracking system. The neural network model used
to perform tracking was trained using 9000 example face images rotated
by
and scaled to
and
[8]. The normalised faces from the tracker therefore varied by at least
these amounts in scale and rotation. Since the aim of these experiments
was to compare methods for modelling identity rather than to optimise
recognition accuracy, no attempt was made to reduce these variations.
Eight subjects were tracked through relatively unconstrained indoor
scenes as they walked towards a fixed camera. Overhead lighting resulted
in variations in facial illumination. The resolution of the area of the
face tracked ranged from approximately
pixels when the subject was far from the camera to
pixels when the subject approached the camera. Two normalised face
sequences were obtained for each subject. The first sequence of each
subject was used for training and the second sequence for testing. In
total, there were 326 training images and 296 test images. The number of
training images per person varied from 21 to 60 and the number of test
images from 21 to 53. Figure 5 shows 10 of the images used to form the
training and test sets for three of the people.
Face space was modelled by performing PCA on the training images. A
specific model was computed from the training set. A generic model was
computed using 644 of the images used to train a face detection neural
network in the tracking system. These images were highly suitable,
having similar variations in scale and rotation to the tracked data to
be recognised. The training images were projected onto the first n'
eigenvectors and each person's identity was modelled by estimating
either
or
with Gaussian mixtures. The parameters of the 8 mixture models were
stored along with the n' eigenvectors and eigenvalues and subsequently
used to classify the test sequences.
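The projection step can be sketched as follows. This is a minimal illustration, assuming the mean image and a set of unit-length eigenvectors have already been computed by PCA; the function name `project` and the toy data are hypothetical and not taken from the original system.

```python
def project(image, mean, eigenvectors):
    """Project a flattened image onto a set of eigenvectors.

    image, mean: sequences of floats of equal length.
    eigenvectors: list of unit-length vectors, one per retained component
    (assumed to come from a PCA computed elsewhere).
    """
    # Subtract the mean image, then take one inner product per eigenvector.
    centred = [p - m for p, m in zip(image, mean)]
    return [sum(c * e for c, e in zip(centred, vec)) for vec in eigenvectors]


# Toy 3-pixel "image" and a hypothetical basis of n' = 2 eigenvectors.
mean = [1.0, 1.0, 1.0]
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
coeffs = project([3.0, 0.0, 5.0], mean, basis)
```

The resulting coefficient vector has one entry per retained eigenvector and is what the identity models are fitted to.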
Initially, both a specific and a generic eigenspace were computed using the first 40 eigenvectors. Table 1 compares face classification using the specific and generic models. Identities were modelled by fitting a single radial Gaussian to each person's data. The percentage of images correctly classified is given for each person, along with the percentage of all images classified correctly. Sequence classification results based on a majority vote are also given, i.e. each sequence is assigned to the person to whom most of its images were classified. The results illustrate that using a generic face space, which could facilitate identity verification, known/unknown classification or full recognition, in turn makes face classification more difficult.
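Table 1: Percentage of images correctly classified for each person (0 to 7) and in total, and number of sequences correctly classified by majority vote, using the specific and generic face spaces.

| Face space | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | Total % | Seq. (Maj.) |
|---|---|---|---|---|---|---|---|---|---|---|
| Specific | 75 | 64 | 74 | 85 | 56 | 78 | 29 | 11 | 55.1 | 7 |
| Generic | 57 | 67 | 66 | 20 | 13 | 72 | 25 | 29 | 43.6 | 4 |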
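A single radial Gaussian per person, combined with a per-sequence majority vote, might be sketched as below. This assumes maximum-likelihood estimates of the mean and of one scalar variance per person, and equal priors over identities; the function names and the toy 2-D data are hypothetical.

```python
import math
from collections import Counter


def fit_radial_gaussian(vectors):
    """Return (mean, variance) of a single isotropic (radial) Gaussian.

    The variance is one scalar shared by all dimensions (ML estimate).
    """
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    sq = sum((v[j] - mean[j]) ** 2 for v in vectors for j in range(d))
    return mean, sq / (n * d)


def log_density(x, mean, var):
    """Log of the isotropic Gaussian density N(x; mean, var * I)."""
    d = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, mean))
    return -0.5 * d * math.log(2 * math.pi * var) - sq_dist / (2 * var)


def classify_image(x, models):
    """Return the identity whose Gaussian gives x the highest density."""
    return max(models, key=lambda who: log_density(x, *models[who]))


def classify_sequence(frames, models):
    """Majority vote over the per-image decisions."""
    votes = Counter(classify_image(x, models) for x in frames)
    return votes.most_common(1)[0][0]


# Toy 2-D coefficient vectors for two hypothetical people.
train = {
    "A": [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "B": [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0], [6.0, 6.0]],
}
models = {who: fit_radial_gaussian(vs) for who, vs in train.items()}
test_sequence = [[0.4, 0.6], [5.2, 5.1], [0.9, 0.2]]
label = classify_sequence(test_sequence, models)
```

Here two of the three test frames fall closest to person A, so the sequence is assigned to A even though one frame is classified as B.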
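Table 2: Face classification results using the 20-dimensional generic space. Norm. indicates whether normalised pattern vectors were used (Y) or not (N). Sequences were classified by majority vote (Maj.) and by accumulating probabilities (Pr.).

| Name | M | Type | Norm. | Tot. % | Seq. (Maj.) | Seq. (Pr.) |
|---|---|---|---|---|---|---|
| T-P | 1 | | N | 25.0 | 2 | 2 |
| 1-NN | n | | N | 32.1 | 1 | 1 |
| T-P | 1 | | Y | 46.3 | 4 | 4 |
| Radial | 1 | | Y | 44.3 | 4 | 4 |
| Diag | 1 | | Y | 42.9 | 4 | 3 |
| 2-Rad | 2 | | Y | 52.0 | 5 | 7 |
| 3-Rad | 3 | | Y | 42.2 | 5 | 5 |
| 2-Diag | 2 | | Y | 41.9 | 4 | 5 |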
Reducing the dimensionality of the generic face space from 40 to 20 did not result in any significant loss of accuracy. Face classification results using the 20-dimensional generic space are given in Table 2. Sequences were classified (1) by a majority vote (Maj.) and (2) by accumulating probabilities (Pr.). Gaussian mixture models of varying complexity were compared for modelling identity.
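The two sequence classification rules can give different answers, as the following sketch shows. It assumes that "accumulating probabilities" means summing per-frame log-probabilities over the sequence (equivalent to multiplying probabilities under an independence assumption); the text does not specify the exact accumulation rule, so this is only one plausible reading, and the function names and numbers are hypothetical.

```python
import math
from collections import Counter, defaultdict


def majority_vote(frame_log_probs):
    """Pick the identity that wins the most individual frames."""
    winners = [max(frame, key=frame.get) for frame in frame_log_probs]
    return Counter(winners).most_common(1)[0][0]


def accumulate(frame_log_probs):
    """Pick the identity with the largest summed log-probability."""
    totals = defaultdict(float)
    for frame in frame_log_probs:
        for who, lp in frame.items():
            totals[who] += lp
    return max(totals, key=totals.get)


# Hypothetical per-frame probabilities for two identities over three frames.
probs = [
    {"A": 0.55, "B": 0.45},
    {"A": 0.55, "B": 0.45},
    {"A": 0.01, "B": 0.99},
]
frames = [{who: math.log(p) for who, p in f.items()} for f in probs]

by_vote = majority_vote(frames)
by_accumulation = accumulate(frames)
```

In this toy case A narrowly wins two frames, so the majority vote selects A, while the one frame that strongly favours B dominates the accumulated score and selects B.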
The first two methods in Table 2 used unnormalised pattern vectors. The
first method (T-P) used single radial Gaussians of equal variance,
resulting in a nearest-mean classifier which was equivalent to the
eigenfaces method of Turk and Pentland [10]. The second method was a
nearest neighbour classifier (1-NN). Both
these methods performed poorly. However, the use of normalised pattern
vectors resulted in a significant improvement with T-P
classifying 4 sequences correctly. The mixture models had either radial
or diagonal covariance Gaussians with between 1 and 3 components. A
mixture of 2 radial Gaussians provided the best performance.
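The reduction of equal-variance radial Gaussians to a nearest-mean classifier can be made explicit. The notation below is assumed for illustration: x is a pattern vector of dimension n', C_k denotes person k with mean \mu_k, \sigma^2 is the variance shared by all classes, and the priors are taken to be equal.

```latex
\begin{align*}
p(\mathbf{x} \mid C_k) &= (2\pi\sigma^2)^{-n'/2}
  \exp\!\left(-\frac{\lVert \mathbf{x} - \boldsymbol{\mu}_k \rVert^2}{2\sigma^2}\right) \\
\ln p(\mathbf{x} \mid C_k) &= -\frac{n'}{2}\ln(2\pi\sigma^2)
  - \frac{\lVert \mathbf{x} - \boldsymbol{\mu}_k \rVert^2}{2\sigma^2} \\
\arg\max_k \, \ln p(\mathbf{x} \mid C_k) &= \arg\min_k \, \lVert \mathbf{x} - \boldsymbol{\mu}_k \rVert^2
\end{align*}
```

Since the first term of the log-density is the same for every class, only the distance to the class mean affects the decision.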
Shaogang Gong