
4 Learning gesture models

At this point, automatic gesture recognition amounts to finding the model that best fits a given image sequence. This requires estimating the following parameter set: the transition matrix A, the collection of posture models C, the variance matrix, and the state dimension N. It is also necessary to estimate temporal information, such as the number of self-transitions of each state and the order of the transitions between canonical postures.

4.1 The EM algorithm

The identification procedure is based on the Expectation-Maximization (EM) algorithm [1]. At each iteration it updates the model parameters and estimates some auxiliary quantities, such as the number of jumps from state r to state s up to time k and the occupation time of state r up to time k.
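As a rough illustration of these auxiliary quantities, the sketch below computes the expected number of jumps between states and the expected occupation time of each state over a whole sequence. It uses the standard scaled forward-backward recursions, which may differ from the recursive estimators of [1]. The function name `expected_counts`, the row-stochastic convention `A[r][s]` for a jump from r to s, the initial distribution `pi`, and the toy likelihood table `B` are all assumptions made for the example.

```python
import math

def expected_counts(A, pi, B):
    """A[r][s]: assumed probability of a jump r -> s; pi[r]: initial
    state probability; B[k][r]: likelihood of observation k in state r.
    Returns (expected jumps, expected occupation times, log-likelihood)."""
    T, N = len(B), len(pi)
    alpha, scale = [], []
    for k in range(T):  # scaled forward pass
        if k == 0:
            a = [pi[r] * B[0][r] for r in range(N)]
        else:
            a = [B[k][s] * sum(alpha[k - 1][r] * A[r][s] for r in range(N))
                 for s in range(N)]
        c = sum(a)
        alpha.append([x / c for x in a])
        scale.append(c)
    beta = [[1.0] * N for _ in range(T)]
    for k in range(T - 2, -1, -1):  # scaled backward pass
        beta[k] = [sum(A[r][s] * B[k + 1][s] * beta[k + 1][s]
                       for s in range(N)) / scale[k + 1]
                   for r in range(N)]
    # Expected time spent in each state r over the sequence.
    occupation = [sum(alpha[k][r] * beta[k][r] for k in range(T))
                  for r in range(N)]
    # Expected number of jumps r -> s over the sequence.
    jumps = [[sum(alpha[k][r] * A[r][s] * B[k + 1][s] * beta[k + 1][s]
                  / scale[k + 1] for k in range(T - 1))
              for s in range(N)] for r in range(N)]
    loglik = sum(math.log(c) for c in scale)
    return jumps, occupation, loglik

# Toy two-state example with made-up numbers.
A = [[0.7, 0.3], [0.4, 0.6]]
pi = [0.5, 0.5]
B = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.7], [0.1, 0.9]]
jumps, occupation, loglik = expected_counts(A, pi, B)
# One possible M-step for the transition matrix: normalise the jump counts.
A_new = [[j / sum(row) for j in row] for row in jumps]
```

The occupation times sum to the sequence length and the jump counts to one less, which gives a quick sanity check on any implementation of this step.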

The convergence of the EM algorithm is guaranteed by Jensen's inequality [1]. The generated sequence of parameter estimates corresponds to nondecreasing values of an appropriate likelihood function. The learning process can therefore be terminated when the likelihood either reaches a certain threshold level or no longer increases.
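The two stopping conditions can be sketched as a small driver loop. This is only an illustration: `run_em`, `target`, `tol` and `max_iter` are hypothetical names, and the toy step function below is not the gesture model. It is a stand-in EM that estimates a single mixture weight between two fixed densities, chosen only because its likelihood is easy to evaluate.

```python
import math

def run_em(em_step, params, target=None, tol=1e-6, max_iter=200):
    """Iterate until the log-likelihood reaches `target` or stops
    increasing by more than `tol`.
    em_step(params) must return (updated params, log-likelihood of params)."""
    history = []
    for _ in range(max_iter):
        new_params, loglik = em_step(params)
        history.append(loglik)
        if target is not None and loglik >= target:
            break  # threshold level reached
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break  # likelihood no longer increases
        params = new_params
    return params, history

def mixture_weight_step(w, p1=(0.9, 0.8, 0.1, 0.7), p2=(0.2, 0.3, 0.6, 0.1)):
    """Toy EM step: p1[i] and p2[i] are fixed likelihoods of sample i
    under two components, and w is the weight of the first component."""
    mix = [w * a + (1.0 - w) * b for a, b in zip(p1, p2)]
    loglik = sum(math.log(m) for m in mix)
    resp = [w * a / m for a, m in zip(p1, mix)]  # E-step: responsibilities
    return sum(resp) / len(resp), loglik         # M-step: new weight

w_hat, history = run_em(mixture_weight_step, 0.5)
```

Recording the full `history` makes it easy to confirm that the log-likelihood never decreases from one iteration to the next.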

4.2 Experimental results

Let's consider the simple example of a hand gesture shown in Figure (6). It consists of repeated openings and closures of the hand.

Figure 6: An example of a simple hand gesture.

The sequence of observation column vectors (each with 49 elements) can be arranged into a feature matrix, as pictorially represented in Figure (7). Figure (8) is, instead, a pictorial representation of the trajectory of the posture model of the current estimated state with N = 4; this is the reconstruction of the gesture.
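A minimal sketch of this arrangement follows, assuming the observations are held as plain lists and that a state estimate for each time step is already available (how it is obtained is not specified here). The names `feature_matrix` and `reconstruct` and the toy values are hypothetical; the real vectors have 49 elements rather than 3.

```python
def feature_matrix(observations):
    """Stack T observation vectors as the columns of a matrix.
    Row i holds component i over time, column k the vector at time k."""
    return [list(row) for row in zip(*observations)]

def reconstruct(C, states):
    """Replace each time step by the posture model of its estimated
    state. C[s] is the posture vector assumed for state s."""
    return feature_matrix([C[s] for s in states])

# Toy example: two posture models with 3 elements each.
C = [[0.0, 0.0, 1.0], [1.0, 0.5, 0.0]]
states = [0, 0, 1, 1, 0]  # assumed estimated state at each time k
R = reconstruct(C, states)
```

Plotting `R` as gray levels, with time on the x-axis, would give a picture in the same layout as the feature matrix of the observations.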

Figure 7: Pictorial description of the sequence of observations. Time k is on the x-axis and each column is one observation vector; the gray levels are proportional to the values of the components of the observation vectors.

Figure 8: Reconstruction of the observation sequence using the estimated states with N = 4.

The 4-state HMM generated by the learning algorithm is shown in Figure (9).


Adrian F Clark
Mon Jul 28 12:54:58 BST 1997