
3 Modeling gestures with HMM

3.1 State description

A fundamental hypothesis of the proposed technique for automatic hand gesture recognition is that gestures may be modeled as sequences of a finite number of "canonical" postures of the hand. Each posture is associated with a state of a probabilistic finite state machine, specifically a Hidden Markov Model (HMM). Each gesture is identified with an HMM with an appropriate number of states and transition probabilities. The recognition problem becomes that of estimating the number of states and identifying the parameters of the model from the observation sequence. The time trajectory of the state estimates describes the estimated gesture.

3.2 Continuous-range observations HMM and posture models

The equations of an HMM with discrete states and continuous-range observations are

$$P(x_{k+1} = e_i \mid x_k = e_j) = a_{ij}, \qquad y_k = C x_k + \operatorname{diag}(\Sigma x_k)\, w_k$$

in which $x_k$ is a finite-state Markov chain and $y_k$ are vector real-valued observations. The components of $y_k$ are the means of the size functions. $w_k$ is a sequence of N(0,1) independent and identically distributed random variables. The model is specified by the transition probability matrix $A = (a_{ij})$ and by the matrices C and $\Sigma$ of appropriate dimensions. Without loss of generality the N states can be identified with the set of the unit vectors $\{e_1, \dots, e_N\}$ in $\mathbb{R}^N$.
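To make the generative model concrete, the following is a minimal simulation sketch. It assumes that column j of A holds the distribution of the next state given the current state j, that column j of C is the mean observation of state j, and that the entries of the noise matrix (called `Sigma` here) act as per-component standard deviations. The function name, the uniform initial state and the numeric values are illustrative assumptions, not taken from the paper.

```python
import random

def simulate_hmm(A, C, Sigma, n_steps, seed=0):
    """Sample a state path and observations from the assumed model.

    A[i][j]     : probability of moving to state i from state j (columns sum to 1)
    C[i][j]     : mean of observation component i in state j
    Sigma[i][j] : standard deviation of observation component i in state j (assumption)
    """
    rng = random.Random(seed)
    n_states = len(A)
    n_components = len(C)
    state = rng.randrange(n_states)  # assumption: uniform initial state
    states, observations = [], []
    for _ in range(n_steps):
        states.append(state)
        # Observation = column of C selected by the state + scaled N(0,1) noise.
        y = [C[i][state] + Sigma[i][state] * rng.gauss(0.0, 1.0)
             for i in range(n_components)]
        observations.append(y)
        # Draw the next state from the column of A selected by the current state.
        column = [A[i][state] for i in range(n_states)]
        state = rng.choices(range(n_states), weights=column)[0]
    return states, observations

# Hypothetical two-state, two-component example.
A = [[0.9, 0.2],
     [0.1, 0.8]]
C = [[0.0, 5.0],
     [1.0, -3.0]]
Sigma = [[0.1, 0.1],
         [0.1, 0.1]]
states, observations = simulate_hmm(A, C, Sigma, n_steps=50, seed=1)
```

With small noise, each sampled observation stays close to the column of C indexed by its state, which is the clustering behaviour the model is meant to capture.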

The observation equation is the sum of a component directly determined by the state and a Gaussian noise term with variance determined by the elements of $\Sigma$. When the state $x_k$ equals the j-th unit vector $e_j$, the first term of this sum becomes exactly the j-th column of C; hence this column is the symbol of the corresponding state in the observation space. In other words, this quantity is the mathematical description of what we called the canonical posture associated with the considered state. It is a posture model.
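As a worked instance of this substitution, assume the observation equation has the form $y_k = C x_k + \operatorname{diag}(\Sigma x_k)\, w_k$ and that the entries of $\Sigma$ are standard deviations (both are assumptions about the notation). Writing $c_j$ and $\sigma_j$ for the j-th columns of C and $\Sigma$, setting $x_k = e_j$ gives:

```latex
\begin{aligned}
y_k \big|_{x_k = e_j}
  &= C e_j + \operatorname{diag}(\Sigma e_j)\, w_k
   = c_j + \operatorname{diag}(\sigma_j)\, w_k, \\
y_k \mid (x_k = e_j)
  &\sim \mathcal{N}\!\left(c_j,\ \operatorname{diag}(\sigma_j)^2\right).
\end{aligned}
```

Under these assumptions, the observations generated in state j are scattered around the posture model $c_j$ with independent per-component spread $\sigma_j$.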

Well-known algorithms [1] generate the sequence of state estimates by measuring, at each time k, the probabilistic distance between the current observation and each posture model, according to the following expression

$$q_{k+1} = \sum_{i=1}^{N} \langle q_k, e_i \rangle\, \Gamma^i(y_k)\, a_i$$

where $q_k$ is the unnormalized vector of state probability estimates, $\Gamma^i(y_k)$ is, except for a scale factor, the Gaussian density of the current observation $y_k$ given the i-th posture model, and $a_i$ is the i-th column of A.
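The recursion can be sketched as follows. This is a normalized variant of a standard HMM forward filter, not necessarily the exact algorithm of [1]: each state is weighted by the Gaussian density of the current observation under its posture model, the most probable state is recorded, and the weights are then propagated through the columns of A. The function names, the diagonal-covariance assumption and the explicit normalization at every step are choices made for this illustration.

```python
import math

def gaussian_likelihood(y, mean, std):
    """Density of y under independent Gaussians, one per observation component."""
    value = 1.0
    for y_i, m_i, s_i in zip(y, mean, std):
        z = (y_i - m_i) / s_i
        value *= math.exp(-0.5 * z * z) / (s_i * math.sqrt(2.0 * math.pi))
    return value

def filter_states(observations, A, C, Sigma, prior):
    """Return the most probable state index at each time step.

    A[j][i] is the probability of moving to state j from state i, so column i
    of A plays the role of a_i; column i of C is the i-th posture model.
    """
    n_states = len(A)
    q = list(prior)
    estimates = []
    for y in observations:
        # Weight each state by how well its posture model explains y.
        weighted = []
        for i in range(n_states):
            mean = [row[i] for row in C]
            std = [row[i] for row in Sigma]
            weighted.append(q[i] * gaussian_likelihood(y, mean, std))
        total = sum(weighted)  # may underflow to 0 for observations far from every model
        posterior = [w / total for w in weighted]
        estimates.append(max(range(n_states), key=posterior.__getitem__))
        # Propagate: next q is the sum over i of posterior[i] times column i of A.
        q = [sum(A[j][i] * posterior[i] for i in range(n_states))
             for j in range(n_states)]
    return estimates

# Hypothetical two-state example with hand-picked observations.
A = [[0.9, 0.2],
     [0.1, 0.8]]
C = [[0.0, 5.0],
     [1.0, -3.0]]
Sigma = [[0.5, 0.5],
         [0.5, 0.5]]
observations = [[0.1, 0.9], [-0.2, 1.1], [5.1, -2.9], [4.8, -3.2]]
estimates = filter_states(observations, A, C, Sigma, prior=[0.5, 0.5])
```

Normalizing at each step only rescales the weights, so it does not change which state is most probable; it is done here to keep the numbers in a safe floating-point range.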

3.3 Estimating the number of states

According to our hypothesis, we expect the observations to cluster around the posture models. This is confirmed in real image sequences, as can be seen in Figure 5.

Figure 5: (Left) Two components of the observation vectors plotted against each other. Clustering of the observations around posture models is clear. (Right) Modal distribution of one component of the observation vector; peaks are used to initialize the estimates of the posture models.

The correct number of states of the HMM is related to the number of peaks in the modal distribution of the values of each component of the observation vector (see Figure 5).
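One simple way to look for such peaks is to histogram a single component of the observation vector and keep the local maxima that contain enough samples. The sketch below is a naive illustration of that idea; the bin count, the `min_count` threshold and the function names are assumptions, and the paper does not specify how the peaks are actually detected.

```python
def histogram(values, n_bins):
    """Return (counts, bin_centers) for equally spaced bins over the data range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0.0:
        width = 1.0  # all values identical: avoid division by zero
    counts = [0] * n_bins
    for v in values:
        index = min(int((v - lo) / width), n_bins - 1)  # clamp the maximum value
        counts[index] += 1
    centers = [lo + (b + 0.5) * width for b in range(n_bins)]
    return counts, centers

def find_peaks(counts, centers, min_count):
    """Return centers of bins that are local maxima with at least min_count samples.

    Naive detector: a flat-topped peak is reported once, at its first bin.
    """
    padded = [0] + list(counts) + [0]
    peaks = []
    for b in range(1, len(padded) - 1):
        is_local_max = padded[b] > padded[b - 1] and padded[b] >= padded[b + 1]
        if is_local_max and padded[b] >= min_count:
            peaks.append(centers[b - 1])
    return peaks

# Hypothetical values of one observation component.
values = [0.0, 0.2, 0.4, 0.3, 0.1, 9.6, 9.8, 10.0, 9.9, 5.0]
counts, centers = histogram(values, n_bins=10)
peaks = find_peaks(counts, centers, min_count=2)
```

The peak locations can serve as initial values for the corresponding component of the posture models, and the number of peaks gives a first indication of how many states that component can distinguish.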


Adrian F Clark
Mon Jul 28 12:54:58 BST 1997