BibTeX entry
@PHDTHESIS{201403Ognjen_Rudovic,
AUTHOR={Ognjen Rudovic},
TITLE={Machine Learning Techniques for Automated Analysis of
Facial Expressions},
SCHOOL={Imperial College London},
MONTH=mar,
YEAR=2014,
URL={http://www.bmva.org/theses/2014-rudovic.pdf},
}
Abstract
Automated analysis of facial expressions paves the way for numerous next-generation computing tools, including affective computing technologies (proactive and affective user interfaces), learner-adaptive tutoring systems, and medical and marketing applications. In this thesis, we propose machine learning algorithms that address two important but largely understudied problems in automated analysis of facial expressions from facial images: pose-invariant facial expression classification, and modelling of the dynamics of facial expressions in terms of their temporal segments and intensity.

The methods that we propose for the former represent pioneering work on pose-invariant facial expression analysis. In these methods, we use our newly introduced models for pose normalization, which successfully decouple head pose and expression in the presence of large out-of-plane head rotations, followed by facial expression classification. This is in contrast to most existing works, which can deal only with small in-plane head rotations. We derive our models for pose normalization using the Gaussian Process (GP) framework for regression and manifold learning. In these models, we exploit the structure encoded in the relationships between facial expressions from different poses, as well as in the facial shapes themselves. The resulting models can perform pose normalization either by warping facial expressions from non-frontal poses to the frontal pose, or by aligning facial expressions from different poses on a common expression manifold. These models address some of the most important challenges of pose-invariant facial expression classification: they generalize to various poses and expressions from a small amount of training data, while remaining largely robust to corrupted image features and imbalanced examples of different facial expression categories. We demonstrate this on the task of pose-invariant classification of facial expressions of six basic emotions.
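The idea of pose normalization by GP regression can be illustrated with a minimal sketch: learn a warp from paired non-frontal/frontal landmark positions, then apply it to unseen non-frontal shapes. This is a toy illustration of GP regression as a pose-normalization warp, not the thesis's actual models; the synthetic shapes, the simulated yaw, and the kernel settings are all invented for the example.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale):
    # Squared-exponential kernel between the rows of A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def gp_predict(X_train, Y_train, X_test, lengthscale=2.0, noise=1e-4):
    """Posterior mean of GP regression from X_train to Y_train,
    with one independent GP per output dimension (shared kernel)."""
    K = rbf_kernel(X_train, X_train, lengthscale)
    K += noise * np.eye(len(X_train))            # jitter for stability
    alpha = np.linalg.solve(K, Y_train)          # K^{-1} Y
    return rbf_kernel(X_test, X_train, lengthscale) @ alpha

# Toy data: each "facial shape" is 2 landmarks in 2-D, flattened to 4 numbers.
rng = np.random.default_rng(0)
frontal = rng.normal(size=(120, 4))
# Hypothetical non-frontal pose: yaw foreshortens the x-coordinates.
nonfrontal = frontal.copy()
nonfrontal[:, 0::2] *= 0.5

# Learn the non-frontal -> frontal warp from 100 shape pairs,
# then normalise 20 unseen non-frontal shapes.
pred = gp_predict(nonfrontal[:100], frontal[:100], nonfrontal[100:])
gp_err = np.abs(pred - frontal[100:]).mean()
raw_err = np.abs(nonfrontal[100:] - frontal[100:]).mean()
```

On this synthetic warp, the GP-normalized shapes land much closer to the true frontal shapes than the raw non-frontal inputs do; the thesis's models additionally handle the small-training-set, corrupted-feature, and class-imbalance issues mentioned above, which this sketch does not.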
The methods that we propose for temporal segmentation and intensity estimation of facial expressions represent some of the first attempts in the field to model facial expression dynamics. In these methods, we use the Conditional Random Field (CRF) framework to define dynamic models that encode the spatio-temporal structure of the expression data, reflected in the ordinal and temporal relationships between the temporal segments and intensity levels of facial expressions. We also propose several means of addressing subject variability in the data, by simultaneously exploiting various priors as well as the heteroscedasticity and context of the target facial expressions. The resulting models are the first to address simultaneous classification and temporal segmentation of facial expressions of six basic emotions, and dynamic modelling of the intensity of facial expressions of pain. Moreover, the context-sensitive model that we propose for intensity estimation of spontaneously displayed facial expressions of pain and Action Units (AUs) is the first approach in the field to perform context-sensitive modelling of facial expressions in a principled manner.
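As a rough illustration of what dynamic modelling of temporal segments involves, the sketch below decodes a toy intensity trace into neutral/onset/apex/offset segments with Viterbi decoding over a linear chain — the inference machinery underlying linear-chain CRFs. The emission scores and the ordering constraints in the transition matrix are hand-set assumptions for the example, not the thesis's learned models.

```python
import numpy as np

SEGMENTS = ["neutral", "onset", "apex", "offset"]
# Temporal-segment ordering constraint: a segment may persist or
# advance to its successor (neutral -> onset -> apex -> offset -> neutral).
T_ALLOWED = [("neutral", "neutral"), ("neutral", "onset"),
             ("onset", "onset"), ("onset", "apex"),
             ("apex", "apex"), ("apex", "offset"),
             ("offset", "offset"), ("offset", "neutral")]

def viterbi(emissions, transitions):
    """Most likely state sequence under a linear-chain model.
    emissions: (T, S) log-scores per frame; transitions: (S, S) log-scores."""
    T, S = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + transitions      # (from_state, to_state)
        back[t] = cand.argmax(0)
        score = cand.max(0) + emissions[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return [SEGMENTS[s] for s in reversed(path)]

# Toy expression-intensity trace: rise to an apex, then decay.
x = np.array([0, 0, 0, 1, 2, 3, 3, 3, 2, 1, 0, 0], float)
dx = np.gradient(x)
# Hand-set emission log-scores from intensity and its derivative:
em = np.stack([-x,          # neutral: likes low intensity
               dx,          # onset:   likes rising intensity
               x - 3.0,     # apex:    likes intensity near the maximum
               -dx],        # offset:  likes falling intensity
              axis=1)

trans = np.full((4, 4), -1e9)                    # forbid invalid orderings
for a, b in T_ALLOWED:
    trans[SEGMENTS.index(a), SEGMENTS.index(b)] = 0.0

seq = viterbi(em, trans)
```

The decoded sequence passes through all four segments in the legal order, with onset on the rising frames and offset on the falling ones. A real CRF would learn the emission and transition potentials from data (and, as in the thesis, impose ordinal structure on intensity levels) rather than using hand-set scores.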