I work on automatic emotion recognition, a technology that can provide emotional intelligence to AI systems. My work focuses on developing statistical and algorithmic approaches that can quantify and analyze nonverbal human behaviors in multimodal (audio-visual) data, particularly emotion expressions during interactions. This research builds upon multimodal signal processing, machine learning, and behavioral science.
The computational systems that I have developed can (1) automatically identify the emotions of individuals from audio and visual expressions, (2) provide interpretable descriptions of how emotion changes over time, and (3) discover regions of emotionally salient events in long videos. These systems have successfully analyzed video recordings of two-person interactions, as well as movies.
This work has been recognized by a best student paper award (ACM Multimedia 2014), a SUNY-A Faculty Research Award (2017), and a Google Faculty Research Award (2018) [List of publications].
The first set of studies lays the foundation and central motivation for our research. We find that it is crucial to model complex non-linear interactions between audio and visual emotion expressions, and that dynamic emotion patterns can be used in emotion recognition.
The understanding of the complex characteristics of emotion from the first set of studies leads us to examine multiple sources of modulation in audio-visual affective behavior. Specifically, we focus on how speech modulates facial displays of emotion. We develop a framework that uses speech signals which alter the temporal dynamics of individual facial regions to …
We present methods to discover regions of emotionally salient events in given audio-visual data. We demonstrate that different modalities, such as the upper face, lower face, and speech, express emotion with different timings and time scales, varying for each emotion type. We further extend this idea into another aspect of human behavior: human action …