The long-term objective of Professor Kim’s research is to understand human interactions. To this end, her work focuses on developing data-driven AI approaches that can quantify and analyze nonverbal human behaviors in video recordings, particularly emotion expressions during interactions. The computational systems that she has developed can (1) automatically identify the emotions of individuals from audio and visual expressions, (2) provide interpretable descriptions of how emotion changes over time, and (3) discover regions of emotionally salient events in long videos. These systems have successfully analyzed video recordings of two-person interactions, as well as movies. She has also led an ACM award-winning project called “Say Cheese vs. Smile” that presents novel methods for facial emotion recognition when a person is speaking. This research builds upon multimodal (speech and video) signal processing, machine learning, and behavioral science [List of publications].
The first set of studies lays the foundation and central motivation for our research. We find that it is crucial to model complex non-linear interactions between audio and visual emotion expressions, and that dynamic emotion patterns can be used in emotion recognition.
The understanding of the complex characteristics of emotion from the first set of studies leads us to examine multiple sources of modulation in audio-visual affective behavior. Specifically, we focus on how speech modulates facial displays of emotion. We develop a framework that uses speech signals which alter the temporal dynamics of individual facial regions to …
We present methods to discover regions of emotionally salient events in given audio-visual data. We demonstrate that different modalities, such as the upper face, lower face, and speech, express emotion with different timings and time scales, varying for each emotion type. We further extend this idea into another aspect of human behavior: human action …