Emotion-Aware Multi-View Contrastive Learning for Facial Emotion Recognition

Daeha Kim, Byung Cheol Song


"When a person recognizes another’s emotion, he or she recognizes the (facial) features associated with emotional expression. So, for a machine to recognize facial emotion(s), the features related to emotional expression must be represented and described properly. However, prior art based on label supervision not only failed to explicitly capture features related to emotional expression, but also did not attempt to learn emotional representations. This paper proposes a novel approach that generates features related to emotional expression through feature transformation and uses them for emotional representation learning. Specifically, the contrast between the generated features and the overall facial features is quantified through contrastive representation learning, and facial emotions are then recognized based on the angle and intensity that describe the emotional representation in polar coordinates, i.e., in the Arousal-Valence space. Experimental results show that the proposed method improves PCC/CCC performance by more than 10% compared to the runner-up method on in-the-wild datasets and is also qualitatively better in terms of neural activation maps."
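The abstract refers to the angle and intensity of a point in the Arousal-Valence space and to PCC/CCC scores. The following is a minimal sketch of those quantities, not the authors' implementation. It assumes valence lies on the horizontal axis and arousal on the vertical axis, and that PCC and CCC denote the standard Pearson and concordance correlation coefficients; the function names `to_polar`, `pcc`, and `ccc` are hypothetical.

```python
import math


def to_polar(valence, arousal):
    """Map an Arousal-Valence point to (intensity, angle in radians).

    Assumption: valence is the x-axis and arousal is the y-axis.
    """
    intensity = math.hypot(valence, arousal)  # distance from the neutral origin
    angle = math.atan2(arousal, valence)      # direction, in (-pi, pi]
    return intensity, angle


def _moments(x, y):
    """Return means, population variances, and covariance of two sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return mx, my, vx, vy, cov


def pcc(x, y):
    """Pearson correlation coefficient between predictions and labels."""
    _, _, vx, vy, cov = _moments(x, y)
    return cov / math.sqrt(vx * vy)


def ccc(x, y):
    """Concordance correlation coefficient (standard definition).

    Unlike PCC, it also penalizes a shift between the two means.
    """
    mx, my, vx, vy, cov = _moments(x, y)
    return 2 * cov / (vx + vy + (mx - my) ** 2)
```

For example, `to_polar(0.0, 1.0)` yields an intensity of 1 at an angle of π/2 under the axis convention assumed here. A prediction that tracks the labels perfectly but with a constant offset keeps a PCC of 1 while its CCC drops, which is why both scores are commonly reported together for valence and arousal regression.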
