Publication in Frontiers in Psychology: Insights from a Large-Scale Study on the Meanings of Facial Expressions Across Cultures
By Jeffrey Brooks on Jun 28, 2024
Understanding how emotions are experienced and expressed across different cultures has long been a central focus of debate and study in psychology, cognitive science, and anthropology. What emotions do people in different cultures experience in response to the same evocative scenes and scenarios? What facial movements do they produce? How are feelings and expressions related?
We explored this intricate relationship by examining facial expressions and self-reported emotional experiences across North America, Europe, and Japan. Combining advanced computational methods with large-scale data collection, the study offers new insights into the universality and cultural specificity of emotional expression. It was recently published in the journal Frontiers in Psychology, in a paper titled “How emotion is experienced and expressed in multiple cultures: a large-scale experiment across North America, Europe, and Japan”.
Data Collection
To understand the relationship between emotional experiences and facial expressions across cultures, we collected a corpus of 45,231 recordings of participants worldwide reacting to 2,185 validated, evocative short videos and reporting on their emotional experiences.
We recruited participants through platforms such as Amazon Mechanical Turk and Prolific for English speakers, and Crowdworks for Japanese speakers. Each participant viewed a series of randomly selected videos and recorded their reactions using a webcam. The videos, sourced from previous studies, covered a broad range of emotionally evocative content, including scenes of nature, births, deaths, accidents, practical jokes, and more. Participants were instructed to react expressively and provide self-reports of their emotional experiences using a set of 34 emotion categories, each rated on a 1-100 intensity scale. Additionally, they rated the valence (positive or negative quality) and arousal (intensity of the emotion) of their experiences on bipolar 1-9 Likert scales.
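To make the structure of a single trial concrete, here is a minimal sketch of how one recording and its self-report might be represented. The field and category names are illustrative assumptions, not the study's actual schema.

```python
from dataclasses import dataclass, field

# Illustrative only: the study used 34 emotion categories; three are shown here.
EMOTION_CATEGORIES = ["admiration", "amusement", "anxiety"]

@dataclass
class TrialRecord:
    participant_id: str
    video_id: str                    # one of the 2,185 evocative videos
    reaction_video_path: str         # webcam recording of the participant
    emotion_ratings: dict[str, int] = field(default_factory=dict)  # 1-100 per category
    valence: int = 5                 # bipolar 1-9 Likert scale
    arousal: int = 5                 # bipolar 1-9 Likert scale
```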
Analyzing Facial Expressions
We characterized participants’ facial expressions using both manual annotations from independent human raters and deep neural networks (DNNs).
For the manual annotations, we collected ratings of the reaction videos from an additional set of 3,293 participants. Raters categorized each 2-second segment of a reaction video using 42 emotion categories, selecting all that applied to describe the emotions felt by the person in the video. Raters were blind to the eliciting video that the person was reacting to, and annotations were averaged across raters for each video.
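As a rough sketch of this aggregation step, assuming a long-format table of rater selections (the column names are hypothetical):

```python
import pandas as pd

# Hypothetical long-format table: one row per (video, 2-second segment, rater,
# category), with selected = 1 if that rater chose that category.
annotations = pd.DataFrame({
    "video_id": ["v1", "v1", "v1", "v1"],
    "segment":  [0, 0, 0, 0],
    "category": ["amusement", "amusement", "joy", "joy"],
    "selected": [1, 0, 1, 1],
})

# Fraction of raters endorsing each category per segment, then per video.
segment_means = (annotations
                 .groupby(["video_id", "segment", "category"])["selected"]
                 .mean())
video_means = segment_means.groupby(["video_id", "category"]).mean()
print(video_means)
```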
DNN annotations were derived from a facial expression model and a Facial Action Coding System (FACS) model. The facial expression model predicted the likelihood of a given facial expression (e.g. “amusement”, “sadness”) being present. The FACS model predicted the presence of 31 specific facial action units (AUs) and other face-relevant actions based on the activity of facial movements.
Schematic of the experimental and analytic approach. Figure reproduced courtesy of Frontiers in Psychology.
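As a minimal sketch of how such frame-level DNN outputs could be pooled over a reaction video (the two model callables are hypothetical placeholders, not the study's actual code):

```python
import numpy as np

def annotate_video(frames, expression_model, facs_model):
    """Average frame-level DNN outputs over a reaction video.

    expression_model and facs_model are hypothetical callables mapping a face
    image to expression likelihoods and AU activations, respectively.
    """
    expr = np.stack([expression_model(f) for f in frames])  # (n_frames, n_expressions)
    aus = np.stack([facs_model(f) for f in frames])         # (n_frames, n_AUs)
    return expr.mean(axis=0), aus.mean(axis=0)
```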
Results
Dimensions of Emotion Experience
We used principal preserved components analysis (PPCA) to extract the dimensions of emotional experience that were reliably preserved across different cultural groups. This approach revealed at least 21 distinct dimensions of subjective emotion experience, which were consistent across English and Japanese responses, including nuanced emotions such as "aesthetic appreciation," "amusement," "anxiety," "romance," and "triumph."
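For intuition, here is a minimal sketch of one common formulation of PPCA: eigendecompose the symmetrized cross-covariance between the two groups' rating matrices, so that the leading components capture covariation that replicates across groups. This is a simplified illustration, not the paper's exact implementation.

```python
import numpy as np

def ppca(X, Y):
    """Principal preserved components analysis (one common formulation).

    X, Y: (n_stimuli, n_attributes) mean-centered rating matrices from two
    independent groups (e.g., English and Japanese speakers) for the same
    stimuli. Returns components ordered by cross-group preserved covariance.
    """
    C = X.T @ Y                        # cross-covariance between the groups
    S = (C + C.T) / 2.0                # symmetrize so eigenvectors are real
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]  # largest preserved covariance first
    return eigvals[order], eigvecs[:, order]
```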
Predictive Models
We developed predictive models to examine how well facial expressions could predict self-reported emotional experiences. These models were trained using linear regression on the facial expression annotations (manual and DNN) and the corresponding self-reports. At the individual level, facial expressions only modestly predicted the average emotional experience evoked by each video; when predictions were aggregated across multiple participants, however, accuracy increased markedly.
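The toy simulation below, with stand-in data, illustrates this aggregation effect: per-recording predictions are noisy, but averaging predictions and self-reports within each video raises the correlation. Everything here (shapes, noise levels, the use of GroupKFold) is an illustrative assumption, not the paper's pipeline.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GroupKFold, cross_val_predict

rng = np.random.default_rng(0)
n_videos, per_video, n_feats, n_dims = 200, 10, 48, 21
video_id = np.repeat(np.arange(n_videos), per_video)

# Each video has a latent expressive signal; individual recordings and
# self-reports add substantial person-level noise on top of it.
latent = rng.normal(size=(n_videos, n_feats))[video_id]
X = latent + rng.normal(size=latent.shape)                     # expression features
W = rng.normal(size=(n_feats, n_dims))
y = latent @ W + rng.normal(scale=8.0, size=(len(X), n_dims))  # self-reports

# Hold out whole videos during cross-validation so a video never predicts itself.
pred = cross_val_predict(LinearRegression(), X, y,
                         groups=video_id, cv=GroupKFold(n_splits=5))

def mean_by_video(a):
    return np.array([a[video_id == v].mean(axis=0) for v in range(n_videos)])

r_ind = np.corrcoef(pred[:, 0], y[:, 0])[0, 1]                 # individual level
p_agg, y_agg = mean_by_video(pred), mean_by_video(y)
r_agg = np.corrcoef(p_agg[:, 0], y_agg[:, 0])[0, 1]            # aggregate level
print(f"individual r = {r_ind:.2f}, aggregated r = {r_agg:.2f}")
```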
Facial expressions were found to predict at least 12 dimensions of emotional experience with significant accuracy. This suggests that, despite individual variability, there are consistent patterns in how emotions are expressed through facial movements, and that these patterns are largely preserved across different cultural contexts.
What expressions reveal. A) Accuracy of reported emotional experience prediction from expression within and across cultures. B) Degree to which average experience and expression are captured by varying sample sizes. C) Prediction correlations for each of 21 dimensions of experience. Data were combined across all countries to precisely estimate correlations. D) 12 distinct dimensions of emotional experience were predicted with significant accuracy from facial expression. E) Distribution of 12 experience/expression dimensions across videos. F) Example expressions yielding accurate emotion predictions. Figure reproduced courtesy of Frontiers in Psychology.
Cross-Cultural Comparisons
To compare emotional experiences across cultures, we computed correlations between the ratings of emotion categories and valence/arousal dimensions from different cultural groups, adjusting for within-culture variation. We found high correlations for many specific emotions, indicating substantial cross-cultural similarities in how emotions are experienced.
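One standard way to make such an adjustment is to disattenuate the between-culture correlation by each culture's within-culture (split-half) reliability. The sketch below illustrates the general technique; it is not necessarily the paper's exact procedure.

```python
import numpy as np

def split_half_reliability(ratings, rng):
    """Split-half reliability of group-mean ratings, Spearman-Brown corrected.

    ratings: (n_raters, n_stimuli) matrix for one culture.
    """
    idx = rng.permutation(ratings.shape[0])
    half_a = ratings[idx[: len(idx) // 2]].mean(axis=0)
    half_b = ratings[idx[len(idx) // 2 :]].mean(axis=0)
    r = np.corrcoef(half_a, half_b)[0, 1]
    return 2 * r / (1 + r)  # step up to the reliability of the full-group mean

def adjusted_cross_culture_r(r_between, rel_a, rel_b):
    """Disattenuate a between-culture correlation by within-culture reliability."""
    return r_between / np.sqrt(rel_a * rel_b)
```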
However, for facial expressions there were notable differences in intensity. Expressions of "amusement," "disgust," and "joy" were more pronounced in North America and Europe than in Japan. These differences highlight the role of cultural display tendencies in shaping how emotions are expressed.
This result also emerged in our analysis of the outputs of the FACS DNN. The facial actions associated with each dimension of emotional experience showed high agreement between Japan and the United States (the 36 × 12 correlation matrices shown below for the U.S. and Japan were correlated at r = 0.84), with differences mainly emerging in the intensity of specific AUs. In general, the same AUs were associated with each dimension in both the U.S. and Japan, with the U.S. showing higher intensity on average.
Correlation matrices showing the association between facial action units and each of the 12 dimensions of facial expression uncovered in the study. Each matrix shows the association of the 12 dimensions with the 36 AUs measured by the FACS DNN, separately for the United States and Japan, alongside the original set of AUs proposed by Ekman to comprise the facial display for each kind of emotion.
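Computationally, the comparison reduces to building an AU-by-dimension correlation matrix for each country and correlating their entries; a minimal sketch with simulated inputs:

```python
import numpy as np

def au_dimension_matrix(au_activations, experience_scores):
    """Correlate each AU with each experience dimension across videos.

    au_activations: (n_videos, n_aus); experience_scores: (n_videos, n_dims).
    """
    n_aus, n_dims = au_activations.shape[1], experience_scores.shape[1]
    M = np.empty((n_aus, n_dims))
    for i in range(n_aus):
        for j in range(n_dims):
            M[i, j] = np.corrcoef(au_activations[:, i],
                                  experience_scores[:, j])[0, 1]
    return M

def matrix_similarity(m_us, m_jp):
    """Entrywise correlation of two AU-by-dimension matrices (cf. r = 0.84)."""
    return np.corrcoef(m_us.ravel(), m_jp.ravel())[0, 1]
```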
Conclusion
By analyzing thousands of naturalistic reactions to diverse emotion antecedents, we paint a detailed portrait of how people in different cultures express emotion. In response to 2,185 evocative videos, people in diverse cultures report at least 21 distinct varieties of emotional experience, which are best conceptualized using specific emotion categories. Average experiences can be predicted with remarkable accuracy from expressive responses to each video at the aggregate level. Facial movements largely have common meaning across multiple cultures but are subject to differing display tendencies, with many facial expressions in Japan being subtler than those in North America and Europe. Expressions also show high individual variability, which partly reflects individual differences in emotional experiences.
Our scientific understanding of the meaning of expressions has been limited by methods and theories that focus narrowly on whether six emotions map to prototypical facial movements. Open-ended methods and large-scale evidence paint a more comprehensive picture – a detailed portrait of the complex ways in which people move their faces in response to thousands of emotionally evocative scenes. Across diverse cultural groups in North America, Europe, and Japan, we find that facial expressions reflect a broad array of specific feelings, are similar in meaning, and are subject to varying display tendencies.