Can AI Teach Itself to Improve Our Well-Being?
Published on Aug 9, 2022
By Alan Cowen, CEO & Chief Scientist
The future of technology hinges on the measurement of human well-being. Simply put, this is why Hume AI, and our companion nonprofit, The Hume Initiative, exist.
It’s a mission reflected in the headline on our home page: “The expressive communication platform for researchers and developers: APIs, ML models, and globally diverse data to align science and technology with human well-being.” It’s front and center on The Hume Initiative home page, as well:
"The Hume Initiative is a nonprofit effort charting an ethical path for empathic AI… Empathy is essential to ensuring that algorithms are optimized for our well-being."
The importance of human well-being really can’t be overstated. A recent episode of The Feelings Lab podcast delves into the topic with my longtime friend and mentor, Dr. Dacher Keltner. It’s a fun and insightful conversation that offers a great entry point to the work Hume AI is doing, and to the science at the core of that work: the science of expressive communication.
We need to know how people express their feelings to examine whether technology is making us feel better or worse. That forms a direct link between expressive communication and the promise of artificial intelligence that can teach itself to make the world a better, happier place. Unfortunately, there’s a big gap between the current state of the art in expressive communication research and the way those findings are being applied to the AI-powered systems so many of us interact with every day.
The Science of Expressive Communication
Emotion science, sometimes known as affective science, is an academic discipline that sits at the intersection of neuroscience and behavioral science. It studies how emotions are elicited and experienced, as well as how we recognize them in others. A related field known as affective computing is focused on training machine learning models to recognize emotional behaviors; technology companies often refer to the discipline as “Emotion AI”.
It’s a clunky term at best (using one noun to describe a second noun isn’t great!). It’s also misleading, in two ways. First, it sounds like people are claiming AI can detect emotions, which seems scary and impossible at the same time, when the discipline is really about obtaining objective measurements of expressive behaviors and physiological responses. Second, it suggests that some AI systems recognize emotion-related behaviors while others don’t. The truth is that neural networks fed enough data are already teaching themselves to recognize these things, but they are doing so in ways that give humans limited insight or control over what they’re learning and how they respond.
Clearly, the term “Emotion AI” has generated a lot of confusion about what thousands of emotion science and affective computing researchers are seeking to accomplish. Given recent advances in big data, artificial intelligence, and statistical methods, these two fields are converging and yielding important insights and potential applications.
Consider how much of your day is actually spent interacting with algorithms, from customer service bots to social media apps to your favorite music or streaming video platform’s recommendation engine. Each of these algorithms is trained to home in on an objective, like getting you to spend more time on an app, that may or may not be aligned with your emotional well-being at any given time. What might your days look like if those algorithms were instead trained with the objective of improving your well-being? Could empathic AI models actually ensure your interactions with technology — all of our interactions with technology — make us feel happier in the long run?
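To make the contrast concrete, here’s a minimal sketch of the two objectives, assuming a hypothetical Session record with expression-derived scores. None of the names, weights, or formulas here reflect any real system, ours included; they only illustrate the idea of swapping optimization targets.

```python
from dataclasses import dataclass

@dataclass
class Session:
    time_spent: float           # minutes spent in the app
    clicks: int                 # interactions during the session
    positive_expression: float  # hypothetical 0-1 measure of expressed enjoyment
    distress_expression: float  # hypothetical 0-1 measure of expressed distress

def engagement_objective(s: Session) -> float:
    """The typical objective: reward whatever keeps the user engaged."""
    return s.time_spent + 2.0 * s.clicks

def wellbeing_aware_objective(s: Session) -> float:
    """A hypothetical alternative: the same engagement signal, discounted
    by expressed distress and boosted by expressed enjoyment."""
    return engagement_objective(s) * (1.0 + s.positive_expression - s.distress_expression)

# A "doomscrolling" session scores high on engagement but low on well-being.
doomscroll = Session(time_spent=45.0, clicks=120,
                     positive_expression=0.1, distress_expression=0.8)
print(engagement_objective(doomscroll))       # 285.0
print(wellbeing_aware_objective(doomscroll))  # ~85.5
```

The point isn’t the particular formula; it’s that the optimization target, not the model architecture, determines whose interests an algorithm ends up serving.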
Emotion scientists are not only deepening our understanding of how people express their feelings and well-being, but also working with AI practitioners to translate this knowledge into machine learning models. These models aim to give computers “expressive communication” capabilities.
For example:
• Emotionally intelligent chatbots and digital assistants that understand what you’re trying to express and respond appropriately. Everything we say has a tone to it. People say things in an exasperated tone, hoping the bot will take the cue to learn from its mistakes (it currently doesn’t); they express urgency when they want a quicker response; and so on. (A brief sketch of this idea follows the list.)
• Sentiment analysis tools geared towards health and well-being. Imagine a social media or content feed algorithm that correctly sees people’s anger as a sign that something is wrong and not as a positive sign of engagement.
• Realistic AI avatars in virtual reality and metaverse applications, optimized to ensure you’re having a good time, or for specific tasks like delivering therapy.
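Here’s the brief sketch promised above for the chatbot example: a hypothetical routing function that treats measured tone as a signal rather than noise. The upstream scoring model, the score names, and the thresholds are all invented for illustration.

```python
def route_reply(scores: dict[str, float]) -> str:
    """Choose a response strategy from expression measurements (0-1 scores),
    assuming some upstream model has scored the user's utterance."""
    if scores.get("exasperation", 0.0) > 0.6:
        # Repeated frustration is a cue that the bot has failed; hand off.
        return "escalate_to_human"
    if scores.get("urgency", 0.0) > 0.6:
        # Urgent tone: skip the pleasantries and answer directly.
        return "short_direct_answer"
    return "standard_answer"

print(route_reply({"exasperation": 0.8, "urgency": 0.3}))  # escalate_to_human
```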
Clearly, there’s a huge opportunity to better align the technologies millions of us use every day with human well-being. But there’s much work to be done.
Advancing the Field
First, the current public conversation does not reflect recent advances in the field.
This is not entirely surprising, considering how much the field has changed in just the past decade. Real human expression involves complex patterns of vocal signaling, facial-bodily movements, and emotional language, and it turns out that it’s hard to make much sense of all this data without computer programming. Early emotion science research relied on oversimplified assumptions, like a 1970s theory that there were only six “basic” human emotions with six corresponding facial expressions.
We have over 40 facial muscles — do they really exist in just six configurations? On top of that, we humans express so much with our vocal intonations and body language; how could science conclude that our expressions live only on our faces?
In fact, current research suggests that vocal expressions alone, including laughs and sighs, can convey at least 28 different meanings. The exact number isn’t important; what matters is that humans do, in fact, exhibit a much broader and more nuanced range of emotional expressions.
Early affective computing models were built on the six-basic-emotions assumption. They were also trained on small, homogeneous datasets, consisting mostly of posed pictures of white, able-bodied, young, American undergraduates. As a result, they were often coarse, incomplete, and lacking in psychological validity, and they did not reflect key cultural and demographic nuances.
Taken together, this means that, until recently, affective computing models, including more sophisticated AI-enabled models, were largely Garbage-In, Garbage-Out: the computer science term for the principle that if you feed bad data into a program, you will get bad results as output.
Finally, the public conversation feeds off distrust in AI systems. It conflates facial recognition surveillance technology and claims of “emotion detection” with computers that understand what we’re trying to express, and it reveals deep-seated fears that harnessing the power of emotion science and affective computing will exacerbate the spread of misinformation, deepen political polarization, and worsen mental health issues in both young people and adults.
This is understandable, considering that the tech industry has traditionally embraced a business model that prioritizes engagement at any cost. But it is only a part of the story.
The Next Wave in Empathic Computing
In 2022, the field has come into its own, and not a day too soon.
Anyone reading this can likely speak to the power technology can wield over our collective emotions. One quick browse through social media or a news feed can really shake up a person’s emotional state.
This has always felt somehow inevitable, but stop to consider: how much of it is the result of deliberate algorithmic design choices? And why haven’t we trained these algorithms to care about how they make us feel? Is it acceptable for a machine to purposely elicit negative emotions as a means to an end? What might our days look like if those algorithms were instead trained to recognize our emotional behaviors as signals that something went right or wrong, to respond in a manner consistent with what we were trying to express, and to get better at making us happy over time?
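One way to picture that last question is as a simple feedback loop. Below is a minimal sketch in the style of a multi-armed bandit, in which expressive reactions serve as the reward signal; the action names, score format, and update rule are invented for illustration, not drawn from any deployed system.

```python
import random

# Hypothetical actions a feed or assistant might take.
ACTIONS = ["show_calming_content", "show_news_alert", "suggest_break"]
value = {a: 0.0 for a in ACTIONS}   # running estimate of each action's effect
counts = {a: 0 for a in ACTIONS}

def expression_reward(reaction: dict[str, float]) -> float:
    """Expressive reactions as feedback: positive expression means something
    went right; distress means something went wrong."""
    return reaction["positive"] - reaction["distress"]

def choose_action(epsilon: float = 0.1) -> str:
    """Epsilon-greedy: usually pick the best-known action, occasionally explore."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=value.get)

def update(action: str, reaction: dict[str, float]) -> None:
    """Fold the new reward into an incremental average, so the policy
    gradually favors actions that leave people better off."""
    counts[action] += 1
    value[action] += (expression_reward(reaction) - value[action]) / counts[action]

# One step of the loop: act, observe the expressive reaction, learn.
a = choose_action()
update(a, {"positive": 0.7, "distress": 0.1})
```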
We believe this is possible. We can rigorously study how AI-enabled algorithms can be optimized for the well-being of diverse populations. And by leveraging those findings to build systems, networks, and digital communities that prioritize human thriving, we can build a happier, more connected society.
That’s what Hume AI, and The Hume Initiative, are all about.
Our team of scientists, researchers, and engineers is working to enable this shift by building the first expressive communication platform that captures a nuanced, sophisticated, and accurate representation of how humans really express themselves.
Join us in our efforts by signing up for the platform waitlist, digging into our models and datasets, and following us here on the blog. We’ll be sharing more about our work in the coming weeks and months, including updates on emotion science, our empathic AI platform, and the kinds of real-world use cases and solutions that emerge from AI that can teach itself to improve our well-being.