Are you 80% angry and 2% sad? Why ‘emotional AI’ is fraught with problems

Illustration: Observer Design/Getty/Plume Creative

It’s Wednesday evening and I’m at my kitchen table, scowling into my laptop as I pour all the bile I can muster into three little words: “I love you.”

My neighbours might assume I’m engaged in a melodramatic call to an ex-partner, or perhaps some kind of acting exercise, but I’m actually testing the limits of a new demo from Hume, a Manhattan-based startup that claims to have developed “the world’s first voice AI with emotional intelligence”.

“We train a large language model that also understands your tone of voice,” says Hume’s CEO and chief scientist Alan Cowen. “What that enables… is to be able to predict how a given speech utterance or sentence will evoke patterns of emotion.”

In other words, Hume claims to recognise the emotion in our voices (and in another, non-public version, facial expressions) and respond empathically.

Boosted by OpenAI’s launch of the new, more “emotive” GPT-4o this May, so-called emotional AI is increasingly big business. Hume raised $50m in its second round of funding in March, and the industry’s value has been predicted to reach more than $50bn this year. But Prof Andrew McStay, director of the Emotional AI Lab at Bangor University, suggests such forecasts are meaningless. “Emotion is such a fundamental dimension of human life that if you could understand, gauge and react to emotion in natural ways, that has implications that will far exceed $50bn,” he says.

Possible applications range from better video games and less frustrating helplines to Orwell-worthy surveillance and mass emotional manipulation. But is it really possible for AI to accurately read our emotions, and if some form of this technology is on the way regardless, how should we handle it?

“I appreciate your kind words, I’m here to support you,” Hume’s Empathic Voice Interface (EVI) replies in a friendly, almost-human voice while my declaration of love appears transcribed and analysed on the screen: 1 (out of 1) for “love”, 0.642 for “adoration”, and 0.601 for “romance”.

While the failure to detect any negative feeling could be down to bad acting on my part, I get the impression more weight is being given to my words than my tone, and when I take this to Cowen, he tells me it’s hard for the model to understand situations it hasn’t encountered before. “It understands your tone of voice,” he says. “But I don’t think it’s ever heard somebody say ‘I love you’ in that tone.”

Perhaps not, but should a truly empathic AI recognise that people rarely wear their hearts on their sleeves? As Robert De Niro, a master at depicting human emotion, once observed: “People don’t try to show their feelings, they try to hide them.”

Cowen says Hume’s goal is only to understand people’s overt expressions, and in fairness the EVI is remarkably responsive and naturalistic when approached sincerely – but what will an AI do with our less straightforward behaviour?

***

Earlier this year, associate professor Matt Coler and his team at the University of Groningen’s speech technology lab used data from American sitcoms including Friends and The Big Bang Theory to train an AI that can recognise sarcasm.

That sounds useful, you might think, and Coler argues it is. “When we look at how machines are permeating more and more human life,” he says, “it becomes incumbent upon us to make sure those machines can actually help people in a useful way.”

Coler and his colleagues hope their work with sarcasm will lead to progress with other linguistic devices including irony, exaggeration and politeness, enabling more natural and accessible human-machine interactions, and they’re off to an impressive start. The model accurately detects sarcasm 75% of the time, but the remaining 25% raises questions, such as: how much licence should we give machines to interpret our intentions and feelings; and what degree of accuracy would that licence require?

Emotional AI’s essential problem is that we can’t definitively say what emotions are. “Put a room of psychologists together and you will have fundamental disagreements,” says McStay. “There is no baseline, agreed definition of what emotion is.”

Nor is there agreement on how emotions are expressed. Lisa Feldman Barrett is a professor of psychology at Northeastern University in Boston, Massachusetts, and in 2019 she and four other scientists came together with a simple question: can we accurately infer emotions from facial movements alone? “We read and summarised more than 1,000 papers,” Barrett says. “And we did something that nobody else to date had done: we came to a consensus over what the data says.”

The consensus? We can’t.

“This is very relevant for emotional AI,” Barrett says. “Because most companies I’m aware of are still promising that you can look at a face and detect whether someone is angry or sad or afraid or what have you. And that’s clearly not the case.”

“An emotionally intelligent human does not usually claim they can accurately put a label on everything everyone says and tell you this person is currently feeling 80% angry, 18% fearful, and 2% sad,” says Edward B Kang, an assistant professor at New York University writing about the intersection of AI and sound. “In fact, that sounds to me like the opposite of what an emotionally intelligent person would say.”

Adding to this is the notorious problem of AI bias. “Your algorithms are only as good as the training material,” Barrett says. “And if your training material is biased in some way, then you are enshrining that bias in code.”

Research has shown that some emotional AIs disproportionately attribute negative emotions to the faces of black people, which would have clear and worrying implications if deployed in areas such as recruitment, performance evaluations, medical diagnostics or policing. “We must bring [AI bias] to the forefront of the conversation and design of new technologies,” says Randi Williams, programme manager at the Algorithmic Justice League (AJL), an organisation that works to raise awareness of bias in AI.

So, there are concerns about emotional AI not working as it should, but what if it works too well?

“When we have AI systems tapping into the most human part of ourselves, there is a high risk of individuals being manipulated for commercial or political gain,” Williams says, and four years after a whistleblower’s documents revealed the “industrial scale” on which Cambridge Analytica used Facebook data and psychological profiling to manipulate voters, emotional AI seems ripe for abuse.

As is becoming customary in the AI industry, Hume has made appointments to a safety board – the Hume Initiative – which counts its CEO among its members. Describing itself as a “nonprofit effort charting an ethical path for empathic AI”, the initiative’s ethical guidelines include an extensive list of “conditionally supported use cases” in fields such as arts and culture, communication, education and health, and a much smaller list of “unsupported use cases” that cites broad categories such as manipulation and deception, with a few examples including psychological warfare, deep fakes, and “optimising for user engagement”.

“We only allow developers to deploy their applications if they’re listed as supported use cases,” Cowen says via email. “Of course, the Hume Initiative welcomes feedback and is open to reviewing new use cases as they emerge.”

As with all AI, designing safeguarding strategies that can keep up with the speed of development is a challenge.

Approved in May 2024, the European Union AI Act forbids using AI to manipulate human behaviour and bans emotion recognition technology from spaces including the workplace and schools, but it makes a distinction between identifying expressions of emotion (which would be allowed), and inferring an individual’s emotional state from them (which wouldn’t). Under the law, a call centre manager using emotional AI for monitoring could arguably discipline an employee if the AI says they sound grumpy on calls, just so long as there’s no inference that they are, in fact, grumpy. “Anyone frankly could still use that kind of technology without making an explicit inference as to a person’s inner emotions and make decisions that could impact them,” McStay says.

The UK doesn’t have specific legislation, but McStay’s work with the Emotional AI Lab helped inform the policy position of the Information Commissioner’s Office, which in 2022 warned companies to avoid “emotional analysis” or incur fines, citing the field’s “pseudoscientific” nature.

In part, suggestions of pseudoscience come from the problem of trying to derive emotional truths from large datasets. “You can run a study where you find an average,” explains Lisa Feldman Barrett. “But if you went to any individual person in any individual study, they wouldn’t have that average.”

Still, making predictions from statistical abstractions doesn’t mean an AI can’t be right, and certain uses of emotional AI could conceivably sidestep some of these issues.

***

A week after putting Hume’s EVI through its paces, I have a decidedly more sincere conversation with Lennart Högman, assistant professor in psychology at Stockholm University. Högman tells me about the pleasures of raising his two sons, then I describe a particularly good day from my childhood, and once we’ve shared these happy memories he feeds the video from our Zoom call into software his team has developed to analyse people’s emotions in tandem. “We’re looking into the interaction,” he says. “So it’s not one person showing something, it’s two people interacting in a specific context, like psychotherapy.”

Högman suggests the software, which partly relies on analysing facial expressions, could be used to track a patient’s emotions over time, and would provide a helpful tool to therapists whose services are increasingly delivered online by helping to determine the progress of treatment, identify persistent reactions to certain topics, and monitor alignment between patient and therapist. “Alliance has been shown to be perhaps the most important factor in psychotherapy,” Högman says.

While the software analyses our conversation frame by frame, Högman stresses that it’s still in development, but the results are intriguing. Scrolling through the video and accompanying graphs, we see moments where our emotions are apparently aligned, where we’re mirroring each other’s body language, and even when one of us appears to be more dominant in the conversation.

Insights like these could conceivably grease the wheels of business, diplomacy and even creative thinking. Högman’s team is conducting yet-to-be-published research that suggests a correlation between emotional synchronisation and successful collaboration on creative tasks. But there’s inevitably room for misuse. “When both parties in a negotiation have access to AI analysis tools, the dynamics undoubtedly shift,” Högman explains. “The advantages of AI might be negated as each side becomes more sophisticated in their strategies.”

As with any new technology, the impact of emotional AI will ultimately come down to the intentions of those who control it. As Randi Williams of the AJL explains: “To embrace these systems successfully as a society, we must understand how users’ interests are misaligned with the institutions creating the technology.”

Until we’ve done that and acted on it, emotional AI is likely to raise mixed feelings.
