ZURICH: Fake or real? It is increasingly difficult to tell, at least consciously, whether it is a human or an AI-generated voice that is speaking.
Researchers have observed that our brains react differently to deepfake voices than to natural ones, even though we may not be consciously aware of it.
Listening to fake voices appears to give less pleasure, according to the study published in the journal Communications Biology.
Algorithms for voice synthesis are now so powerful that the characteristics of voice clones come very close to those of natural speakers.
Voices imitated with deepfake technologies are being used in fraud attempts over the telephone and to give virtual assistants the voice of a celebrity.
The team led by Claudia Roswandowitz from the University of Zurich analysed how well human identity is preserved in voice clones. The researchers recorded the voices of four German-speaking men in 2020 and then used computer algorithms to generate deepfake voices of these speakers.
Deepfake voices already nearly perfect
The study then tested how good the imitation was, i.e. how convincingly the identity was cloned. To do this, 25 test subjects were asked to decide whether two pre-recorded voices belonged to the same speaker or not.
In around two-thirds of the tests, the deepfake voices were correctly assigned to the respective speaker.
“This makes it clear that although current deepfake voices do not perfectly imitate identity, they have the potential to deceive people’s perception,” said Roswandowitz.
The researchers then used functional magnetic resonance imaging (fMRI) to investigate how individual areas of the brain react to fake and real voices.
According to the results, there were differences in two central areas: the nucleus accumbens and the auditory cortex. The researchers believe it is likely that both areas play an important role in whether a person recognises a deepfake voice as fake or not.
“The nucleus accumbens is an important part of the reward system in the brain,” Roswandowitz says. It was less active when a deepfake and a natural voice were compared than when two real voices were compared.
In other words, listening to a fake voice activates less of our brain’s reward system.
The brain tries to compensate for deepfake flaws
According to the study, there was also a difference in activity in the auditory cortex, which is responsible for analysing sounds.
This area was more involved when it came to recognising the identity of deepfake voices. “We suspect that this area reacts to the imperfect acoustic imitation of deepfake voices and tries to compensate for the missing acoustic signal,” said Roswandowitz.
This compensation by the cortex probably takes place largely unnoticed. “Something then signals to the conscious mind that something is different and more difficult, but this often remains below the threshold of perception.”
The rapid development of AI technologies has led to a massive increase in the creation and dissemination of deepfakes, the researchers note.
So would today’s deepfakes, created four years later, completely trick listeners? Or would the results be similar?
“That’s a very exciting question,” says Roswandowitz. Newer AI-generated voices would probably have slightly better sound quality, she says.
Roswandowitz assumes that the differences in auditory cortex activity would be smaller than when the study was conducted, because this region reacts to differences in sound quality. In the nucleus accumbens, on the other hand, she expects possibly similar results. “It would be very interesting to investigate this.” – dpa