AI and the 'cocktail party problem'

The human ear can naturally filter out background noise. Now technology has been developed to do the same

People who struggle to follow a conversation in noisy situations could soon be helped by artificial intelligence, after a technological breakthrough that is claimed to have solved the "cocktail party problem".

The term describes how people can filter out background noise, such as the chatter of a party, to focus on one particular sound or speaker. Scientists have long puzzled over how the human brain is able to do this, leading TechCrunch to call it "one of the greatest barriers to voice technologies reaching a level of understanding comparable to humans".

Voice technologies, added the website, are a growing market expected to reach $26.8 billion (£20.4 billion) by next year. However, they are not being designed to confront the "messiness" or "cacophony" of real life, in particular the background and ambient noise that "muddies" the signals they receive. The only way to combat this, said TechCrunch, is to find a way to make voice tech as good as the human auditory system.

AI's day in court

As well as causing difficulties in social situations, the cocktail party problem has legal implications, said the BBC. Technology's inability to filter out background noise can undermine audio evidence in legal cases if listeners cannot be completely certain who is talking and what is being said.

Electrical engineer Keith McElveen, founder and chief technology officer of US company Wave Sciences, told the broadcaster it was "one of the classic hard problems in acoustics".

McElveen originally became interested in the problem when working for the US government investigating a possible war crime. "Some of the evidence included recordings with a bunch of voices all talking at once – and that's when I learned what the 'cocktail party problem' was," he said.

The issue was that sounds bounced around the room and made isolating a particular noise "mathematically horrible to solve". He hit upon the idea of using AI to "pinpoint and screen out" background voices and ambient noises based on where they originated in the room.

It took researchers at Wave Sciences 10 years of testing to "finally" create an AI system that could analyse how sound bounces around the room before it reaches an ear or a mic. The result is similar to a camera focussing on a subject and blurring out the rest of the image.
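
For readers curious what filtering sound "based on where it originated" can look like in practice, the following is a minimal Python sketch of delay-and-sum beamforming, a long-established textbook technique. It is not Wave Sciences' system, whose workings are not described here, and it ignores the room echoes that make the real problem so hard. All names and numbers (the four microphones, the delays, the test signals) are assumptions made up for illustration. The idea is that a voice reaches each microphone at a slightly different time; shifting the recordings so the wanted voice lines up and then averaging them reinforces that voice, while sound arriving from elsewhere stays misaligned and partly averages away.

```python
import math
import random

def simulate_mics(target, interferer, target_delays, interferer_delays):
    """Return one signal per microphone: delayed target plus delayed interferer.

    Delays are whole numbers of samples (an illustrative simplification).
    """
    n = len(target)
    mics = []
    for td, nd in zip(target_delays, interferer_delays):
        mic = []
        for t in range(n):
            s = target[t - td] if t - td >= 0 else 0.0
            v = interferer[t - nd] if t - nd >= 0 else 0.0
            mic.append(s + v)
        mics.append(mic)
    return mics

def delay_and_sum(mics, steer_delays):
    """Undo each microphone's delay for the wanted source, then average."""
    n = len(mics[0])
    out = []
    for t in range(n):
        total = 0.0
        for mic, d in zip(mics, steer_delays):
            if t + d < n:  # samples past the end of the recording are skipped
                total += mic[t + d]
        out.append(total / len(mics))
    return out

def energy(x):
    """Sum of squared samples, used here as a crude loudness measure."""
    return sum(v * v for v in x)

# Hypothetical scene: a steady tone as the "speaker", random noise as the "chatter".
rng = random.Random(0)
n = 2000
target = [math.sin(2 * math.pi * t / 40) for t in range(n)]
interferer = [rng.gauss(0, 1) for _ in range(n)]

target_delays = [0, 1, 2, 3]      # assumed arrival delays of the speaker
interferer_delays = [0, 3, 6, 9]  # the chatter arrives from another direction

mics = simulate_mics(target, interferer, target_delays, interferer_delays)
focused = delay_and_sum(mics, target_delays)

# How much unwanted sound remains, at one microphone versus after focusing.
err_single = energy([m - s for m, s in zip(mics[0], target)])
err_focused = energy([f - s for f, s in zip(focused, target)])
print("residual noise ratio:", err_focused / err_single)
```

In this toy setup the printed ratio comes out below 1, meaning the focused output carries less of the unwanted sound than any single microphone does. Real rooms are far messier, which is why a simple approach like this falls short and the problem took so long to crack.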

The technology was put to the test in a US court case, turning an audio recording into a "pivotal piece of evidence", and is now being used by the military. Future uses could include smart speakers and hearing aid devices, added the BBC.

Elizabeth Carr-Ellis is a freelance journalist and was previously the UK website's Production Editor. She has also held senior roles at The Scotsman, Sunday Herald and Hello!. As well as her writing, she is the creator and co-founder of the Pausitivity #KnowYourMenopause campaign and has appeared on national and international media discussing women's healthcare.