Saving Hawking’s voice
In 2014, a Silicon Valley engineer got an unexpected call, said journalist Jason Fagone. Could he help rescue physicist Stephen Hawking’s distinctive voice before aging technology lost it forever?
Eric Dorsey, a 62-year-old engineer in Palo Alto, Calif., was watching TV in mid-March when he started getting texts that Stephen Hawking had died. He turned on the news and saw clips of the famed physicist speaking in his iconic android voice—the voice that Dorsey had spent so much time as a young man helping to create, and then, much later, to save from destruction.
Dorsey and Hawking had first met 30 years earlier, nearly to the day. In March 1988, Hawking was visiting the University of California, Berkeley, during a three-week lecture tour.
At 46, Hawking was already famous for his discoveries about quantum physics and black holes, but not as famous as he was about to be. His best-seller, A Brief History of Time, was a week away from release, and Californians were curious about this British professor from the University of Cambridge, packing the seats of his public talks, approaching him at meals.
When Hawking spoke, it was in the voice of a robot, a voice that emerged from a gray box fixed to the back of his motorized wheelchair. The voice synthesizer, a commercial product known as the CallText 5010, was a novelty then, not yet a part of his identity; he’d begun using it just three years before, after the motor-neuron disease amyotrophic lateral sclerosis stole his ability to speak. Hawking selected bits of text on a video screen by moving his cheek, and the CallText turned the text into speech. At the start of one lecture, Hawking joked about it: “The only problem,” he said, to big laughs, “is that it gives me an American accent.”
Dorsey was with Hawking for part of that trip, tagging along as a sort of authority on the voice, explaining its workings to journalists. He worked at the Mountain View, Calif., company that manufactured the CallText 5010, a hardware board with two computer chips running custom software.
An upbeat 32-year-old, Dorsey was quiet by nature, but driven. He had joined Speech Plus as an intern, attracted by its mission to help the voiceless and the disabled; now he led a team of engineers, and at least 20,000 lines of his own code were in the CallText.
At the end of his California tour, the physicist gave Dorsey a signed copy of his new book, his thumbprint pressed onto the inside cover. Hawking returned to Cambridge, Dorsey to his life in California. Twenty-six years went by before they would cross paths again.
In tech years, that is a millennium. The internet happened. Silicon Valley boomed, busted, boomed again. Apple, Amazon, Facebook, Google, Uber.
Dorsey, meanwhile, left Speech Plus, which went bankrupt and was sold to a series of other companies. He got married and had kids. He joined a Buddhist temple. He eventually left the field of speech technology altogether, becoming an engineering leader at DVR maker TiVo.
Tech, he’d learned, moves so fast. “There’s a new iPhone every year,” Dorsey says. “Everything just kind of gets buried in the dustbin of history very, very quickly.”
That’s why, when an email from Cambridge University arrived out of the blue in 2014, Dorsey was surprised. It came from Hawking’s technical assistant, Jonathan Wood, who was responsible for Hawking’s communications systems.
Wood explained something so improbable that Dorsey had trouble understanding at first: Hawking was still using the CallText 5010 speech synthesizer, a version last upgraded in 1986. In nearly 30 years, he had never switched to newer technology. Hawking liked the voice just the way it was, and had stubbornly refused other options. But now the hardware was showing wear and tear. If it failed, his distinctive voice would be lost to the ages.
The solution, Wood believed, was to replicate the decaying hardware in new software, to somehow transplant a 30-year-old voice synthesizer into a modern laptop—without changing the sound of the voice. What did Dorsey think?
Thirty years old? he thought. Oh, my God.
It wouldn’t be easy. They might have to locate the old source code. They might have to find the original chips and the manuals for those chips. They couldn’t simply buy replacements; the companies that made them no longer existed. Solving the problem might mean mounting an archaeological dig through an antiquated era of technology.
But it was for Stephen Hawking. “Let’s get it done,” Dorsey said.
Today’s synthesized voices, like Apple’s Siri, rely on recordings of natural speech. Voice actors record huge libraries of words and syllables, and software chops them up and reassembles them into sentences on the fly. But 30 years ago, computers could produce only a “stick-figure version” of a human voice, says Patti Price, a speech recognition specialist in Palo Alto.
Back then, she worked as a postdoc in the Massachusetts Institute of Technology lab of Dennis Klatt, a tall, thin, opera-loving scientist originally from Wisconsin. Klatt is the godfather of Hawking’s voice. He blasted his own throat with X-rays to measure the shape of his voice box as he articulated certain sounds and then developed a software model of speech, the Klatt Model, based on his own voice.
Speech Plus took Klatt’s model, improved on it, and commercialized it, including the CallText 5010. One of Dorsey’s contributions was to write an algorithm that controlled the intonation of the voice, the rise and fall of words and sentences. Speech Plus would sell thousands of CallText systems, though many customers complained that the voice sounded too robotic.
But Hawking liked it. True, it was robotic, but he appreciated that it was easy to understand: “noise-robust,” as Price explains. The shape of its waveform was more like a series of plateaus than the steep mountain cliffs of human voices, which fall off more sharply. The flattish slope of Hawking’s voice made it cut through noise in amphitheaters and lecture halls. “It’s very intelligible,” Dorsey says. “You can listen to it for a long time, and it’s not irritating.”
Over the years, Hawking had chances to upgrade. In 1996, a Massachusetts speech technology company called Nuance, which had acquired the remains of Speech Plus, upgraded the CallText with evolved software code that made the voice sound fuller and faster, less robotic, with shorter pauses between sentences.
They sent Hawking a sample of the new voice, thinking he’d be pleased. He was not. He said the intonation wasn’t right. He preferred the 1986 voice.
Starting around 2009, Wood and several others at Cambridge began trying to separate Hawking’s voice from the failing CallText hardware. The group would include Peter Benie, a computer guru at the university; Pawel Wozniak, a local engineering student; and Mark Green, an experienced electrical engineer with Intel.
One option they considered was tweaking a modern synthetic voice like Siri to sound more like Hawking. But Siri-type systems rely on the vast computer power of internet clouds, and Hawking couldn’t be constantly tethered to the internet. Benie also tried a completely different approach. He wrote a software emulator for the CallText—essentially a program that would fool a modern PC into thinking it was actually the old CallText. But the samples it produced didn’t sound faithful enough for Hawking’s taste.
By the time Cambridge reached out to Dorsey in 2014, they were investigating a third avenue: track down the old CallText source code, now owned by Nuance, and port it to Hawking’s laptop, transplanting the old voice into a fresh new body.
Was it possible? Dorsey had no idea. It depended on whether he could find the source code, or, failing that, information that would let him reverse-engineer the source code.
He started emailing colleagues he hadn’t seen in 30 years, asking if they had any CallText bric-a-brac still lying around: boards, chips, manuals. One guy found an actual CallText board in his garage. Others located dusty schematics.
It had the feel of a mad scramble through an earlier era of technology. But people everywhere leaped at the chance to help. “The goal is to save his voice,” Dorsey said. “Once you go to somebody—‘I need you to help save Stephen Hawking’s voice’—they immediately wake up.”
Dorsey’s archaeological quest for old code turned out to be a frustrating one. No one at Nuance was able to find the source code from the 1986 version of CallText. They did, however, find the code for the upgraded 1996 version of the voice, on a backup tape in an office in Belgium. After a few months of work, Nuance engineers got the code up and running and sent a series of audio samples to Hawking’s team, adjusting the program to try to match the 1986 voice.
It didn’t quite work. The match was close but not perfect. Hawking flagged subtle differences others had trouble discerning. “It’s like recognizing your mother’s voice,” Price said. “When you hear them over the phone, you know if that’s right or not.”
At this point, they changed tack and returned to one of their original ideas: to emulate the CallText in software, similar to how PCs can emulate old Nintendo games that aren’t sold anymore.
The CallText, of course, was a more intricate beast than a Nintendo, driven by two obsolete chips, one made by Intel and the other by NEC, that interacted in complex ways. Building the emulator demanded heroic feats of programming, intuition, and high-tech surgery. The chips had to be removed from a spare CallText board with tweezers and a screwdriver. An emulator for the Intel chip had to be written from scratch, by Benie. A separate emulator, for the NEC chip, was borrowed from an open-source Nintendo emulator. Then all these disparate pieces had to be glued together.
The breakthrough came just before Christmas 2017, when the emulator finally started producing sounds that resembled the familiar voice they had been chasing. It had some minor glitches, but the voice was an acoustical match to Hawking’s, the waveforms virtually identical.
On Jan. 17, the team felt ready to demonstrate the new voice for Hawking. Wood, Wozniak, and Benie went to Hawking’s home in Cambridge and played him samples on a Linux laptop. To the team’s relief and happiness, Hawking gave his blessing.
They still needed to port the voice to the PC, so in the meantime Wood loaded a version onto a miniature hardware board known as a Raspberry Pi. He thought Hawking might want to evaluate the voice in everyday life, and the Pi was the quickest way to get him up and running.
On Jan. 26, Wood took the Pi along to Hawking’s house and asked if he’d like to try it out. Hawking raised his eyebrows, which meant “yes.” The team put the Pi in a tiny black box, attached it to Hawking’s chair with Velcro, and plugged it into the voice box. Then they disconnected the CallText. For the first time in 33 years, Hawking was able to speak without it.
Wood watched eagerly for Hawking’s reaction. “I love it,” Hawking said.
For the next few weeks, in private conversations, Hawking continued to speak through the emulator and the Raspberry Pi, chatting happily with friends and colleagues. All that remained, the final step in the project, was to get the still-buggy PC version working smoothly. After a few more code revisions, it was finally bug-free.
And that is when Hawking got sick, in February.
According to Wood, Hawking continued using the emulator until his final days. He was able to talk with his loved ones and caregivers with the new software.
The original CallText boards have passed to Hawking’s estate, to use as his family wishes. So has the new software, the CallText emulator, which can be ported to future platforms as they are invented.
Hawking was, famously, an atheist, skeptical of the afterlife; “We have this one life to appreciate the grand design of this universe,” he once said, “and for that, I am extremely grateful.” But there is no longer any physical reason his voice can’t live forever.
Excerpted from an article that originally appeared in the San Francisco Chronicle. Reprinted with permission.