A simple .mp3 file just heralded a huge step forward for the world of neurological speech synthesis, AKA turning brain waves into words.
In a research article published in Nature, Gopala K. Anumanchipalli, Josh Chartier and Edward F. Chang demonstrate that, under certain conditions, brain wave readings can now be turned into discernible sentences.
In an .mp3 file released on the Nature website, you can hear two iterations of the same sentence—one as it was recorded by the person saying it, and one as it was reconstructed by an AI studying the brain waves produced by the person.
In a commentary released alongside the study, Chethan Pandarinath of Emory University writes that the work of Anumanchipalli and his team will “bring us closer to a brain–computer interface (BCI) that can restore speech function”.
The research is designed around the problem of restoring speech function to patients whose neurological disorders have rendered them unable to speak. The most famous such patient is Stephen Hawking, who suffered from Amyotrophic Lateral Sclerosis (ALS). For years, Hawking’s voice was synthesized using technology that tracked minor movements in his cheek and converted them into individual letters—a pace of about 10 words a minute. Natural speech, by comparison, can reach up to 150 words per minute.
Notably, the study utilized a deep-learning AI to learn how the brain converts signals into speech. The researchers studied five patients who had electrodes implanted onto their brains as part of epilepsy treatments, recording their brain waves as well as the movements of their tongue, lips, jaw and larynx as they read sentences aloud. This gave the AI a dataset from which to learn the relationship between brain waves and speech. The researchers then repeated the experiment with the patients “miming” their speech without producing sound.
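The approach described above is a two-stage decoder: first map neural activity to estimated movements of the speech articulators, then map those movements to acoustic features for synthesis. The sketch below is purely illustrative (the study used recurrent neural networks, not the toy linear maps shown here; all weights and dimensions are invented for demonstration), but it shows the structure of such a pipeline.

```python
# Illustrative sketch of a two-stage decoding pipeline:
# neural features -> articulator movements -> acoustic features.
# Tiny linear "decoders" stand in for the study's recurrent networks.

def decode_articulation(neural_features, weights):
    """Stage 1: map neural activity to vocal-tract movement estimates."""
    return [sum(w * x for w, x in zip(row, neural_features)) for row in weights]

def decode_acoustics(articulation, weights):
    """Stage 2: map movement estimates to acoustic features for synthesis."""
    return [sum(w * a for w, a in zip(row, articulation)) for row in weights]

# Toy example: 3 neural channels -> 2 articulatory dims -> 2 acoustic dims.
neural = [0.5, -1.0, 2.0]
stage1_weights = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
stage2_weights = [[2.0, 0.0], [0.0, 2.0]]

articulation = decode_articulation(neural, stage1_weights)
acoustics = decode_acoustics(articulation, stage2_weights)
print(acoustics)  # -> [3.0, -4.0]
```

Splitting the problem into these two stages is the key design choice: the intermediate articulatory representation is closer to how the brain actually controls speech, which is part of why the decoded audio is intelligible.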
In both cases, the AI was able to reproduce human speech from the combination of brain activity and vocal tract movements, producing comprehensible sentences 70 percent of the time. However, there is still a way to go before it can translate speech produced solely by thinking into discernible sentences.
The University of California, Los Angeles was the first institution to research brain–computer interfaces, as far back as 1973. In a pioneering study titled “Toward direct brain computer communication”, Jacques J. Vidal wrote about the potential of monitoring “evoked responses” to environmental stimuli. His work is widely considered the first instance of a BCI.
You can see a visualization of how the researchers studied vocal tract movements and brain waves here.