- In two new studies, people formerly unable to speak have been able to use AI to regain their voice digitally.
- The people in the studies had lost the ability to speak due to stroke or ALS.
- Brain-computer interfaces read brain activity related to speech and feed the data into a language model.
Brain implants, powered by artificial intelligence, are improving rapidly and giving those who have lost their ability to speak a voice again.
In a pair of studies published this week in the
The BCIs read brain activity related to speech and feed the data into a language model, which then produces usable speech as either on-screen text or a computer-generated voice.
But her brain is still working: it is still sending signals down those pathways, trying to wake up her mouth and tongue and produce speech. There is a disconnect somewhere down the line, however. Stanford researchers have now essentially cut out the middleman by implanting electrode arrays the size of popcorn kernels on the speech motor cortex of the brain. This device, a BCI, then interfaces with computer software that allows her to speak.
Erin Kunz, a PhD student at Stanford University’s Wu Tsai Neurosciences Institute, and co-author of the research paper, was there when Pat spoke for the first time.
“She was thrilled,” Kunz told Healthline. “We’ve done almost, I think we’ve done 30-plus days of running this with her and even after day thirty, it’s still just as exciting seeing it in real time.”
Their work has come a long way. The BCI they use today, along with artificial intelligence that learns from language patterns, allows Bennett to speak quickly and accurately, relatively speaking. The team says it has achieved a 9.1% word error rate on a smaller 50-word vocabulary (2.7 times more accurate than previous state-of-the-art BCIs) and a 23.8% word error rate on a 125,000-word vocabulary. The algorithm they use to turn brain signals into speech output can decode 62 words per minute, more than three times as fast as previous models, moving closer to the roughly 160 words per minute of natural conversation.
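For readers curious what a word error rate measures, here is a minimal sketch of the standard definition: the number of word substitutions, insertions, and deletions needed to turn the decoded sentence into the intended one, divided by the number of words in the intended sentence. The function name and the example sentences are illustrative assumptions, not taken from either study, and the researchers' own scoring pipeline may differ in its details.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper only; not the scoring code used in the studies.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word and one missing word out of six.
intended = "i would like some water please"
decoded = "i would like sum water"
print(f"{word_error_rate(intended, decoded):.1%}")
```

By this measure, a 23.8% word error rate means that roughly one word in four needs correcting, while 9.1% is closer to one word in eleven.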
While it is still early, the research demonstrates a proof-of-concept and also a significant improvement over previous iterations of the technology. Kunz hopes their work will eventually give people like Pat more autonomy and improve their quality of life, their friendships, and maybe even allow them to work again.
Researchers at UCSF are working with Ann, who, at the age of 30, suffered a
Today Ann has regained some function: she can laugh and cry. She can move her head. But the team at UCSF has a much more ambitious goal: give her the ability to speak again, but with her own voice.
David Moses, PhD, an adjunct professor in the Department of Neurological Surgery at UCSF who worked with Ann, told Healthline, “It was really moving to see the culmination of all the efforts, our efforts of her efforts, and to see the system being able to recognize more tricky sentences. We were all very excited.”
Moses was previously part of an effort that successfully translated the brain signals of Pancho, a man who had become paralyzed due to a brainstem stroke, into text, demonstrating that brain signals could be decoded into words. Their work was published in 2021.
Moses says the technology has come a long way since then, particularly the array that sits on top of the brain and reads its activity. After working with Pancho, the team upgraded its array from 128 channels to 253 channels, which Moses likens to upgrading a video to high definition.
“You just get a cleaner vision of what’s going on in there,” he told Healthline. “We quickly saw results that were really kind of blowing us away.”
Using AI algorithms to recognize brain activity and speech patterns, the team managed to produce 78 words per minute with a median word-error rate of 25.5% using on-screen text. Using a smaller vocabulary set, Ann was able to “speak” 50 “high utility” sentences composed of 119 unique words quickly and with an error rate of 28%.
But UCSF has also developed a supplemental mode of communication: a digital avatar that produces facial expressions and speech gestures that might not otherwise be possible on Ann’s own face. The voice, too, is personalized: it was trained on video from her wedding to sound the way Ann did before her injury.
The avatar could one day assist in communication and expression both in the real and virtual world, according to Moses.
“It may seem silly or somewhat trivial for you to be in a virtual environment, but for people who are paralyzed, it might not be trivial. It would be potentially pretty expanding for people who are locked in and can’t freely move and freely talk,” he told Healthline.
Ann, who hopes to one day be able to counsel others who have dealt with catastrophic injuries, likes the idea of using an avatar to communicate.
Moses admits that the technology can feel a bit “sci-fi,” but his team has only one goal in mind: helping patients.
“We’re laser-focused on that first step,” he told Healthline.
Speech devices are not a new technology. Perhaps the most famous example is the one used by Stephen Hawking, the renowned astrophysicist diagnosed with ALS. In fact, Hawking himself became known for his voice, with its robotic tone becoming a part of his identity. But while Hawking’s device and these new technologies may appear similar on the surface, like an iceberg, they are separated by a deep level of technological sophistication that is not immediately visible.
Depending on the level of paralysis, those with ALS or other forms of neurological damage may still be able to use their hands and fingers for communication, such as texting on a cell phone. However, those with near-complete or complete paralysis may have to rely on a muscle-triggered communication device.
People with full paralysis or locked-in syndrome might have to rely on “eye-gaze devices,” a technology that uses a computer to track eye movements to select letters or words on a screen, which can then be read or spoken aloud by a device. While the technology is effective, it has drawbacks that make it difficult to use. Although the movement required is minimal, these devices do need the user to move their eyes with some accuracy, meaning that in severe cases they might not work. The larger issue, however, is time. Communicating with an eye-gaze device is slow: it is functional, but far from conversational.
That is one of the factors that separates these new technologies: their speed. The latest research from Stanford and UCSF demonstrates that using a BCI, conversation can happen now in seconds, rather than minutes.
Though these technologies are still far from approval, the proof of concept has instilled hope in many that BCIs could someday help restore speech to those with severe paralysis.
Kuldip Dave, PhD, Senior Vice President of Research at the ALS Association, who wasn’t affiliated with the research at Stanford or UCSF, told Healthline,
“Technologies like brain-computer interface can allow a person to communicate, access a computer or control a device using their brainwaves and have the potential to improve quality of life. These recent studies are an important step in developing and validating this emerging technology to create faster, more reliable BCI systems. The ALS Association is committed to supporting the continued development of novel assistive technologies like BCI through our Assistive Technology Grants.”
Brain-computer interface technology, assisted by AI language models, allows paralyzed individuals to speak by reading brain activity and decoding it into speech.
Research teams at Stanford and UCSF both saw significant improvements in vocabulary size, speed of language decoding, and accuracy of speech in their latest research.
The proof-of-concept technology, although promising, is still far from FDA approval.