Published Oct 20, 2023

724: Decoding Speech from Raw Brain Activity — with Dr. David Moses

Explore the cutting-edge BRAVO system with Dr. David Moses, which revolutionizes brain-computer interfaces to restore speech in patients who cannot speak, using advanced machine learning to decode brain activity and provide new communication avenues for the severely speech-impaired.
Episode Highlights
Super Data Science: ML & AI Podcast with Jon Krohn logo

Popular Clips

Episode Highlights

  • ML Models

    The BRAVO project employs sophisticated machine learning models to decode speech from brain activity. explains that the process begins with a silent speech model, where participants attempt to silently articulate sentences, providing training data that includes brain activity without acoustics 1. This data is then processed using language model techniques to convert brain signals into sentences. Moses highlights the use of lexical constraints and language models to generate and rescore potential sentences, leveraging existing natural language processing technologies 2.

    One of the beauties of our approach is that it's the same training data that we can use to train all three models.

    ---

    These models are trained on servers and run in real-time alongside participants, showcasing the integration of advanced machine learning with neuroscience.

       

    Speech Synthesis

    Speech synthesis in the BRAVO project involves translating brain activity into spoken words through a series of complex processes. describes how brain signals are mapped to discrete units, akin to phonemes, which are then used to generate speech waveforms 3. This involves using models like Hubert units to process acoustics and reconstruct speech sounds, bypassing traditional language models due to the acoustic nature of the task 4.

    It's brain activity to these special units that are kind of a compressed representation of speech sound.

    ---

    The process culminates in synthesizing speech in a personalized voice, demonstrating a significant advancement in speech neuroprosthetics.

       

    Real-Time

    The real-time implementation of the BRAVO system involves multiple machine learning models operating simultaneously to decode speech. outlines how separate models are used for text, speech sounds, and avatar animation, each trained on similar structures but targeting different outputs 5. This approach allows for the concurrent generation of text, speech, and visual outputs, enhancing the communication capabilities of paralyzed patients.

    We did train for this three separate machine learning models.

    ---

    The system achieves impressive results, such as predicting 75 words per minute with 75% accuracy, showcasing the potential of integrating neural networks with real-time applications 6.

Related Episodes