From Particle Physics to Audio AI with Scott Stephenson - #19

Episode Highlights
Neural Indexing
Deepgram's approach to audio processing indexes activations from deep within a neural network rather than relying on text. Scott Stephenson explains that this makes searches more accurate because the index captures phoneme-like structures defined by the data rather than human-assigned phonemes. Because those structures are learned, the network adapts its recognition of speech patterns to the data it sees, which further improves accuracy.
You're indexing, you're building an index out of activations deep in a neural network.
---
The Kur framework supports this process with declarative neural networks, letting users define models in a flexible, accessible way.
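The idea of building an index out of activations can be illustrated with a small sketch. It is not Deepgram's implementation: the `ActivationIndex` class, the three-dimensional toy vectors, and the choice of cosine similarity are all assumptions made for illustration, since the episode does not say which layer, dimensionality, or distance measure is used.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ActivationIndex:
    """Hypothetical index: stores one activation vector per audio position."""

    def __init__(self):
        self.entries = []  # list of (clip_id, time_sec, vector)

    def add(self, clip_id, time_sec, vector):
        self.entries.append((clip_id, time_sec, list(vector)))

    def search(self, query, top_k=3):
        """Return the top_k (score, clip_id, time_sec) tuples, best first."""
        scored = [(cosine(query, vec), clip_id, time_sec)
                  for clip_id, time_sec, vec in self.entries]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]

# Toy vectors standing in for activations taken from a network layer.
index = ActivationIndex()
index.add("ep19", 12.0, [0.9, 0.1, 0.0])
index.add("ep19", 47.5, [0.1, 0.8, 0.2])
index.add("ep26", 3.2, [0.85, 0.2, 0.05])

# The query would be the activation vector of the spoken search term.
hits = index.search([1.0, 0.1, 0.0], top_k=2)
for score, clip_id, time_sec in hits:
    print(clip_id, time_sec, round(score, 3))
```

With these toy values the query ranks the 12.0 second position in "ep19" first and the 3.2 second position in "ep26" second. The point of the sketch is that matching happens between learned vectors, with no transcript involved.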
Audio Search
Deepgram's audio search identifies relevant audio segments with high accuracy. Scott Stephenson shares that the technology finds the desired audio content 80 to 90% of the time, up from roughly 20% with earlier approaches. This capability is particularly useful for applications like podcast indexing, where users can quickly locate specific topics or mentions within vast audio libraries.
We went from very poor accuracy, meaning, like, maybe 20% of the time you'll find what you're looking for to 80 or 90% of the time finding what you're looking for.
---
By treating audio spectrograms as images, Deepgram's system can efficiently process and search large audio datasets, so users can find specific information quickly.
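To make "spectrograms as images" concrete, here is a minimal sketch that turns a waveform into a time-by-frequency grid of magnitudes, which is the image-like array a network can consume. The frame size, hop length, Hann window, and the synthetic 1000 Hz test tone are assumptions chosen for illustration; the episode does not describe Deepgram's actual preprocessing.

```python
import cmath
import math

def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames, each a list of magnitudes (time x frequency)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window to reduce spectral leakage at the frame edges.
        windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
                    for n, s in enumerate(frame)]
        # Naive DFT, keeping only the non-negative frequency bins.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(windowed))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames

# Synthetic input: a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * n / sample_rate) for n in range(512)]

image = spectrogram(samples)
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
print(len(image), "frames x", len(image[0]), "bins; peak bin", peak_bin)
```

Each bin spans 8000 / 64 = 125 Hz, so the tone shows up as a bright row at bin 8 in every frame. A production system would use an FFT library rather than this naive transform.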
Deep Speech
Deepgram's deep speech models, inspired by Baidu's Deep Speech, are applied across various tasks, including fraud detection and quality assurance. Scott Stephenson notes that these models use convolutional and recurrent layers to process audio data, similar to the architecture of Deep Speech networks. This approach allows businesses to analyze vast amounts of audio data for patterns, such as identifying fraudulent calls in financial services.
The models that we use to build our indexes and to ingest audio are extremely similar to the deep speech networks.
---
Additionally, the open-source Kur framework supports these applications by providing a flexible platform for developing and deploying neural networks.
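As a rough picture of how convolutional and recurrent layers combine on audio features, the sketch below runs a toy forward pass: a convolution along the time axis followed by a simple recurrent layer. Every name, weight, and size here is hypothetical, and the code uses neither Kur nor any Deepgram or Deep Speech code; real networks of this kind are far larger and are trained rather than hand-set.

```python
import math

def relu(x):
    return max(0.0, x)

def temporal_conv(frames, kernel):
    """Valid 1-D convolution along time, applied to each feature, then ReLU."""
    width = len(kernel)
    out = []
    for t in range(len(frames) - width + 1):
        out.append([relu(sum(kernel[j] * frames[t + j][f] for j in range(width)))
                    for f in range(len(frames[0]))])
    return out

def recurrent(sequence, w_in, w_rec):
    """Simple (Elman-style) recurrent layer; returns the hidden state per step."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in sequence:
        hidden = [math.tanh(sum(w * xi for w, xi in zip(w_in[i], x))
                            + sum(w * h for w, h in zip(w_rec[i], hidden)))
                  for i in range(len(w_rec))]
        states.append(hidden)
    return states

# Toy input: 5 time steps with 3 features each (for example, spectrogram bins).
frames = [[0.0, 1.0, 0.5],
          [1.0, 0.0, 0.5],
          [0.5, 0.5, 0.5],
          [0.2, 0.9, 0.1],
          [0.7, 0.3, 0.4]]
kernel = [0.5, 0.5]                             # hypothetical conv weights
w_in = [[0.3, -0.2, 0.1], [0.1, 0.4, -0.3]]     # hypothetical input weights
w_rec = [[0.5, 0.0], [0.0, 0.5]]                # hypothetical recurrent weights

conv_out = temporal_conv(frames, kernel)
states = recurrent(conv_out, w_in, w_rec)
print(len(conv_out), "conv steps ->", len(states), "hidden states")
```

The convolution summarizes short local windows of the signal, while the recurrent layer carries context from one time step to the next. The per-step hidden states are the kind of internal activations that an index or a classifier could be built on.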
Related Episodes

Building AI Voice Agents with Scott Stephenson - #707
The Physics of Data with Alpha Lee - #377
Evolving AI Systems Gracefully with Stefano Soatto - #502
AI for Materials Discovery with Greg Mulholland - #148
Agile Data Science with Sarah Aerni - #143
NLP for Mapping Physics Research with Matteo Chinazzi - #353
Deep Learning for Live-Cell Imaging with David Van Valen - #141
Deep Learning with Structured Data w/ Mark Ryan - #301
Global AI Trends with Ben Lorica - #26
Machine Learning for the Stars & Productizing AI with Joshua Bloom - #5
Philosophy of Intelligence with Matthew Crosby - #91
Building AI Products with Hilary Mason - #11