Published Apr 14, 2017

From Particle Physics to Audio AI with Scott Stephenson - #19

Scott Stephenson, co-founder and CEO of Deepgram, traces his path from particle physics to audio AI, describes Deepgram's work on audio indexing and neural network search, and introduces Kur, an open-source, community-driven framework that simplifies deep learning model development.

Episode Highlights

Neural Indexing

Deepgram's approach to audio processing indexes activations deep within a neural network rather than relying on text. Scott Stephenson explains that this allows more accurate searches because the index is built on phoneme-like structures defined by the data rather than on human-assigned phonemes. Because those units are learned, the network adjusts them to the speech patterns it actually sees, which improves recognition accuracy.

You're indexing, you're building an index out of activations deep in a neural network.

---

Declarative model definitions, written with the Kur framework, support this process by letting users define models in a flexible and accessible way.
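The idea of indexing activations, as described in this section, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ActivationIndex` class, the hand-written three-dimensional vectors standing in for network activations, and the use of cosine similarity for ranking. The episode does not describe Deepgram's actual index structure or similarity measure.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ActivationIndex:
    """Hypothetical toy index: (audio id, time offset) -> activation vector."""

    def __init__(self):
        self.entries = []  # list of (audio_id, offset_seconds, vector)

    def add(self, audio_id, offset_seconds, vector):
        self.entries.append((audio_id, offset_seconds, list(vector)))

    def search(self, query_vector, top_k=3):
        # Score every stored vector against the query and keep the best matches.
        scored = [
            (cosine(query_vector, vector), audio_id, offset)
            for audio_id, offset, vector in self.entries
        ]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


# Made-up vectors standing in for activations taken from a hidden layer.
index = ActivationIndex()
index.add("episode_a", 0.0, [0.9, 0.1, 0.0])
index.add("episode_a", 1.0, [0.1, 0.8, 0.1])
index.add("episode_b", 4.5, [0.0, 0.2, 0.9])

# The query would be the activation vector produced for the search term.
results = index.search([1.0, 0.0, 0.0], top_k=2)
best_score, best_id, best_offset = results[0]
```

In this toy setup the search returns time offsets in the audio rather than matching words in a transcript, which is the distinction the section draws. A linear scan is used only for brevity; a real index over large audio libraries would need an approximate nearest-neighbor structure.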

Audio Search

Deepgram's audio search reaches high accuracy in identifying relevant audio segments. Stephenson shares that their technology finds the desired audio content 80 to 90% of the time, compared with perhaps 20% before, a significant improvement over traditional methods. This capability is particularly useful for applications like podcast indexing, where users can quickly locate specific topics or mentions within large audio libraries.

We went from very poor accuracy, meaning, like, maybe 20% of the time you'll find what you're looking for to 80 or 90% of the time finding what you're looking for.

---

By treating audio spectrograms as images, Deepgram's system can efficiently process and search large audio datasets, which helps users find specific information quickly.
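To make "spectrograms as images" concrete, here is a minimal sketch that turns a one-dimensional audio signal into a two-dimensional grid of magnitudes, with time along one axis and frequency along the other. The `spectrogram` function, the frame size, the hop length, and the synthetic 1000 Hz test tone are all illustrative assumptions, and a naive discrete Fourier transform is used instead of an FFT to keep the example dependency-free. It is not a description of Deepgram's preprocessing.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a 2D list: rows are time frames, columns are frequency-bin magnitudes."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window reduces spectral leakage at the frame edges.
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        row = []
        # Naive DFT over the non-negative frequency bins only.
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            row.append(abs(acc))
        frames.append(row)
    return frames


# Synthetic input: a pure 1000 Hz tone sampled at 8000 Hz.
sample_rate = 8000
tone_hz = 1000
samples = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

image = spectrogram(samples)
height, width = len(image), len(image[0])

# Each bin spans sample_rate / frame_size = 125 Hz, so the tone peaks in bin 8.
peak_bin = max(range(width), key=lambda k: image[0][k])
```

The result is a grid of 15 time frames by 33 frequency bins. Once audio is in this form, layers designed for images, such as the convolutional layers mentioned in the Deep Speech discussion, can be applied to it directly.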

Deep Speech

Deepgram's deep speech models, inspired by Baidu's Deep Speech, are applied across various tasks, including fraud detection and quality assurance. Stephenson notes that these models use convolutional and recurrent layers to process audio data, similar to the architecture of the Deep Speech networks. This approach allows businesses to analyze large amounts of audio data for patterns, such as identifying fraudulent calls in financial services.

The models that we use to build our indexes and to ingest audio are extremely similar to the deep speech networks.

---

Additionally, the open-source Kur framework supports these applications by providing a flexible platform for developing and deploying neural networks.

Related Episodes

Building AI Voice Agents with Scott Stephenson - #707
Machine Learning to Discover Physics and Engineering Principles with Nathan Kutz - #162
The Physics of Data with Alpha Lee - #377
Evolving AI Systems Gracefully with Stefano Soatto - #502
AI for Materials Discovery with Greg Mulholland - #148
Agile Data Science with Sarah Aerni - #143
NLP for Mapping Physics Research with Matteo Chinazzi - #353
Deep Learning for Live-Cell Imaging with David Van Valen - #141
Deep Learning with Structured Data w/ Mark Ryan - #301
Open Source Data Science Masters, Hybrid AI, Algorithmic Ethics & More with Clare Corthell - #1
Global AI Trends with Ben Lorica - #26
Machine Learning for the Stars & Productizing AI with Joshua Bloom - #5
Philosophy of Intelligence with Matthew Crosby - #91
Building AI Products with Hilary Mason - #11
Machine Learning for Signal Processing Applications with Stuart Feffer & Brady Tsai - #105

Popular Clips

Physics to AI Transition

Scott Stephenson, co-founder and CEO of Deepgram, discusses his transition from particle physics to developing AI for audio indexing and searching. His work in underground labs searching for dark matter laid the foundation for innovative applications in audio AI.

Kur Framework

Scott Stephenson, co-founder of Deepgram, discusses Kur, a deep learning framework that simplifies model building through a declarative approach. He highlights its open-source nature, fostering community collaboration and innovation.
