Published Mar 11, 2021

SDS 451: Translating PhD Research into ML Applications — with Dan Shiebler

Dan Shiebler describes how he merges academic research with industry practice: he explains the role of category theory in advancing machine learning algorithms, tackles data challenges at Twitter, and discusses the impact of no-code tools on advertising technology.

Episode Highlights

Labeling Challenges

Labeling data for machine learning models presents unique challenges, especially when the data is sparse. Dan Shiebler highlights the importance of label engineering, a process that involves creating proxy labels to train algorithms under various conditions. He explains that at TrueMotion, the team had to reconcile the need for high-quality labels with limited data, which was crucial for developing effective machine learning models.

Often what will happen is we have these little accelerometers we would strap to cars and we'd drive around and do all sorts of crazy things.

---

Dan's experience underscores the necessity of innovative strategies to overcome data labeling hurdles.
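
As a rough illustration of the proxy-label idea described above, the sketch below derives labels from a car-mounted reference accelerometer and pairs them with phone readings from the same time window. This is not TrueMotion's method: the `Window` structure, the `HARD_BRAKE_THRESHOLD` value, and the function names are all hypothetical, chosen only to make the idea concrete.

```python
from dataclasses import dataclass

# Hypothetical cutoff: longitudinal acceleration (m/s^2) at or below this
# value counts as a hard brake. Not taken from the episode.
HARD_BRAKE_THRESHOLD = -3.0


@dataclass
class Window:
    start_s: float                # window start time, in seconds
    phone_accel: list[float]      # noisy phone readings (model input)
    reference_accel: list[float]  # car-mounted sensor readings (label source)


def proxy_label(window: Window) -> int:
    """Return 1 if the reference sensor recorded a hard brake, else 0."""
    return int(min(window.reference_accel) <= HARD_BRAKE_THRESHOLD)


def build_training_set(windows: list[Window]) -> list[tuple[list[float], int]]:
    """Pair phone-sensor features with labels derived from the reference sensor."""
    return [(w.phone_accel, proxy_label(w)) for w in windows]


windows = [
    Window(0.0, [-0.5, -2.1], [-0.4, -3.6]),  # reference saw a hard brake
    Window(1.0, [0.1, 0.2], [0.0, -0.3]),     # normal driving
]
training_set = build_training_set(windows)
```

The point of the design is that the model only ever sees phone data, while the trusted label comes from a sensor that is available during data collection but not in production.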

Usage-Based Insurance

At TrueMotion, Dan worked on usage-based insurance, which uses machine learning to assess driving behavior and determine insurance pricing. This technology integrates into insurance apps, analyzing GPS and motion sensor data to evaluate driver risk and behavior without user input. The challenge lies in deriving accurate labels from sparse data, requiring creative solutions to infer larger datasets from limited information.

Identify when someone is texting and driving based on the motion sensors in the phone, or identify when they're taking a turn too hard.

---

Dan's work illustrates the complexity of developing non-intrusive, accurate systems for real-world applications.
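
To make "taking a turn too hard" concrete, here is a minimal sketch that flags samples where the estimated lateral acceleration is high. It assumes speed comes from GPS and yaw rate from the phone's gyroscope, and uses the standard relation that lateral acceleration is speed times yaw rate. The threshold and function names are hypothetical, and a production system would presumably use a learned model rather than a fixed rule.

```python
# Hypothetical cutoff for a "hard" turn, in m/s^2. Not taken from the episode.
HARD_TURN_LATERAL_ACCEL = 4.0


def lateral_accel(speed_mps: float, yaw_rate_rps: float) -> float:
    """Estimate lateral acceleration magnitude as speed * yaw rate."""
    return abs(speed_mps * yaw_rate_rps)


def hard_turn_indices(speeds_mps, yaw_rates_rps,
                      threshold=HARD_TURN_LATERAL_ACCEL):
    """Return indices of samples whose lateral acceleration meets the threshold."""
    return [
        i
        for i, (v, w) in enumerate(zip(speeds_mps, yaw_rates_rps))
        if lateral_accel(v, w) >= threshold
    ]


speeds = [10.0, 15.0, 12.0]   # m/s, assumed to come from GPS
yaw_rates = [0.1, 0.4, -0.2]  # rad/s, assumed to come from the gyroscope
flagged = hard_turn_indices(speeds, yaw_rates)
```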

Related Episodes

717: Overcoming Adversaries with A.I. for Cybersecurity — with Dr. Dan Shiebler
630: Resilient Machine Learning — with Dan Shiebler
SDS 435: Scaling Up Machine Learning — with Erica Greene
829: Neuroscience Fueled by ML — with Prof. Bradley Voytek
SDS 433: Data Science Trends for 2021 — with Ben Taylor
SDS 513: Transformers for Natural Language Processing — with Denis Rothman
SDS 558: @JonKrohnLearns's Answers to Questions on Machine Learning
SDS 623: Data Analyst, Data Scientist, and Data Engineer Career Paths — with @ShashankData
SDS 439: Deep Learning for Machine Vision — with Deblina Bhattacharjee
SDS 573: Automating ML Model Deployment — with Doris Xin
SDS 605: Upskilling in Data Science and Machine Learning — with Kian Katanforoosh
SDS 539: Interpretable Machine Learning — with Serg Masís
SDS 549: Engineering Natural Language Models — with Lauren Zhu
SDS 587: Data Engineering for Data Scientists — with Mark Freeman
SDS 564: Clem Delangue on Hugging Face and Transformers

Topics covered

Academic Pursuits

Dan Shiebler shares his journey of balancing a PhD with a full-time role at Twitter, exploring the synergies between his academic research and professional work. His focus on category theory in machine learning offers innovative insights into algorithm development.

Data Challenges

Dan Shiebler navigates the complexities of data labeling and usage-based insurance in machine learning. His insights reveal the innovative strategies required to handle sparse data and develop effective, non-intrusive systems.

Career Progression

Dan Shiebler shares insights into his role as a staff engineer at Twitter, highlighting the responsibilities and strategic influence involved. He also discusses the hiring criteria at Twitter and the balance between research and product-focused roles.

Machine Learning Applications
