Published Mar 11, 2021

SDS 451: Translating PhD Research into ML Applications — with Dan Shiebler

Delve into Dan Shiebler's experience of merging academic research with industry practice, as he reveals the role of category theory in advancing machine learning algorithms, tackles data challenges at Twitter, and expounds on the transformative power of no-code tools in advertising technology.
Episode Highlights

  • Labeling Challenges

    Labeling data for machine learning models presents unique challenges, especially when the data is sparse. Dan Shiebler highlights the importance of label engineering, a process that involves creating proxy labels to train algorithms under various conditions. He explains that at TrueMotion, the team had to reconcile the need for high-quality labels with the limited data available, which was crucial for developing effective machine learning models.

    Often what will happen is we have these little accelerometers we would strap to cars and we'd drive around and do all sorts of crazy things.

    ---

    Dan's experience underscores the necessity of innovative strategies to overcome data labeling hurdles.

       

  • Usage-Based Insurance

    At TrueMotion, Dan worked on usage-based insurance, which uses machine learning to assess driving behavior and determine insurance pricing. The technology integrates into insurance apps and analyzes GPS and motion sensor data to evaluate driver risk and behavior without user input. The challenge lies in deriving accurate labels from sparse data, which requires creative ways to infer labels for larger datasets from limited information.

    Identify when someone is texting and driving based on the motion sensors in the phone, or identify when they're taking a turn too hard.

    ---

    Dan's work illustrates the complexity of developing non-intrusive, accurate systems for real-world applications.
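
To make the idea of a proxy label concrete, here is a minimal Python sketch of one way a "hard turn" label could be derived from motion sensor readings. It is an illustration only, not TrueMotion's method: the `Sample` type, the `hard_turn_proxy_labels` function, and the 0.4 g and 0.5 s thresholds are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List

G = 9.81  # standard gravity in m/s^2


@dataclass
class Sample:
    t: float              # timestamp in seconds
    lateral_accel: float  # sideways acceleration in m/s^2


def hard_turn_proxy_labels(
    samples: List[Sample],
    threshold_g: float = 0.4,     # assumed threshold, not from the episode
    min_duration_s: float = 0.5,  # assumed minimum duration, not from the episode
) -> List[int]:
    """Return one 0/1 proxy label per sample.

    A sample is labeled 1 when it belongs to a run of consecutive samples
    whose absolute lateral acceleration stays at or above the threshold
    for at least min_duration_s.
    """
    labels = [0] * len(samples)
    start = None  # index where the current above-threshold run began
    for i, s in enumerate(samples):
        above = abs(s.lateral_accel) >= threshold_g * G
        if above and start is None:
            start = i
        # Close the run when the signal drops or the data ends.
        if start is not None and (not above or i == len(samples) - 1):
            end = i if above else i - 1
            if samples[end].t - samples[start].t >= min_duration_s:
                for j in range(start, end + 1):
                    labels[j] = 1
            start = None
    return labels


# Usage: a sustained sideways push is labeled, a one-sample spike is not.
readings = [0, 0, 5, 5, 5, 5, 5, 5, 0, 5, 0]
samples = [Sample(t=i * 0.25, lateral_accel=a) for i, a in enumerate(readings)]
labels = hard_turn_proxy_labels(samples)
```

Labels produced by a rule like this are noisy, so in practice they would typically be checked against a small set of carefully collected ground-truth drives before being used to train a model.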
