Published May 2, 2023

675: Pandas for Data Analysis and Visualization — with Stefanie Molin

Data scientist Stefanie Molin delves into sophisticated data wrangling techniques using Pandas, revealing the power of chaining operations and the efficiency of the assign method, while also offering insightful guidance on leveraging Pandas, Matplotlib, and Seaborn for effective data visualizations.
Super Data Science: ML & AI Podcast with Jon Krohn

Episode Highlights

  • Pandas Plotting

    Pandas offers a surprising range of plotting capabilities, making it a convenient choice for quick visualizations. Stefanie explains that while Pandas is not typically seen as a go-to visualization library, it can handle a variety of plotting tasks, especially when working with wide-format data [1]. However, more complex visualizations may call for other tools such as Matplotlib or Seaborn [2].

    People don't always think of pandas as their go-to visualization library, but it can do a surprising amount.

    ---

    Stefanie also highlights the importance of understanding the limitations and strengths of each tool to make informed decisions in data visualization [1].
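
As a rough illustration of the wide-format case mentioned above, here is a minimal sketch. The `sales` DataFrame and its column names are made up for this example and do not come from the episode; the non-interactive `Agg` backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption so the sketch runs anywhere
import pandas as pd

# Hypothetical wide-format data: one row per month, one column per product.
sales = pd.DataFrame(
    {"widgets": [120, 135, 150, 160], "gadgets": [80, 95, 90, 110]},
    index=pd.Index(["Jan", "Feb", "Mar", "Apr"], name="month"),
)

# DataFrame.plot draws one line per column and returns a Matplotlib Axes,
# so the result can still be adjusted with ordinary Matplotlib calls.
ax = sales.plot(kind="line", title="Monthly unit sales")
ax.set_ylabel("units sold")
```

Because each column becomes its own series, no reshaping is needed before plotting.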

       

  • Matplotlib Features

    Matplotlib offers fine-grained control over visualizations, making it ideal for detailed customization. Jon and Stefanie discuss the ticker module, which simplifies formatting and placing ticks on plots and so makes the data presentation clearer [3]. This module is particularly useful for visualizing quantities with specific units or normalizing data by a constant factor.

    The ticker module makes it easier to format and place ticks in your plots, ultimately making it easier for you to convey some specific concept to your viewer.

    ---

    Stefanie emphasizes that Matplotlib's flexibility allows for more precise adjustments than higher-level libraries like Pandas [4].
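
A minimal sketch of what tick placement and formatting with `matplotlib.ticker` can look like. The retention numbers are invented for illustration, and the choice of `MultipleLocator` and `PercentFormatter` is one option among several in the module, not necessarily the one discussed in the episode.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib import ticker

# Hypothetical data: fraction of users retained after each week.
weeks = [0, 1, 2, 3, 4, 5, 6]
retained = [1.0, 0.72, 0.61, 0.55, 0.51, 0.49, 0.48]

fig, ax = plt.subplots()
ax.plot(weeks, retained)

# Tick placement: one major tick every 2 weeks on the x-axis.
ax.xaxis.set_major_locator(ticker.MultipleLocator(2))

# Tick formatting: show fractions as percentages.
# xmax=1.0 tells the formatter that the value 1.0 corresponds to 100%.
ax.yaxis.set_major_formatter(ticker.PercentFormatter(xmax=1.0, decimals=0))
```

The data stays in its original units; only the tick labels change, which is the kind of normalization by a constant factor described above.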

       

  • Seaborn Advantages

    Seaborn excels at creating attractive, functional visualizations with minimal effort. Stefanie notes that Seaborn is particularly effective for handling long-format data and for coloring plots by a variable, tasks that can be cumbersome in Pandas [5]. The library's built-in styles and themes make it easier to produce visually appealing plots without extensive customization.

    Seaborn makes it very easy to handle colors and create aesthetically pleasing visuals with minimal effort.

    ---

    Additionally, Stefanie's workshops often cover the transition from using Pandas for basic plotting to leveraging Seaborn for more complex visualizations [6].

Related Episodes