Published Dec 14, 2021

SDS 531: Data Science at the Command Line — with Jeroen Janssens

Join Jon Krohn as he delves into the world of data science at the command line with Jeroen Janssens, exploring his journey from academia to business and the transformative power of command line tools in crafting a versatile, future-ready data science toolkit.
Episode Highlights

  • Custom Tools

    Creating custom command line tools can significantly enhance data science workflows by letting users abstract repetitive tasks into reusable scripts. The episode explains that the command line serves as a universal interface that can combine tools written in different languages such as Python, R, and JavaScript [1]. This flexibility lets data scientists expand their toolboxes and collaborate across programming environments. It also prepares them for future shifts in which programming languages are popular [2].

    Whatever you type at the command line can also be turned into a command line tool. It's conceptually similar to writing a function in a programming language.

    ---

    This adaptability ensures that data scientists remain effective regardless of the evolving technological landscape.
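As a minimal sketch of that idea, assuming Python as the implementation language, the hypothetical tool below wraps a small function in a script that reads text from standard input and writes tab-separated counts to standard output. The name `top_words` and the default of three words are invented here for illustration and do not come from the episode.

```python
#!/usr/bin/env python3
"""Hypothetical filter: print the most frequent words read from stdin."""
import sys
from collections import Counter


def top_words(lines, n=3):
    """Return the n most common lowercase words as (word, count) pairs."""
    counts = Counter()
    for line in lines:
        # Split on whitespace; punctuation handling is left out of this sketch.
        counts.update(line.lower().split())
    return counts.most_common(n)


def main():
    # Optional first argument: how many words to report (default 3).
    n = int(sys.argv[1]) if len(sys.argv) > 1 else 3
    for word, count in top_words(sys.stdin, n):
        print(f"{word}\t{count}")


if __name__ == "__main__":
    main()
```

Saved as an executable file on the `PATH`, a script like this could be placed in a pipeline next to any other command, because it only reads and writes plain text.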

       

  • Polyglot Integration

    The command line acts as a platform for integrating multiple programming languages, which makes it an essential tool for polyglot data scientists. Because command line tools exchange text-based input and output, tools written in different languages can interact seamlessly [3]. This matters for data scientists who need specialized packages from several languages and do not want to be constrained by any one of them. The approach not only improves efficiency but also future-proofs a data scientist's skill set [1].

    The command line is the ultimate melting pot across all of the programming languages.

    ---

    By embracing a polyglot mindset, data scientists can maximize their productivity and adaptability.
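One way to picture the text-based hand-off is a Python function that delegates work to a program written in another language. The sketch below assumes a POSIX system with the standard `sort` utility on the `PATH`; the function name `sort_lines` is hypothetical, and forcing the C locale is an assumption made here so the ordering is predictable.

```python
import os
import subprocess


def sort_lines(lines):
    """Sort lines by piping them through the external `sort` program."""
    # Text is the only interface: one line per item in, one line per item out.
    text = "".join(f"{line}\n" for line in lines)
    result = subprocess.run(
        ["sort"],
        input=text,
        capture_output=True,
        text=True,
        check=True,
        # Assumption: the C locale gives byte-wise, reproducible ordering.
        env={**os.environ, "LC_ALL": "C"},
    )
    return result.stdout.splitlines()
```

The Python side never needs to know what language `sort` is written in; it only agrees with it on newline-separated text.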

       

  • Toolbox Expansion

    Command line environments let data scientists expand their toolboxes beyond traditional programming languages and offer a vast array of optimized tools. Although new tools are constantly being developed, the episode stresses that the key is becoming comfortable with the command line environment itself [2]. That familiarity makes it possible to stitch various tools together to solve complex tasks. Command line skills are described as a career-long investment that stays valuable as the landscape of data science tools changes [4].

    Being comfortable at the command line in a programming language like Bash is a career-long investment.

    ---

    This adaptability ensures that data scientists can navigate and leverage the ever-evolving array of tools available to them.
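Stitching tools together can be sketched as a two-stage pipeline driven from Python. This example assumes a POSIX shell with `sort` and `uniq` available; the function name `count_values` is hypothetical, and the parsing of the `uniq -c` output is a simplification that assumes values have no leading whitespace.

```python
import os
import subprocess


def count_values(lines):
    """Count duplicate lines with the shell pipeline `sort | uniq -c`."""
    text = "".join(f"{line}\n" for line in lines)
    result = subprocess.run(
        "sort | uniq -c",  # fixed command string, no user input interpolated
        shell=True,
        input=text,
        capture_output=True,
        text=True,
        check=True,
        # Assumption: the C locale keeps sorting and grouping consistent.
        env={**os.environ, "LC_ALL": "C"},
    )
    counts = {}
    for row in result.stdout.splitlines():
        # Each row looks like "<padding><count> <value>".
        count, value = row.strip().split(maxsplit=1)
        counts[value] = int(count)
    return counts
```

Each stage does one small job, and the pipe between them is the only glue required.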
