Published Jul 19, 2019

AI code that facilitates good science

Joel Grus discusses how AI code can support reproducible science. He traces his path from finance to working on the influential AllenNLP library, and explains why unit tests and a clear code structure make research code more reliable and help bridge academic and corporate research.

Episode Highlights

Best Practices

Joel Grus emphasizes the importance of writing reproducible research code, which is crucial for ensuring scientific integrity. He suggests writing unit tests even for research code, as they help verify that models function as intended. Grus explains that separating library code from experiment code simplifies running experiments and avoids confusion over code versions.

If your model is not doing what you think it's doing, I mean, that's bad science out of the gate.

---

Additionally, he highlights the need for clear instructions and proper dependency management to facilitate reproducibility [1, 2].
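As a minimal sketch of what separating library code from experiment code can look like, the example below keeps the reusable logic in functions and describes the experiment purely as data. It is an illustration under stated assumptions, not AllenNLP's actual API: `ExperimentConfig`, `run_experiment`, `load_config`, and the JSON field names are all hypothetical.

```python
import json
import random
from dataclasses import dataclass


# "Library" code: reusable, unit-testable, unaware of any single experiment.

@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical experiment description; field names are illustrative."""
    seed: int
    num_samples: int
    noise: float


def run_experiment(config: ExperimentConfig) -> dict:
    """Estimate the mean of noisy samples around 1.0, using only the config."""
    rng = random.Random(config.seed)  # local RNG, so global state is untouched
    samples = [1.0 + rng.gauss(0.0, config.noise) for _ in range(config.num_samples)]
    return {"mean": sum(samples) / len(samples), "n": len(samples)}


def load_config(text: str) -> ExperimentConfig:
    """Parse a JSON experiment description into a typed config."""
    return ExperimentConfig(**json.loads(text))


# "Experiment" code: just data, which could live in a versioned JSON file.
EXPERIMENT_JSON = '{"seed": 13, "num_samples": 100, "noise": 0.1}'


if __name__ == "__main__":
    result = run_experiment(load_config(EXPERIMENT_JSON))
    print(json.dumps(result))
```

Because the experiment is only a small piece of data, rerunning it means rerunning the same library code with the same configuration, which makes it easier to tell which code produced which result.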

Testing

Testing AI models presents unique challenges because training involves randomness, but Joel Grus offers strategies to address this. He suggests testing invariants that don't depend on randomness, such as checking that a model runs without crashing and produces outputs of the correct shape. Grus also recommends training on a very small dataset and verifying that the model can fit it perfectly, which gives confidence that the learning code works.

You can come up with these tests in a way that give you some confidence that the model is doing what it's supposed to do.

---

By structuring tests to account for randomness, researchers can gain confidence that a model behaves as intended [3, 2].
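The two kinds of test described above can be sketched roughly as follows. This is a toy illustration, not code from the episode or from AllenNLP: `TinyLinearModel` is a hypothetical stand-in for a real model, written in plain Python so the example stays self-contained, and the tolerance and training settings are assumptions chosen for this toy problem.

```python
import random


class TinyLinearModel:
    """Hypothetical stand-in for a real model: y = w . x + b, trained by SGD."""

    def __init__(self, num_features: int, seed: int = 0):
        rng = random.Random(seed)  # random initialisation, as in a real model
        self.weights = [rng.uniform(-0.1, 0.1) for _ in range(num_features)]
        self.bias = 0.0

    def predict(self, batch):
        """Return one float prediction per input row."""
        return [
            sum(w * x for w, x in zip(self.weights, row)) + self.bias
            for row in batch
        ]

    def fit(self, batch, targets, epochs: int = 500, lr: float = 0.1):
        """Plain stochastic gradient descent on squared error."""
        for _ in range(epochs):
            for row, target in zip(batch, targets):
                error = self.predict([row])[0] - target
                self.weights = [w - lr * error * x for w, x in zip(self.weights, row)]
                self.bias -= lr * error


def test_output_shape_is_independent_of_seed():
    # Invariant: whatever the random initialisation, the model runs without
    # crashing and returns exactly one float per input row.
    batch = [[0.5, -1.0], [2.0, 3.0], [0.0, 0.0]]
    for seed in range(5):
        predictions = TinyLinearModel(num_features=2, seed=seed).predict(batch)
        assert len(predictions) == len(batch)
        assert all(isinstance(p, float) for p in predictions)


def test_can_fit_tiny_dataset():
    # Four points generated by an exactly linear rule; a working training
    # loop should be able to fit them almost perfectly.
    inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    targets = [2.0 * x0 - 3.0 * x1 + 1.0 for x0, x1 in inputs]
    model = TinyLinearModel(num_features=2, seed=42)
    model.fit(inputs, targets)
    for prediction, target in zip(model.predict(inputs), targets):
        assert abs(prediction - target) < 1e-4
```

Functions named `test_*` with bare `assert` statements are collected automatically by pytest. Neither test pins down exact output values, so both keep passing when the seed or initialisation changes.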

Related Episodes

Applied NLP solutions & AI education
Low code, no code, accelerated code, & failing code
AI's impact on developers
Open source data labeling tools
AI in the majority world and model distillation
Should kids still learn to code?
AI adoption in the enterprise
Data science for intuitive user experiences
From symbols to AI pair programmers 💻
Testing ML systems
AI-powered scientific exploration and discovery
The influence of open source on AI development
On being humAIn
Explainable AI that is accessible for all humans
AI in the browser

Topics covered

Writing Reproducible Code

Joel Grus shares insights on writing reproducible AI code and effective testing strategies. He highlights the importance of unit tests, clear code structure, and managing dependencies to ensure scientific reliability.

Data Science Journey

Joel Grus, a senior research engineer at the Allen Institute for AI, shares his journey from math and finance to data science and AI. He discusses his role in developing the AllenNLP library and the importance of reproducibility in AI research.

AllenNLP Contributions

AllenNLP, a project spearheaded by Joel Grus, aims to simplify natural language processing tasks for researchers by providing high-level abstractions. It supports both academic and corporate research, helping to bridge the gap between the two.
