Jerome Pesenti — Large Language Models, PyTorch, and Meta

Episode Highlights
Initial Development
The development of PyTorch at Meta was marked by strategic decisions and challenges. Jerome recalls the initial dual-path strategy built on PyTorch, Caffe2, and ONNX, which he deemed unsustainable. He advocated for a single framework with community support, which led to PyTorch's adoption despite its initial lack of production readiness 1. Jerome highlights the importance of user-centric design in PyTorch's success, contrasting it with TensorFlow's retrofitted approach 2.
PyTorch was a rising star, but not production ready. And really the only one that had all this aspect was TensorFlow at the time.
---
This decision has since paid off, with PyTorch becoming a beloved tool for both research and production.
Community Support
Community support played a crucial role in PyTorch's evolution and adoption. Jerome emphasizes that the framework's design was inherently user-friendly, which resonated with researchers and developers alike 2. He notes that choosing a technology with strong community backing prevents stagnation, as seen with other systems lacking such support 3.
It's really about user friendliness, research and friendliness, actually.
---
This community-driven approach ensured PyTorch's continued relevance and growth in the AI landscape.
Technical Superiority
PyTorch's technical superiority is attributed to its user-centric design and adaptability. Jerome points out that PyTorch's dynamic graph generation and ease of use set it apart from TensorFlow, which struggled with a more rigid structure 2. The framework's flexibility required early optimization to meet production demands, balancing community needs with internal requirements 4.
The challenge with something like PyTorch is that you need to do early optimization, you don't have a way around it.
---
This balance has allowed PyTorch to thrive as a preferred choice for AI research and application.
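To make the "dynamic graph generation" point concrete, here is a minimal sketch in plain Python of the define-by-run idea: the computation graph is recorded while ordinary code executes, so loops and branches can change its shape on every call. This is an illustration only, not PyTorch's implementation; the `Value` class and the `model` function are hypothetical names invented for this example.

```python
class Value:
    """A scalar that records the operations applied to it as they run.

    Hypothetical teaching class; it is not part of PyTorch.
    """

    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents    # Values this one was computed from
        self.grad_fns = grad_fns  # maps upstream grad to each parent's grad

    def __add__(self, other):
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        return Value(self.data * other.data, (self, other),
                     (lambda g: g * other.data, lambda g: g * self.data))

    def backward(self):
        # Topologically sort the graph that was recorded during the forward pass.
        order, seen = [], set()

        def visit(v):
            if id(v) not in seen:
                seen.add(id(v))
                for p in v.parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, accumulating gradients.
        for v in reversed(order):
            for parent, grad_fn in zip(v.parents, v.grad_fns):
                parent.grad += grad_fn(v.grad)


def model(x, w, steps):
    # Ordinary Python control flow decides how deep the graph is on each call.
    out = x
    for _ in range(steps):
        out = out * w
    return out


x, w = Value(2.0), Value(3.0)
y = model(x, w, steps=2)  # graph built on the fly: (x * w) * w
y.backward()
print(y.data, x.grad, w.grad)
```

Calling `model` with a different `steps` value produces a different graph with no separate compilation step, which is the flexibility the paragraph above contrasts with a more rigid, predeclared graph.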