Understanding Machine Learning
Machine learning produces complex systems like neural networks that operate without explicit design, raising questions about their inner workings. Neel discusses the challenges of mechanistic interpretability, emphasizing the potential to uncover human-comprehensible structures within these models. He also explores the nature of reasoning, questioning whether it requires intentionality and how it manifests in large language models.In this clip
From this podcast

Machine Learning Street Talk (MLST)
Neel Nanda - Mechanistic Interpretability (Sparse Autoencoders)
Related Questions