Influence Functions Explained

Laura explores influence functions, a tool from robust statistics that estimates how removing specific training examples would change a model's behavior. By comparing factual retrieval tasks with zero-shot reasoning prompts, she shows that models approach retrieving facts differently from generating reasoning steps, which points to the complexity of the reasoning process.
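
To make the idea concrete, here is a minimal sketch of the classic influence-function approximation on a toy least-squares model. It is not the setup from the talk: the synthetic data, the linear model, and every name in it (`fit`, `est_change`, `actual_change`, and so on) are assumptions chosen for illustration. It estimates how a test prediction would shift if each training example were removed, then checks that estimate against actual retraining.

```python
import numpy as np

# Synthetic regression data (illustrative assumption, not from the talk).
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)
x_test = rng.normal(size=d)


def fit(X, y):
    """Ordinary least squares fit."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


theta = fit(X, y)

# Per-example gradient of the loss 0.5 * (x @ theta - y)**2 at the fitted theta.
residuals = X @ theta - y
grads = X * residuals[:, None]          # shape (n, d)

# Hessian of the mean training loss.
H = X.T @ X / n                         # shape (d, d)

# Influence approximation: removing example i changes the parameters by
# roughly (1/n) * H^{-1} @ grad_i, with no retraining needed.
delta_theta = np.linalg.solve(H, grads.T).T / n   # shape (n, d)
est_change = delta_theta @ x_test                 # estimated shift in test prediction

# Ground truth: actually retrain with each example left out.
actual_change = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    actual_change[i] = x_test @ (fit(X[keep], y[keep]) - theta)

corr = np.corrcoef(est_change, actual_change)[0, 1]
print(f"correlation between estimated and actual change: {corr:.4f}")
```

For a model this small, retraining is cheap and the approximation is unnecessary; the appeal of influence functions is that the same estimate can be formed when retraining once per training example is out of reach, typically with an approximate Hessian.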