Published Jan 20, 2025

How Do AI Models Actually Think? - Laura Ruis

Researcher Laura Ruis delves into AI agency, reasoning, and scaling challenges, unraveling the nuances of control, ethical implications, and the integration of symbolic computation within connectionist frameworks to enhance procedural knowledge and formal reasoning in language models.

Machine Learning Street Talk (MLST)

Episode Highlights


  • Factual vs. Reasoning

    The distinction between factual retrieval and reasoning tasks in language models is crucial for understanding their capabilities. Laura Ruis explains that factual retrieval relies on specific documents, whereas reasoning tasks synthesize knowledge from multiple sources, which points to procedural knowledge [1]. This synthesis lets models carry out tasks such as arithmetic and solving linear equations, which require abstract reasoning [2].

    The important point is that it is seemingly taking knowledge from many different documents and applying it to the same task.

    ---

    Influence functions help analyze how pre-training data affects these reasoning steps. They reveal that reasoning draws on influence spread diffusely across many documents, whereas factual retrieval depends on a few highly influential ones [3].
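
    The contrast between focused and diffuse influence can be made concrete with a simple concentration measure. The sketch below is only an illustration: it assumes per-document influence scores have already been computed by some other means, the function name `top_k_share` is hypothetical, and the scores are made up rather than taken from the episode.

```python
def top_k_share(scores, k):
    """Fraction of total absolute influence carried by the k highest-scoring documents."""
    magnitudes = sorted((abs(s) for s in scores), reverse=True)
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    return sum(magnitudes[:k]) / total


# Made-up scores for illustration only (not results from the episode).
# Factual query: one document dominates.
factual_scores = [9.0, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1]
# Reasoning query: influence is spread across many documents.
reasoning_scores = [1.6, 1.5, 1.4, 1.5, 1.3, 1.4, 1.3]

print("factual top-1 share:", top_k_share(factual_scores, 1))
print("reasoning top-1 share:", top_k_share(reasoning_scores, 1))
```

    Under these assumed numbers, the single most influential document carries most of the total for the factual query and only a small fraction for the reasoning query, which is the pattern the summary describes.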

       

  • Formal Reasoning

    Exploring the potential for formal reasoning in language models, Ruis suggests that connectionist models can learn systematic rules and achieve high accuracy on novel problems [4]. This capability indicates that models might handle symbolic computation, although challenges remain in dealing with entirely new tokens.

    We have shown that they can do a form of systematicity or symbolic computation, although it's still limited.

    ---

    The debate continues over whether scaling current approaches will yield better results or whether new methods are needed to improve data efficiency and adaptability [5].

       

  • Role of Code

    The inclusion of code in training data significantly influences language models' reasoning abilities. Ruis notes that code provides a robust framework for models to learn step-by-step reasoning, enhancing their ability to generalize across tasks [6]. This abstraction allows models to handle diverse expressions of the same problem, making them more adaptable.

    It seems like the model can learn to do these step-by-step reasoning traces to output them from descriptions of procedures in code.

    ---

    Interestingly, code influences reasoning both positively and negatively, highlighting the complexity of its role in model training [7].
