Uncovering Model Patterns
Josh discusses the complexity of large language models, comparing them to a black box with multiple patterns of lights. By applying dictionary learning, he reveals how these patterns represent concepts like French answers or physics nouns within the model's neurons. Scaling up this technique poses a challenge worth exploring further.In this clip
From this podcast

Hard Fork
Google Eats Rocks | EP 85
Related Questions