Model Training Insights
Stella shares how they trained their model for 400 billion tokens, aligning with recent research findings. Despite initial methodological flaws, continuous evaluations showed steady performance improvements, raising questions about resource allocation.
From this podcast

Gradient Dissent - A Machine Learning Podcast
How EleutherAI Trains and Releases LLMs: Interview with Stella Biderman