Multilingual RL Training
Sara and Tim discuss using multilingual data in reinforcement learning training, highlighting the challenges of translation artifacts and how they overcame them by leveraging a high-performing model to generate synthetic pairs. Their innovative approach steered the model away from translation errors, leading to more reliable results in the RLHF project.In this clip
From this podcast

Machine Learning Street Talk (MLST)
Sara Hooker - Why US AI Act Compute Thresholds Are Misguided
Related Questions