Neural Network Optimization
Simon discusses the efficiency of stochastic gradient descent in training large neural networks and its regularization effects. The conversation delves into the history and implications of batch normalization, highlighting its unexpected regularization properties and adaptations in modern deep learning architectures.In this clip
From this podcast

Machine Learning Street Talk (MLST)
Understanding Deep Learning - Prof. SIMON PRINCE [STAFF FAVOURITE]
Related Questions