Understanding Loss Surface Geometry in Deep Learning Models
The training of deep neural networks unfolds as an optimization journey through a high-dimensional landscape—the loss surface—where each point represents a particular configuration of millions or billions of parameters, and the height represents the model’s error on the training data. This landscape’s geometry fundamentally determines whether gradient descent finds good solutions, how quickly training converges, and how well the resulting model generalizes.
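To make the idea concrete, here is a minimal sketch (using a hypothetical toy linear model, not any method from this article) of treating the loss as a function over parameter space and probing its geometry along a one-dimensional slice in a random direction, a common visualization technique:

```python
import numpy as np

# Hypothetical toy setup: a linear model's mean-squared-error loss,
# viewed as a function of its parameter vector w (the "loss surface").
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))       # inputs
w_true = np.array([2.0, -1.0])      # ground-truth parameters
y = X @ w_true                      # noiseless targets

def loss(w):
    """Height of the loss surface at parameter point w."""
    return np.mean((X @ w - y) ** 2)

# One way to probe the surface's geometry: evaluate the loss along a
# 1-D slice through a point w0 in a unit random direction d,
# i.e. L(alpha) = loss(w0 + alpha * d).
w0 = np.zeros(2)
d = rng.normal(size=2)
d /= np.linalg.norm(d)              # unit-length direction
alphas = np.linspace(-3.0, 3.0, 61)
slice_losses = [loss(w0 + a * d) for a in alphas]

# For this convex quadratic the slice is a parabola; the full surface's
# minimum sits at w_true, where the loss is exactly zero.
print(f"loss at w_true: {loss(w_true):.6f}")
print(f"min loss along slice: {min(slice_losses):.4f}")
```

Real deep networks replace the two-dimensional `w` with millions or billions of parameters and a non-convex loss, but the same slicing trick is how loss-landscape plots are typically produced.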