Gradient Computation in Deep Learning: The Engine Behind Neural Network Training

Every time a neural network learns to recognize a face, translate a sentence, or predict stock prices, gradient computation is working behind the scenes. This fundamental mechanism is what transforms a randomly initialized network into a powerful prediction machine. Understanding gradient computation isn’t just an academic exercise—it’s the key to comprehending how deep learning actually …
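For a concrete picture of what that machinery does, here is a minimal sketch using PyTorch autograd (assuming the torch package is available; the tiny linear model, data shapes, loss, and learning rate are all illustrative, not the article's setup). It computes the gradient of a loss with reverse-mode autodiff and takes one plain gradient-descent step.

```python
# Minimal sketch of gradient computation with reverse-mode autodiff (PyTorch).
# Model, shapes, and learning rate are illustrative assumptions.
import torch

x = torch.randn(8, 3)                       # 8 samples, 3 features
y = torch.randn(8, 1)                       # targets
w = torch.randn(3, 1, requires_grad=True)   # trainable weights
b = torch.zeros(1, requires_grad=True)      # trainable bias

y_hat = x @ w + b                           # tiny linear model
loss = ((y_hat - y) ** 2).mean()            # mean squared error

loss.backward()                             # autodiff fills w.grad and b.grad

with torch.no_grad():                       # one vanilla gradient-descent step
    w -= 0.1 * w.grad
    b -= 0.1 * b.grad
```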

Difference Between Batch Gradient Descent and Mini-Batch in Noisy Datasets

The fundamental challenge in training machine learning models on noisy datasets lies in distinguishing genuine patterns from random fluctuations—a task that becomes critically dependent on how gradient descent processes the training data. Batch gradient descent computes gradients using the entire dataset before each parameter update, providing a deterministic, stable signal that averages out noise across …
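To make the contrast concrete, the NumPy sketch below runs both update rules side by side on the same noisy regression problem; the synthetic data, learning rate, and batch size of 32 are illustrative assumptions. Batch gradient descent takes one deterministic step per pass over the whole dataset, while mini-batch gradient descent takes noisier steps from random subsets.

```python
# Rough sketch: full-batch vs. mini-batch gradient descent on noisy data.
# Dataset, learning rate, and batch size are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
true_w = rng.normal(size=5)
y = X @ true_w + rng.normal(scale=2.0, size=1000)   # noisy targets

def grad(w, Xb, yb):
    """Gradient of mean squared error over the rows in (Xb, yb)."""
    return 2.0 * Xb.T @ (Xb @ w - yb) / len(yb)

lr = 0.05
w_batch = np.zeros(5)
w_mini = np.zeros(5)

for step in range(200):
    # Batch GD: one deterministic update computed from the entire dataset.
    w_batch -= lr * grad(w_batch, X, y)

    # Mini-batch GD: a noisier update computed from a random subset of 32 rows.
    idx = rng.choice(len(y), size=32, replace=False)
    w_mini -= lr * grad(w_mini, X[idx], y[idx])

print("batch GD error:     ", np.linalg.norm(w_batch - true_w))
print("mini-batch GD error:", np.linalg.norm(w_mini - true_w))
```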

Gradient Noise Scale and Batch Size Relationship

When training neural networks, practitioners face a fundamental question that significantly impacts both model quality and training efficiency: what batch size should I use? The answer isn’t simply “as large as your GPU memory allows” or “stick with the default.” The relationship between batch size and gradient noise scale reveals deep insights into the optimization …
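One widely used way to quantify that relationship is the simple gradient noise scale, roughly tr(Σ) / |G|²: the total per-example gradient variance divided by the squared norm of the mean gradient. Batch sizes far below this value produce noise-dominated updates; batch sizes far above it yield diminishing returns per example. The sketch below estimates it for a toy linear model from per-example gradients; the model, data, and estimator choice are illustrative assumptions, not necessarily the article's method.

```python
# Hedged sketch: estimate a simple gradient noise scale, tr(Sigma) / |g|^2,
# from per-example gradients of a toy linear model (squared-error loss).
# Model, data, and estimator are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(512, 10))
true_w = rng.normal(size=10)
y = X @ true_w + rng.normal(scale=1.0, size=512)

w = np.zeros(10)                                   # current parameters

# Per-example gradient of (x_i . w - y_i)^2 is 2 * (x_i . w - y_i) * x_i.
residual = X @ w - y
per_example_grads = 2.0 * residual[:, None] * X    # shape (512, 10)

g_mean = per_example_grads.mean(axis=0)            # estimate of the true gradient
trace_cov = per_example_grads.var(axis=0, ddof=1).sum()  # estimate of tr(Sigma)

noise_scale = trace_cov / (g_mean @ g_mean)
print(f"estimated gradient noise scale: {noise_scale:.1f}")
# Interpretation: batches much smaller than this are noise-limited;
# batches much larger mostly re-measure the same gradient.
```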