Exploding / Vanishing Gradients
Training deep neural networks requires understanding exploding and vanishing gradients.
Exploding gradients grow very large and cause training to diverge, while vanishing gradients shrink toward zero and lead to slow or stalled convergence. Both problems affect training time, convergence, and final model performance.
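To see why depth matters, consider a toy numeric sketch (not from the article): backpropagation multiplies a per-layer factor once for every layer, so the gradient magnitude scales roughly like that factor raised to the network depth. The factors 0.5 and 1.5 below are arbitrary illustrative values.

```python
# Toy illustration: repeated multiplication through 50 layers.
depth = 50
for factor in (0.5, 1.5):
    grad = 1.0
    for _ in range(depth):
        grad *= factor  # one multiplication per layer during backprop
    print(f"factor={factor}: gradient after {depth} layers ~ {grad:.3e}")

# factor=0.5 -> ~8.9e-16  (vanishing)
# factor=1.5 -> ~6.4e+08  (exploding)
```

Even a modest per-layer shrink or growth compounds exponentially with depth, which is exactly what makes very deep networks hard to train.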
Techniques like weight initialization, activation functions (e.g., ReLU), batch normalization, gradient clipping, and residual connections mitigate these issues.
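Below is a minimal PyTorch sketch showing three of these mitigations together: He (Kaiming) initialization, batch normalization, and gradient clipping. The layer sizes, learning rate, and `max_norm` value are illustrative choices, not prescriptions.

```python
import torch
import torch.nn as nn

# A small feed-forward model with ReLU activations and batch normalization.
model = nn.Sequential(
    nn.Linear(128, 256), nn.BatchNorm1d(256), nn.ReLU(),
    nn.Linear(256, 256), nn.BatchNorm1d(256), nn.ReLU(),
    nn.Linear(256, 10),
)

# He initialization keeps activation variance roughly constant across ReLU layers.
for layer in model:
    if isinstance(layer, nn.Linear):
        nn.init.kaiming_normal_(layer.weight, nonlinearity="relu")
        nn.init.zeros_(layer.bias)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

x = torch.randn(32, 128)           # dummy batch of inputs
y = torch.randint(0, 10, (32,))    # dummy class labels

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()

# Gradient clipping: rescale gradients so their global norm stays <= 1.0,
# preventing a single large update from destabilizing training.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```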
When training deep neural networks, it’s essential to grasp the concept of gradients and their role in optimizing the model.
Gradients represent the rate of change of the loss function with respect to each weight in the network. These gradients guide the update process during training, ensuring that the model converges towards a solution that minimizes the loss.
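A minimal sketch of that update process, assuming the simple quadratic loss L(w) = (w − 3)², whose gradient is dL/dw = 2(w − 3). The target value 3 and learning rate 0.1 are illustrative, but the update rule is the standard gradient-descent step.

```python
# Gradient descent on a single weight with a known loss function.
w = 0.0
lr = 0.1
for step in range(25):
    grad = 2 * (w - 3)   # rate of change of the loss with respect to w
    w -= lr * grad       # step in the direction that decreases the loss
print(w)  # approaches 3, the minimizer of the loss
```

If these gradients become extremely large or extremely small, the update step either overshoots wildly or barely moves the weights, which is where the exploding and vanishing gradient problems come from.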