1.

What exactly do you mean by exploding and vanishing gradients?

Answer»

The gradient descent algorithm aims to minimize the error by taking incremental steps towards the minimum of the loss function. The weights and biases of a neural network are updated using the gradients computed at each of these steps.
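
As an illustrative sketch (not part of the original answer), the snippet below runs gradient descent on a single weight for a hypothetical model y_pred = w * x with a squared-error loss; the learning rate lr and the data values are assumptions chosen just for the example.

```python
import numpy as np

def gradient_descent_step(w, x, y_true, lr=0.1):
    """One incremental update of a scalar weight w for the model y_pred = w * x."""
    y_pred = w * x
    grad = 2 * (y_pred - y_true) * x   # d(loss)/dw for loss = (y_pred - y_true)**2
    return w - lr * grad               # step against the gradient to reduce the error

w = 0.0
for _ in range(20):
    w = gradient_descent_step(w, x=2.0, y_true=4.0)
print(round(w, 4))                     # approaches 2.0, the weight that minimizes the error
```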

However, at times the steps grow excessively large, producing ever larger updates to the weights and bias terms, to the point where the weights overflow or become NaN (Not a Number). This is known as an exploding gradient, and it makes training unstable.
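
The toy example below is an illustration of this effect (the per-layer weight of 1.5 and the depth of 300 layers are assumptions): backpropagation multiplies gradients layer by layer via the chain rule, so repeated factors greater than 1 make the gradient grow until it overflows.

```python
import numpy as np

grad = np.float32(1.0)
weight = np.float32(1.5)      # assumed layer weight, deliberately greater than 1
for layer in range(300):
    grad = grad * weight      # chain rule: each layer multiplies the gradient by its weight
print(grad)                   # overflows to inf in float32: an exploding gradient
```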

On the other hand, if the steps are excessively small, they produce minor, even negligible, changes in the weights and bias terms. As a result, the model may train with nearly identical weights and biases at every step and never reach the minimum of the error function. This is known as a vanishing gradient.
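
The following sketch shows the opposite effect (again illustrative, assuming sigmoid activations, whose derivative is at most 0.25): multiplying many small per-layer derivatives drives the backpropagated gradient towards zero, so the weight updates become negligible.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

grad = 1.0
z = 0.0                            # pre-activation; sigmoid'(0) = 0.25 is the largest possible derivative
for layer in range(30):
    s = sigmoid(z)
    grad = grad * s * (1.0 - s)    # chain rule: multiply by the sigmoid derivative at each layer
print(grad)                        # about 0.25**30 ≈ 8.7e-19: a vanishing gradient
```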


