Deep Learning: Activation Functions — 10 Tricky Questions
1. What are some common activation functions used in deep learning, and how do they differ from each other?
Answer: Some common activation functions used in deep learning include the sigmoid, hyperbolic tangent (tanh), rectified linear unit (ReLU), and softmax functions (a minimal sketch of each appears after this list).
- The sigmoid and tanh functions are both smooth and saturating: sigmoid squashes its input into (0, 1) and tanh into (-1, 1), so their outputs flatten out as the input gets very large or very small.
- The ReLU function is non-saturating and has a simple derivative, making it computationally efficient.
- The softmax function is used for multiclass classification problems and converts a vector of inputs into a probability distribution.
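For reference, here is a minimal sketch of these four functions in plain NumPy (rather than a deep learning framework), purely for illustration:

```python
import numpy as np

def sigmoid(x):
    # Smooth, saturating; squashes inputs into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Smooth, saturating; squashes inputs into (-1, 1)
    return np.tanh(x)

def relu(x):
    # Non-saturating for positive inputs; zero for negative inputs
    return np.maximum(0.0, x)

def softmax(x):
    # Converts a vector of scores into a probability distribution
    shifted = x - np.max(x)      # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("sigmoid:", sigmoid(x))
print("tanh:   ", tanh(x))
print("relu:   ", relu(x))
print("softmax:", softmax(x))    # sums to 1
```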
2. ReLU is non-saturating. What does that mean, and how is it useful?
Answer: ReLU (Rectified Linear Unit) is considered “non-saturating” because it does not saturate for large positive inputs. As the input grows, the output remains equal to the input (f(x) = x for x > 0), rather than flattening out at a maximum value.
The non-saturating property of ReLU helps to prevent the vanishing gradient problem, which can occur when gradients become too small (as with tanh and sigmoid at large-magnitude inputs) and lead to very slow or stalled learning in the early layers of a deep network.
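A quick way to see this numerically (a small NumPy sketch, not tied to any particular framework): the sigmoid’s gradient shrinks toward zero as the input grows, while ReLU’s gradient stays at 1 for any positive input.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of sigmoid: sigma(x) * (1 - sigma(x)); approaches 0 as |x| grows
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise
    return (x > 0).astype(float)

xs = np.array([0.0, 2.0, 5.0, 10.0])
for x, sg, rg in zip(xs, sigmoid_grad(xs), relu_grad(xs)):
    print(f"x = {x:5.1f}  sigmoid grad = {sg:.6f}  relu grad = {rg:.1f}")
```

Running this shows the sigmoid gradient dropping from 0.25 at x = 0 to roughly 0.000045 at x = 10, while the ReLU gradient stays at 1.0 — which is why gradients propagated back through many ReLU layers are less prone to vanishing.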