Deep Learning: Activation Functions — 10 Tricky questions

1. What are some common activation functions used in deep learning, and how do they differ from each other?

Rahul S · 4 min read · Aug 17, 2023


Answer: Some common activation functions used in deep learning include the sigmoid, hyperbolic tangent (tanh), rectified linear unit (ReLU), and softmax functions; each is sketched in code after the list below.

  • The sigmoid and tanh functions are both smooth, but their outputs flatten out (saturate) once the input becomes very large or very small, which shrinks their gradients in those regions.
  • The ReLU function does not saturate for positive inputs and has a very simple derivative, making it computationally efficient.
  • The softmax function is used for multiclass classification problems and converts a vector of inputs into a probability distribution.
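
Here is a minimal, illustrative NumPy sketch of the four functions just listed (the definitions and the test vector are illustrative, not code from the article):

```python
import numpy as np

def sigmoid(x):
    # Smooth; squashes any real input into (0, 1) and flattens out
    # (saturates) once |x| is large.
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Smooth; squashes inputs into (-1, 1) and likewise saturates.
    return np.tanh(x)

def relu(x):
    # Piecewise linear: 0 for negative inputs, identity for positive ones.
    return np.maximum(0.0, x)

def softmax(z):
    # Turns a vector of scores into a probability distribution that sums to 1.
    # Subtracting the max is the usual numerical-stability trick.
    e = np.exp(z - np.max(z))
    return e / e.sum()

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(sigmoid(x))  # extremes already close to 0 and 1 (saturation)
print(tanh(x))     # extremes already close to -1 and 1 (saturation)
print(relu(x))     # [0. 0. 0. 1. 5.] -- keeps growing with the input
print(softmax(x))  # non-negative values that sum to 1.0
```

Running it shows the pattern described above: sigmoid and tanh are already pressed against their bounds at ±5, ReLU keeps growing with its positive inputs, and softmax returns a valid probability distribution.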

2. ReLU is non-saturating. What does it mean? How is it useful?

Answer: ReLU (Rectified Linear Unit) is called “non-saturating” because its output does not level off as the input grows: for large positive inputs the output keeps increasing in proportion to the input instead of approaching a fixed maximum value, as sigmoid and tanh do.

The non-saturating property of ReLU helps to prevent the vanishing gradient problem, which occurs when gradients become too small (as they do in the saturated regions of tanh and sigmoid) and lead to very slow or stalled learning in the earlier layers of a deep network.
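
To make the gradient argument concrete, here is a small, illustrative NumPy sketch (again illustrative, not code from the article) comparing the derivative of sigmoid with that of ReLU:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = s * (1 - s): at most 0.25 (at x = 0), and it
    # decays toward 0 as |x| grows -- the saturation that shrinks
    # gradients when many such layers are chained together.
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x):
    # d/dx ReLU(x) = 1 for positive inputs, 0 otherwise: the gradient
    # keeps full strength no matter how large the positive input is.
    return (x > 0).astype(x.dtype)

x = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print(sigmoid_grad(x))  # ~[4.5e-05, 0.105, 0.25, 0.105, 4.5e-05]
print(relu_grad(x))     # [0. 0. 0. 1. 1.]
```

The sigmoid gradient peaks at 0.25 and collapses toward zero for large-magnitude inputs, so stacking many sigmoid layers multiplies many small numbers together during backpropagation; the ReLU gradient stays at exactly 1 for any positive input, which keeps gradients from shrinking as they propagate back.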
