**(1) VANISHING GRADIENT**: One of the main drawbacks of the sigmoid activation function is that it **saturates at high and low input values**. As the input becomes very large, the output approaches 1; as it becomes very negative, the output approaches 0.

In these saturated regions the derivative σ'(x) = σ(x)(1 − σ(x)) is close to zero (and never exceeds 0.25 anywhere), so the gradients backpropagated through many sigmoid layers shrink toward zero, making deep networks difficult to train. This is known as the **“vanishing gradient”** problem.

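A small NumPy sketch (illustrative only) makes this concrete: the sigmoid's derivative collapses toward zero for saturated inputs, and the chain rule multiplies one such factor per layer.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: sigma(x) * (1 - sigma(x)), at most 0.25
    s = sigmoid(x)
    return s * (1.0 - s)

# Gradient magnitude at increasingly saturated inputs
for x in [0.0, 2.0, 5.0, 10.0, 20.0]:
    print(f"x = {x:5.1f}  sigmoid = {sigmoid(x):.6f}  gradient = {sigmoid_grad(x):.2e}")

# With many stacked sigmoid layers, the chain rule multiplies these factors,
# so even the best case (0.25 per layer) shrinks quickly:
print("10 layers, best case:", 0.25 ** 10)
```
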
**(2) NOT ZERO-CENTERED**: Another drawback of the sigmoid activation function is that its output is not zero-centered: sigmoid outputs always lie in (0, 1), so **the activations fed into the next layer are always positive**. During backpropagation, the gradients of all the weights of a given neuron then share the same sign, which forces inefficient, zig-zagging updates when using gradient-descent-based optimization.

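A minimal sketch of why this matters, assuming a single downstream neuron with a hypothetical scalar upstream gradient: because every activation is positive, all of that neuron's weight gradients end up with the same sign.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Activations coming out of a sigmoid layer are always in (0, 1), never negative
a = sigmoid(rng.normal(size=5))
print("activations:", np.round(a, 3))

# For a neuron in the next layer, the gradient w.r.t. each weight w_i is
# (upstream gradient) * a_i. Since every a_i > 0, all weight gradients
# share the sign of the single upstream gradient term.
upstream_grad = -0.7  # hypothetical scalar from backpropagation
weight_grads = upstream_grad * a
print("weight gradients:", np.round(weight_grads, 3))
print("all same sign:", np.all(weight_grads < 0) or np.all(weight_grads > 0))
```
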
**(3) COMPUTATIONALLY INEFFICIENT**: Finally, the sigmoid activation function is **not as computationally efficient as other activation functions**, such as the ReLU activation function: it requires evaluating an exponential, whereas ReLU is a simple threshold. This can be a concern when training large deep learning networks.
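
A rough timing sketch of the two activations (exact numbers depend on hardware and the NumPy build, so treat this as illustrative only):

```python
import timeit
import numpy as np

x = np.random.default_rng(0).normal(size=1_000_000)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))   # requires an exponential per element

def relu(v):
    return np.maximum(v, 0.0)         # a simple elementwise threshold

print("sigmoid:", timeit.timeit(lambda: sigmoid(x), number=100))
print("relu:   ", timeit.timeit(lambda: relu(x), number=100))
```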