Machine Learning: Regularization for Overfitting

Rahul S
2 min read · Aug 23, 2023

In regularization, we add a penalty term to the loss function during model training. This penalty discourages the model from learning overly complex patterns in the training data.

In machine learning, models aim to minimize a loss function that quantifies the error between their predictions and the actual target values in the training data.

Regularization introduces an additional term to the loss function, which is a function of the model’s parameters (weights and biases). This term penalizes large parameter values and complexity in the model.
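The idea above can be sketched in a few lines of NumPy: the regularized loss is just the ordinary data loss plus a term that grows with the size of the weights. The function name, toy data, and the choice of mean squared error as the data loss are illustrative assumptions, not a specific library API.

```python
import numpy as np

def regularized_loss(w, X, y, lam):
    """MSE data loss plus an L2 penalty on the weights (illustrative sketch)."""
    data_loss = np.mean((X @ w - y) ** 2)  # error on the training data
    penalty = lam * np.sum(w ** 2)         # grows with parameter magnitude
    return data_loss + penalty

# Toy example: both weight vectors are scored on the same data,
# but the larger one pays a larger penalty.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, 1.0])
w_small = np.array([1.0, 1.0])
w_large = np.array([3.0, -2.0])

print(regularized_loss(w_small, X, y, lam=0.1))  # zero data loss + 0.1 * (1 + 1)
print(regularized_loss(w_large, X, y, lam=0.1))  # worse fit AND bigger penalty
```

Minimizing this combined objective forces the model to trade off fitting the training data against keeping its parameters small.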

Two Common Regularization Techniques:

a. L2 Regularization (Ridge Regression):

  • L2 regularization adds a penalty term that is proportional to the square of the magnitude of the model’s parameters.
  • The regularization term is typically expressed as λ * Σ(w_i^2), where λ is the regularization strength and w_i are the model parameters.
  • By including this term in the loss function, the model is encouraged to keep the parameter values small, effectively reducing their impact on the predictions.
  • This discourages the model from fitting noise.
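For linear models, ridge regression even has a closed-form solution, w = (XᵀX + λI)⁻¹Xᵀy, which makes the shrinkage effect easy to see. The sketch below (hypothetical helper name and synthetic data, not a library API) fits the same data with a weak and a strong penalty; the stronger λ pulls every weight closer to zero.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)

# Synthetic data with known true weights [2, -1, 0.5] plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)

w_weak = ridge_fit(X, y, lam=0.01)    # mild penalty: close to ordinary least squares
w_strong = ridge_fit(X, y, lam=100.0)  # strong penalty: weights shrunk toward zero

print(np.linalg.norm(w_weak), np.linalg.norm(w_strong))
```

The norm of the weight vector decreases as λ grows, which is exactly the "keep the parameter values small" behavior described above.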