Regularization adds a penalty term to the loss function during model training. This penalty discourages the model from learning overly complex patterns from the training data.

In machine learning, models aim to minimize a loss function that quantifies the error between their predictions and the actual target values in the training data.

Regularization introduces an additional term to the loss function, which is a function of the model’s parameters (weights and biases). **This term penalizes large parameter values** and complexity in the model.
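As a minimal sketch of this idea, the regularized loss is just the ordinary data loss plus a weighted penalty on the parameters. The function and argument names below (`regularized_loss`, `lam`) are illustrative, not from any particular library:

```python
def regularized_loss(preds, targets, weights, lam):
    """Mean squared error plus an L2 penalty on the weights.

    lam (the regularization strength λ) controls how strongly
    large parameter values are penalized.
    """
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
    penalty = lam * sum(w ** 2 for w in weights)
    return mse + penalty
```

With `lam = 0` this reduces to the plain loss; increasing `lam` trades training-set fit for smaller parameter values.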

## Two Common Regularization Techniques:

a. L2 Regularization (Ridge Regression):

- L2 regularization adds a penalty term that is proportional to the square of the magnitude of the model’s parameters.
- The regularization term is typically expressed as `λ * Σ(w_i^2)`, where `λ` is the regularization strength and `w_i` are the model parameters.
- By including this term in the loss function, the model is encouraged to keep the parameter values small, effectively reducing their impact on the predictions.
- This discourages the model from fitting noise.
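The shrinking effect of the L2 penalty can be seen in a small sketch: fitting `y ≈ w * x` by gradient descent with and without the penalty. The setup (tiny 1-D dataset, names `ridge_fit`, `lam`, `lr`) is illustrative, assuming a learning rate and step count small enough to converge:

```python
def ridge_fit(xs, ys, lam, lr=0.01, steps=2000):
    """Fit y ≈ w * x by gradient descent on MSE + lam * w^2."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of (1/n) * Σ(w*x - y)^2 + lam * w^2 with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n + 2 * lam * w
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # data generated by y = 2x

w_plain = ridge_fit(xs, ys, lam=0.0)  # no penalty: recovers w ≈ 2.0
w_ridge = ridge_fit(xs, ys, lam=1.0)  # penalty pulls w below 2.0
```

With `lam = 1.0` the fitted weight is smaller than the unregularized one, illustrating how the penalty keeps parameter values small rather than letting them exactly fit the training data.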