The loss function quantifies the disparity between a model’s predictions and the actual ground truth labels.

At its core, the loss function measures how accurately a machine learning model's predictions match reality. In classification tasks, the goal is to assign a label or class to an input data point.

The loss function helps us quantify the “cost” associated with the model’s predictions.

Specifically, it evaluates how far off the model’s prediction is from the actual target label.

The loss function is a mathematical function that takes as input

- the model’s predicted probabilities for each class and
- the true class label.

It then produces a scalar value that indicates the quality of the model’s prediction.

The cross-entropy loss, often used for classification tasks, lends itself particularly well to interpretation.

The cross-entropy loss can be mathematically expressed as follows:

`L = -Σ_i y_i · log(p_i)`

In this expression:

- `y_i` represents the true probability distribution over the classes (i.e., the true labels).
- `p_i` represents the predicted probability distribution over the classes, generated by the model.

The cross-entropy loss measures the dissimilarity between the predicted probabilities (`p_i`) and the true probabilities (`y_i`).

The loss is essentially an information measure. It quantifies how well the model’s predictions align with the actual probabilities of class occurrences.
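As a minimal sketch of this definition (plain Python, no ML framework assumed; the function name is illustrative), the sum above can be computed directly from the two distributions:

```python
import math

def cross_entropy(y, p):
    """Cross-entropy between a true distribution y and a predicted distribution p.

    Both arguments are lists of per-class probabilities that each sum to 1.
    Only the terms where y_i > 0 contribute, so a one-hot label picks out
    -log(p) for the true class.
    """
    return -sum(y_i * math.log(p_i) for y_i, p_i in zip(y, p) if y_i > 0)

# One-hot true label: the correct class is class 1 (of 3).
y_true = [0.0, 1.0, 0.0]

print(cross_entropy(y_true, [0.1, 0.8, 0.1]))  # -log(0.8) ≈ 0.223
print(cross_entropy(y_true, [0.0, 1.0, 0.0]))  # perfect prediction → 0.0
```

Note that with a one-hot `y`, the whole sum collapses to a single term, which is why the loss is often written simply as the negative log-probability of the true class.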

When the predicted and true probabilities match perfectly, the loss is zero, indicating perfect alignment.

**Logarithmic Scale:** The use of logarithms in the expression magnifies differences between predicted and true probabilities. When the model is confident and correct (i.e., assigns a high `p_i` to the true class), the loss approaches zero. Conversely, if the model is uncertain or incorrect, the loss increases significantly.
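A quick numeric sketch of this effect, looking only at the `-log(p)` term for the true class (assuming a one-hot true label):

```python
import math

# As the model's confidence in the true class drops, the loss grows
# much faster than linearly — that is the logarithm at work.
for p in (0.99, 0.9, 0.5, 0.1, 0.01):
    print(f"p = {p:5}: loss = {-math.log(p):.3f}")
```

Dropping the true-class probability from 0.99 to 0.01 takes the loss from roughly 0.01 to roughly 4.6, so confident mistakes are penalized far more heavily than mild uncertainty.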

I suggest that you read another piece to go a little deeper: