Machine Learning: Interpretation of Loss Function with Cross-Entropy Loss

Rahul S
2 min read · Oct 2, 2023

The loss function quantifies the disparity between a model’s predictions and the actual ground truth labels.

At its core, the loss function measures how accurately a machine learning model's predictions match reality. In classification tasks, the goal is to assign a label, or class, to an input data point.

The loss function helps us quantify the “cost” associated with the model’s predictions.

Specifically, it evaluates how far off the model’s prediction is from the actual target label.

The loss function is a mathematical function that takes as input

  • the model’s predicted probabilities for each class and
  • the true class label.

It then produces a scalar value that indicates the quality of the model’s prediction.

The cross-entropy loss, often used for classification tasks, lends itself particularly well to interpretation.

The cross-entropy loss can be expressed mathematically as follows:

L = −Σ_i y_i · log(p_i)

In this expression:

  • y_i represents the true probability distribution of the classes (i.e., the true labels).
  • p_i represents the predicted probability distribution of the classes generated by the model.

The cross-entropy loss measures the dissimilarity between the predicted probabilities (p_i) and the true probabilities (y_i).
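As a minimal sketch of this computation (pure Python, no libraries; the function name and the small epsilon used to avoid log(0) are my own choices, not from any standard API):

```python
import math

def cross_entropy(y_true, p_pred, eps=1e-12):
    """Cross-entropy between a true distribution y_true and a
    predicted distribution p_pred over the same classes."""
    # Clip each predicted probability away from 0 so log() stays finite.
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, p_pred))

# One-hot true label: the correct class is class 1 (of 3 classes).
y = [0.0, 1.0, 0.0]
p = [0.1, 0.8, 0.1]   # model assigns 0.8 to the correct class
loss = cross_entropy(y, p)  # = -log(0.8) ≈ 0.223
```

With a one-hot label, every term but one vanishes, so the loss reduces to −log of the probability the model assigned to the true class.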

The loss is essentially an information measure. It quantifies how well the model’s predictions align with the actual probabilities of class occurrences.

When the predicted and true probabilities match perfectly, the loss is zero, indicating perfect alignment.

Logarithmic Scale: The use of logarithms in the expression magnifies differences between predicted and true probabilities. When the model is confident and correct (i.e., high p_i for the true class), the loss approaches zero. Conversely, if the model is uncertain or incorrect, the loss increases significantly.
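To illustrate this magnifying effect, here is a small numeric sketch (same hand-rolled helper as above, with made-up probability vectors chosen only for illustration):

```python
import math

def cross_entropy(y_true, p_pred, eps=1e-12):
    # Clip predictions away from 0 so log() stays finite.
    return -sum(y * math.log(max(p, eps)) for y, p in zip(y_true, p_pred))

y = [0.0, 1.0, 0.0]  # true class is class 1

perfect   = cross_entropy(y, [0.00, 1.00, 0.00])  # exactly 0: perfect alignment
confident = cross_entropy(y, [0.01, 0.98, 0.01])  # ≈ 0.02: confident and correct
uncertain = cross_entropy(y, [0.30, 0.40, 0.30])  # ≈ 0.92: hedging its bets
wrong     = cross_entropy(y, [0.90, 0.05, 0.05])  # ≈ 3.00: confidently wrong
```

Note how the penalty grows gently near a correct, confident prediction but blows up as the probability assigned to the true class approaches zero.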

I suggest that you read another piece to go a little deeper: