Machine Learning: Bagging and Boosting

Rahul S
3 min read · Sep 6, 2023


Bagging and boosting are both ensemble machine learning techniques used to improve the performance of weak learners (often decision trees) by combining their predictions.

However, they differ in how they construct their training sets and how they handle errors. Let's explore the differences between bagging and boosting in detail:

BAGGING (BOOTSTRAP AGGREGATION)

  1. Bootstrap Sampling: In bagging, multiple subsets of the training dataset are created through bootstrap sampling: for each subset, we randomly sample data points with replacement from the original dataset. Each subset is the same size as the original dataset, but because of sampling with replacement it contains duplicate data points and omits others.
  2. Parallel Training: Each subset is used to train a separate base model. These base models are trained independently in parallel. Since each subset is slightly different, the base models capture different patterns or errors.
  3. Voting or Averaging: During inference, every base model makes its own prediction. The final prediction is determined by majority voting (for classification) or averaging (for regression), as sketched in the code below.
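Here is a minimal sketch of these three steps in Python, assuming scikit-learn is available and using a decision tree as the base learner on a synthetic dataset; the model count and hyperparameters are illustrative, not prescriptive.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

n_models = 25  # odd count avoids ties in a binary majority vote
models = []
for _ in range(n_models):
    # Step 1: bootstrap sample -- draw n rows with replacement.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    # Step 2: train one base model per bootstrap sample, independently.
    tree = DecisionTreeClassifier().fit(X_train[idx], y_train[idx])
    models.append(tree)

# Step 3: majority vote across the base models' predictions.
all_preds = np.stack([m.predict(X_test) for m in models])  # (n_models, n_test)
majority = np.round(all_preds.mean(axis=0)).astype(int)    # works for 0/1 labels
print(f"Bagged accuracy: {(majority == y_test).mean():.3f}")
```

In practice you would rarely hand-roll this loop: scikit-learn's BaggingClassifier wraps the same three steps, and RandomForestClassifier additionally subsamples features at each split.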
