AdaBoost (Adaptive Boosting) is an ensemble learning algorithm for binary classification that concentrates on the data points the current ensemble misclassifies. It iteratively trains weak classifiers on reweighted data and combines them into a strong classifier. Its key steps:
- Initialization: Assign equal weights to all training data points.
- Iterative Training: In each round, train a weak classifier, often a decision stump, on the weighted data, so that it emphasizes the points misclassified in earlier rounds.
- Classifier Weight Calculation: Compute the weak classifier's weighted error ε and from it its weight in the ensemble, α = ½ ln((1 − ε)/ε); lower error yields a higher weight.
- Weight Update: Adjust each data point's weight based on whether the current weak classifier classified it correctly: increase the weights of misclassified points and decrease those of correctly classified ones.
- Normalization of Weights: Rescale the data point weights so they again form a probability distribution.
- Ensemble Creation: Add the weak classifier to the ensemble together with its weight α.
- Repeat Iterations: Continue for a preset number of rounds or until the desired accuracy is reached.
- Final Classification: Compute the final prediction as the sign of the α-weighted sum of the weak classifiers' predictions.
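The steps above can be sketched in a minimal from-scratch implementation. This is an illustrative version, not a reference one: it assumes 1-D features, labels in {−1, +1}, and decision stumps found by brute-force threshold search.

```python
import numpy as np

def adaboost_train(X, y, n_rounds=10):
    """AdaBoost with threshold stumps on a 1-D feature array X.
    y must contain labels in {-1, +1}. Returns (threshold, polarity, alpha) triples."""
    n = len(X)
    w = np.full(n, 1.0 / n)                  # initialization: equal weights
    ensemble = []
    for _ in range(n_rounds):
        # iterative training: pick the stump with the lowest weighted error
        best = None
        for thr in np.unique(X):
            for pol in (1, -1):
                pred = np.where(pol * (X - thr) >= 0, 1, -1)
                err = np.sum(w[pred != y])
                if best is None or err < best[0]:
                    best = (err, thr, pol, pred)
        err, thr, pol, pred = best
        err = max(err, 1e-10)                # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # classifier weight calculation
        ensemble.append((thr, pol, alpha))   # ensemble creation
        # weight update: up-weight mistakes, down-weight correct points
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()                         # normalization of weights
    return ensemble

def adaboost_predict(X, ensemble):
    # final classification: sign of the alpha-weighted vote
    total = sum(alpha * np.where(pol * (X - thr) >= 0, 1, -1)
                for thr, pol, alpha in ensemble)
    return np.sign(total)
```

On a toy dataset such as `X = [0, 1, 2, 3, 4, 5]` with labels `[-1, -1, -1, 1, 1, 1]`, a single stump already separates the classes, and the ensemble reproduces the labels exactly.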
AdaBoost’s strengths include improved accuracy, flexibility in the choice of weak classifier, resistance to overfitting in many settings, and few parameters to tune. However, it is sensitive to noisy data and outliers, can overfit when too many weak classifiers are used, can be computationally expensive, and may be biased toward complex weak classifiers. It also does not produce calibrated probability estimates directly.
In practice, AdaBoost adapts to the hardest data points and often yields strong classifiers, but it should be applied judiciously, with attention to the nature of the data and the problem.
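For practical use, a library implementation is usually preferable to hand-rolled code. A brief sketch using scikit-learn's `AdaBoostClassifier` (assuming scikit-learn is installed; the dataset here is synthetic and purely illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# synthetic binary classification problem, for illustration only
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# n_estimators (the number of weak classifiers) is the main knob to tune;
# the default base estimator is a decision stump
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X_tr, y_tr)
test_accuracy = clf.score(X_te, y_te)
```

Keeping `n_estimators` modest and monitoring held-out accuracy is the usual guard against the overfitting and noise-sensitivity issues noted above.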