Machine Learning: Data Drift and Concept Drift

This article explores the concepts of data drift and concept drift in machine learning models. It discusses the challenges posed by these drifts and provides strategies for monitoring and mitigating their effects.

Rahul S
6 min read · Oct 7, 2023

DATA DRIFT

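Data drift refers to a change over time in the statistical distribution of a model's input data, so that the data the model sees in production no longer resembles the data it was trained on.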

This can happen for several reasons, including changes in data sources, environmental factors, or user behavior.

CHANGE IN DATA SOURCE: Consider a machine learning model that predicts customer churn from historical data. If the source of that data changes, for example because a new system is introduced to collect customer information, the data the model was trained on may no longer represent the current customers accurately. The model then receives inputs that differ from what it learned on, which is data drift.

ENVIRONMENTAL FACTORS: Data drift can also occur when the environment in which the data is collected changes. For instance, if the data comes from sensors in a manufacturing plant and the plant is renovated or upgraded, the sensor readings may shift, changing the input data even though the model itself is unchanged.
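One common way to check a numeric feature for this kind of shift is to compare a reference sample (for example, the training data) with a recent sample using the two-sample Kolmogorov-Smirnov statistic. The sketch below is a minimal illustration using only the Python standard library, not a method prescribed by this article. The function names, the `alpha` default, and the simulated sensor values are all assumptions made for the example.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # is always reached at one of the observed values.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the large-sample critical value."""
    n, m = len(reference), len(current)
    # Asymptotic critical value: c(alpha) * sqrt((n + m) / (n * m)),
    # where c(alpha) = sqrt(-ln(alpha / 2) / 2).
    c_alpha = math.sqrt(-math.log(alpha / 2) / 2)
    threshold = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > threshold


# Hypothetical sensor readings (made-up values for illustration):
# the mean reading shifts after a plant upgrade.
rng = random.Random(0)
before_upgrade = [rng.gauss(50.0, 2.0) for _ in range(500)]
after_upgrade = [rng.gauss(53.0, 2.0) for _ in range(500)]

print("KS statistic:", ks_statistic(before_upgrade, after_upgrade))
print("Drift detected:", drift_detected(before_upgrade, after_upgrade))
```

A check like this looks at one feature at a time and only tells you that the distribution has moved, not why. In practice it would be run per feature on a schedule, with the threshold tuned to how many false alarms are acceptable.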
