Machine Learning: Linear Regression Assumptions

Rahul S
2 min read · Apr 16, 2023
src: Analytics Vidhya

Linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables.

It assumes that the relationship between the dependent variable and the independent variable(s) can be represented by a straight line. The goal of linear regression is to find the best-fitting line that describes the relationship between the variables.
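As a minimal sketch of this idea, the line of best fit can be computed in closed form with ordinary least squares; the toy data below is hypothetical, chosen only for illustration:

```python
# Illustrative only: simple linear regression via closed-form
# ordinary least squares on hypothetical toy data.
from statistics import mean

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.9]  # roughly y = 2x

x_bar, y_bar = mean(x), mean(y)
# slope = covariance(x, y) / variance(x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar
print(f"y = {slope:.3f}x + {intercept:.3f}")  # → y = 1.970x + 0.110
```

In practice a library such as scikit-learn or statsmodels would be used, but the fitted line is the same: the slope and intercept that minimize the sum of squared residuals.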

Assumptions of Linear Regression:

  1. Linearity: The relationship between the dependent variable and the independent variable(s) should be linear. This means that the line of best fit should be a straight line.
  2. Independence of errors: The errors (deviations of the observed values from the predicted values) should be independent of each other. One observation's error should carry no information about another's; a common violation is autocorrelation in time-series data.
  3. Homoscedasticity: The variance of the errors should be constant across all levels of the independent variable(s). This means that the spread of the residuals should be roughly the same for all values of the independent variable(s).
  4. Normality of residuals: The errors should be normally distributed. This means that the distribution of the residuals should be roughly symmetrical around zero.
  5. No multicollinearity: The independent variables should not be highly correlated with each other; strong correlation makes the coefficient estimates unstable and hard to interpret.
  6. No influential outliers: Linear regression is sensitive to outliers, high-leverage points, and influential observations, any of which can pull the fitted line away from the bulk of the data.
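A few of these assumptions can be checked informally by inspecting the residuals of a fitted model. The sketch below is illustrative only (the data and the rough "spread ratio" heuristic are assumptions, not from the article):

```python
# Illustrative residual diagnostics on hypothetical toy data.
from statistics import mean, pstdev

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.2, 3.8, 6.1, 8.3, 9.7, 12.2, 13.8, 16.1]

# Fit by ordinary least squares (closed form).
xb, yb = mean(x), mean(y)
slope = sum((a - xb) * (b - yb) for a, b in zip(x, y)) / sum(
    (a - xb) ** 2 for a in x
)
intercept = yb - slope * xb

residuals = [b - (slope * a + intercept) for a, b in zip(x, y)]

# Residuals from OLS always average to zero; a skewed or lopsided
# distribution around zero would hint at a normality problem.
assert abs(mean(residuals)) < 1e-9

# Crude homoscedasticity check: the spread of residuals in the first
# half of the data should be similar to the spread in the second half.
half = len(residuals) // 2
spread_lo = pstdev(residuals[:half])
spread_hi = pstdev(residuals[half:])
print(f"spread ratio: {spread_hi / spread_lo:.2f}")  # ~1 suggests constant variance
```

Formal versions of these checks exist (e.g. the Durbin-Watson test for independence, Breusch-Pagan for homoscedasticity, Shapiro-Wilk for normality, and variance inflation factors for multicollinearity), and residual-vs-fitted plots are the usual first step.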
