# Partial Least Squares (PLS): Taming High-Dimensional Data and Capturing Complex Relationships

Partial Least Squares (PLS) regression is a multivariate statistical technique used for modeling the relationships between predictor variables (X) and a response variable (Y).

It is particularly useful when dealing with datasets that have high dimensionality, multicollinearity, or noisy variables. PLS aims to find a set of latent variables, called components, that capture the most important information from both X and Y.

## Mathematics of PLS Regression:

Let’s consider a scenario with *N* observations and *p* predictor variables (attributes) in X, and a single response variable Y. PLS constructs a set of orthogonal components, which are linear combinations of the original predictor variables:

**Weights Calculation**: PLS starts by finding a weight vector*wk* that maximizes the covariance between*X*and*Y*. This is achieved through iterations of weight updates.**Scores Calculation**: Once*wk* is determined, the scores*tk* are calculated by projecting*X*onto*wk*.**Residual Calculation**: The residuals*Ek* (error) between*Y*and the scores*tk* are calculated.**Loading Vector Calculation**: The loading vector*ck* is calculated by regressing…