Schematic illustration of partial least squares regression. Two blocks of data, X and Y, are projected by w and c onto latent components, t and u, and least squares regression is performed. p and q represent loading vectors. https://doi.org/10.1371/journal.pone.0179638.g001

Partial Least Squares (PLS): Taming High-Dimensional Data and Capturing Complex Relationships

3 min readAug 6, 2023

Partial Least Squares (PLS) regression is a multivariate statistical technique used for modeling the relationships between predictor variables (X) and a response variable (Y).

It is particularly useful when dealing with datasets that have high dimensionality, multicollinearity, or noisy variables. PLS aims to find a set of latent variables, called components, that capture the most important information from both X and Y.

Mathematics of PLS Regression:

Let’s consider a scenario with N observations and p predictor variables (attributes) in X, and a single response variable Y. PLS constructs a set of orthogonal components, which are linear combinations of the original predictor variables:

Weights Calculation: PLS starts by finding a weight vector wk that maximizes the covariance between X and Y. This is achieved through iterations of weight updates.
Scores Calculation: Once wk is determined, the scores tk are calculated by projecting X onto wk.
Residual Calculation: The residuals Ek (error) between Y and the scores tk are calculated.
Loading Vector Calculation: The loading vector ck is calculated by regressing…

Partial Least Squares (PLS): Taming High-Dimensional Data and Capturing Complex Relationships

Mathematics of PLS Regression:

Written by Rahul S