MLpp
ml::LinearRegression Namespace Reference

Linear regression algorithms.

Classes

struct  LassoRegressionResult
 Result of a multivariate Lasso regression with intercept.

struct  MultivariateOLSResult
 Result of multivariate Ordinary Least Squares regression.

class  RecursiveMultivariateOLS
 Recursive multivariate Ordinary Least Squares.

struct  RegularisedRegressionResult
 Result of a multivariate regularised regression with intercept.

struct  Result
 Result of linear regression.

struct  RidgeRegressionResult
 Result of a multivariate ridge regression with intercept.

struct  UnivariateOLSResult
 Result of 1D Ordinary Least Squares regression (with or without intercept).
 

Functions

UnivariateOLSResult univariate (Eigen::Ref< const Eigen::VectorXd > x, Eigen::Ref< const Eigen::VectorXd > y)
 Carries out univariate (aka simple) linear regression with intercept.

UnivariateOLSResult univariate (double x0, double dx, Eigen::Ref< const Eigen::VectorXd > y)
 Carries out univariate (aka simple) linear regression with intercept on regularly spaced points.

UnivariateOLSResult univariate_without_intercept (Eigen::Ref< const Eigen::VectorXd > x, Eigen::Ref< const Eigen::VectorXd > y)
 Carries out univariate (aka simple) linear regression without intercept.

MultivariateOLSResult multivariate (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y)
 Carries out multivariate linear regression.
 
template<bool DoStandardise>
RidgeRegressionResult ridge (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda)
 Carries out multivariate ridge regression with intercept.

template<>
RidgeRegressionResult ridge< true > (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda)
 Carries out multivariate ridge regression with intercept, standardising X inputs internally.

template<>
RidgeRegressionResult ridge< false > (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda)
 Carries out multivariate ridge regression with intercept, assuming standardised X inputs.

RidgeRegressionResult ridge (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda, bool do_standardise)
 Carries out multivariate ridge regression with intercept, allowing the user to switch internal standardisation of X data on or off.
 
template<bool DoStandardise>
LassoRegressionResult lasso (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda)
 Carries out multivariate Lasso regression with intercept.

template<>
LassoRegressionResult lasso< true > (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda)
 Carries out multivariate Lasso regression with intercept, standardising X inputs internally.

template<>
LassoRegressionResult lasso< false > (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda)
 Carries out multivariate Lasso regression with intercept, assuming standardised X inputs.

LassoRegressionResult lasso (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda, bool do_standardise)
 Carries out multivariate Lasso regression with intercept, allowing the user to switch internal standardisation of X data on or off.
 
template<class Regression >
double press (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, Regression regression)
 Calculates the PRESS statistic (Predicted Residual Error Sum of Squares).

template<bool WithIntercept>
double press_univariate (Eigen::Ref< const Eigen::VectorXd > x, Eigen::Ref< const Eigen::VectorXd > y)
 Calculates the PRESS statistic (Predicted Residual Error Sum of Squares) for univariate regression.

Eigen::MatrixXd add_ones (Eigen::Ref< const Eigen::MatrixXd > X)
 Adds another row with 1s in every column to X.

void standardise (Eigen::Ref< Eigen::MatrixXd > X)
 Standardises independent variables.

void standardise (Eigen::Ref< Eigen::MatrixXd > X, Eigen::VectorXd &means, Eigen::VectorXd &standard_deviations)
 Standardises independent variables.

void unstandardise (Eigen::Ref< Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > means, Eigen::Ref< const Eigen::VectorXd > standard_deviations)
 Reverses the outcome of standardise().
 

Detailed Description

Linear regression algorithms.

For multivariate regression we depart from the textbook convention and assume that independent variables X are laid out columnwise, i.e., data points are in columns.

Function Documentation

◆ univariate() [1/2]

UnivariateOLSResult ml::LinearRegression::univariate ( Eigen::Ref< const Eigen::VectorXd >  x,
Eigen::Ref< const Eigen::VectorXd >  y 
)

Carries out univariate (aka simple) linear regression with intercept.

Parameters
[in] x  X vector.
[in] y  Y vector.
Returns
UnivariateOLSResult object.
Exceptions
std::invalid_argument  If x and y have different sizes, or if their size is less than 2.
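
The quantity being estimated can be illustrated with the textbook closed-form solution for simple linear regression. The sketch below is standalone C++ on std::vector and is not MLpp's implementation; SimpleFit and simple_ols are hypothetical names introduced only for this illustration, and the assert on identical x values is a guard of the sketch, not documented library behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical result type for this sketch; not MLpp's UnivariateOLSResult.
struct SimpleFit {
    double slope;
    double intercept;
};

// Textbook closed-form OLS fit of y = slope * x + intercept.
SimpleFit simple_ols(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size() || x.size() < 2) {
        throw std::invalid_argument("x and y must have the same size, at least 2");
    }
    const double n = static_cast<double>(x.size());
    double mean_x = 0.0;
    double mean_y = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        mean_x += x[i];
        mean_y += y[i];
    }
    mean_x /= n;
    mean_y /= n;

    double sxx = 0.0;  // sum of squared deviations of x
    double sxy = 0.0;  // sum of cross-deviations of x and y
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double dx = x[i] - mean_x;
        sxx += dx * dx;
        sxy += dx * (y[i] - mean_y);
    }
    assert(sxx > 0.0);  // sketch-only guard: x values must not all be identical

    const double slope = sxy / sxx;
    return {slope, mean_y - slope * mean_x};
}
```

For points lying exactly on y = 2x + 1, this sketch recovers slope 2 and intercept 1.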

◆ univariate() [2/2]

UnivariateOLSResult ml::LinearRegression::univariate ( double  x0,
double  dx,
Eigen::Ref< const Eigen::VectorXd >  y 
)

Carries out univariate (aka simple) linear regression with intercept on regularly spaced points.

Parameters
[in] x0  First X value.
[in] dx  Positive X increment.
[in] y  Y vector.
Returns
UnivariateOLSResult object.
Exceptions
std::invalid_argument  If y.size() < 2.
std::domain_error  If dx <= 0.
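
On a regular grid x_i = x0 + i * dx, the mean of x and the sum of squared deviations of x have closed forms, so the x vector never needs to be materialised. The sketch below shows that arithmetic in standalone C++; whether MLpp uses these formulas internally is an assumption, and GridFit and grid_ols are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical result type for this sketch; not MLpp's UnivariateOLSResult.
struct GridFit {
    double slope;
    double intercept;
};

// OLS fit of y_i = slope * (x0 + i * dx) + intercept for i = 0..n-1.
GridFit grid_ols(double x0, double dx, const std::vector<double>& y) {
    if (y.size() < 2) {
        throw std::invalid_argument("y must have at least 2 elements");
    }
    if (dx <= 0.0) {
        throw std::domain_error("dx must be positive");
    }
    const double n = static_cast<double>(y.size());
    // Closed forms for an arithmetic progression of n points.
    const double mean_x = x0 + dx * (n - 1.0) / 2.0;
    const double sxx = dx * dx * n * (n * n - 1.0) / 12.0;
    assert(sxx > 0.0);  // invariant: holds because dx > 0 and n >= 2

    double mean_y = 0.0;
    for (double v : y) {
        mean_y += v;
    }
    mean_y /= n;

    double sxy = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i) {
        const double xi = x0 + dx * static_cast<double>(i);
        sxy += (xi - mean_x) * (y[i] - mean_y);
    }
    const double slope = sxy / sxx;
    return {slope, mean_y - slope * mean_x};
}
```

With x0 = 1, dx = 0.5 and y sampled from y = 2x + 1, the sketch returns slope 2 and intercept 1.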

◆ univariate_without_intercept()

UnivariateOLSResult ml::LinearRegression::univariate_without_intercept ( Eigen::Ref< const Eigen::VectorXd >  x,
Eigen::Ref< const Eigen::VectorXd >  y 
)

Carries out univariate (aka simple) linear regression without intercept.

Parameters
[in] x  X vector.
[in] y  Y vector.
Returns
UnivariateOLSResult object with intercept, var_intercept and cov_slope_intercept set to 0.
Exceptions
std::invalid_argument  If x and y have different sizes, or if their size is less than 1.
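
Without an intercept, the least-squares slope reduces to the ratio of the sum of x*y to the sum of x*x. A minimal standalone sketch of that formula follows; slope_through_origin is a hypothetical name, and the code is not taken from MLpp.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Least-squares slope of the model y = slope * x (line through the origin).
double slope_through_origin(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size() || x.empty()) {
        throw std::invalid_argument("x and y must have the same non-zero size");
    }
    double sxx = 0.0;  // sum of x_i^2 (no centring, since there is no intercept)
    double sxy = 0.0;  // sum of x_i * y_i
    for (std::size_t i = 0; i < x.size(); ++i) {
        sxx += x[i] * x[i];
        sxy += x[i] * y[i];
    }
    assert(sxx > 0.0);  // sketch-only guard: at least one x must be non-zero
    return sxy / sxx;
}
```

Because nothing is centred, a single data point with non-zero x is enough to determine the slope, which is consistent with the minimum size of 1 documented above.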

◆ multivariate()

MultivariateOLSResult ml::LinearRegression::multivariate ( Eigen::Ref< const Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  y 
)

Carries out multivariate linear regression.

Given X and y, finds \( \vec{\beta} \) minimising \( \lVert \vec{y} - X^T \vec{\beta} \rVert^2 \).

If fitting with intercept is desired, include a row of 1's in the X values.

Parameters
[in] X  D x N matrix of X values, with data points in columns.
[in] y  Y vector with length N.
Returns
MultivariateOLSResult object.
Exceptions
std::invalid_argument  If y.size() != X.cols() or X.cols() < X.rows().
See also
add_ones()
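
For reference, with data points in columns the minimiser of the objective above satisfies the normal equations \( X X^T \vec{\beta} = X \vec{y} \). When the D x D matrix \( X X^T \) is invertible (which requires N >= D, matching the size check above), this gives the standard closed form below. It is the textbook result written for this page's layout, not a statement about how multivariate() solves the problem numerically.

\[ \hat{\vec{\beta}} = (X X^T)^{-1} X \vec{y} \]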

◆ ridge() [1/2]

template<bool DoStandardise>
RidgeRegressionResult ml::LinearRegression::ridge ( Eigen::Ref< const Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  y,
double  lambda 
)

Carries out multivariate ridge regression with intercept.

Given X and y, finds \( \vec{\beta'} \) and \( \beta_0 \) minimising \( \lVert \vec{y} - X^T \vec{\beta'} - \beta_0 \rVert^2 + \lambda \lVert \vec{\beta'} \rVert^2 \), where \( \vec{\beta'} \) and \( \beta_0 \) are concatenated as RidgeRegressionResult::beta in the returned RidgeRegressionResult object.

The matrix X is either assumed to be standardised (DoStandardise == false) or is standardised internally (DoStandardise == true; requires a matrix copy).

Parameters
[in] X  D x N matrix of X values, with data points in columns. Should NOT contain a row with all 1's.
[in] y  Y vector with length N.
[in] lambda  Regularisation strength.
Template Parameters
DoStandardise  Whether to standardise X internally.
Returns
RidgeRegressionResult object with beta.size() == X.rows() + 1. If DoStandardise == true, beta will be rescaled and shifted to original X units and origins, and cov will be transformed accordingly.
Exceptions
std::invalid_argument  If y.size() != X.cols() or X.cols() < X.rows().
std::domain_error  If lambda < 0.
See also
standardise()
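
For reference, when every row of X has zero mean (which standardisation guarantees), the ridge objective has a standard closed-form minimiser. Setting the derivative with respect to \( \beta_0 \) to zero gives the mean of y, and the remaining problem is a regularised set of normal equations. Here I denotes the D x D identity matrix; this is the textbook result for the D x N layout, not a description of the internals of ridge().

\[ \beta_0 = \frac{1}{N} \sum_{i=1}^{N} y_i, \qquad \vec{\beta'} = (X X^T + \lambda I)^{-1} X \vec{y} \]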

◆ ridge< true >()

template<>
RidgeRegressionResult ml::LinearRegression::ridge< true > ( Eigen::Ref< const Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  y,
double  lambda 
)

Carries out multivariate ridge regression with intercept, standardising X inputs internally.

See also
ridge().

◆ ridge< false >()

template<>
RidgeRegressionResult ml::LinearRegression::ridge< false > ( Eigen::Ref< const Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  y,
double  lambda 
)

Carries out multivariate ridge regression with intercept, assuming standardised X inputs.

See also
ridge().

◆ ridge() [2/2]

RidgeRegressionResult ml::LinearRegression::ridge ( Eigen::Ref< const Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  y,
double  lambda,
bool  do_standardise 
)
inline

Carries out multivariate ridge regression with intercept, allowing the user to switch internal standardisation of X data on or off.

See also
ridge().

◆ lasso() [1/2]

template<bool DoStandardise>
LassoRegressionResult ml::LinearRegression::lasso ( Eigen::Ref< const Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  y,
double  lambda 
)

Carries out multivariate Lasso regression with intercept.

Given X and y, finds \( \vec{\beta'} \) and \( \beta_0 \) minimising \( \lVert \vec{y} - X^T \vec{\beta'} - \beta_0 \rVert^2 + \lambda \lVert \vec{\beta'} \rVert_1 \), where \( \lVert \cdot \rVert_1 \) is the L1 norm and \( \vec{\beta'} \) and \( \beta_0 \) are concatenated as LassoRegressionResult::beta in the returned LassoRegressionResult object.

The matrix X is either assumed to be standardised (DoStandardise == false) or is standardised internally (DoStandardise == true; requires a matrix copy).

Uses the iterated ridge regression method of Fan and Li (2001).

Parameters
[in] X  D x N matrix of X values, with data points in columns. Should NOT contain a row with all 1's.
[in] y  Y vector with length N.
[in] lambda  Regularisation strength.
Template Parameters
DoStandardise  Whether to standardise X internally.
Returns
LassoRegressionResult object with beta.size() == X.rows() + 1. If DoStandardise == true, beta will be rescaled and shifted to original X units and origins, and cov will be transformed accordingly.
Exceptions
std::invalid_argument  If y.size() != X.cols() or X.cols() < X.rows().
std::domain_error  If lambda < 0.
See also
standardise()
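
The idea of solving an L1-penalised problem by repeated ridge fits can be illustrated on a single feature. In the sketch below the penalty \( \lambda \lvert b \rvert \) is replaced at each step by the quadratic \( \lambda b^2 / (2 \lvert b_k \rvert) \) built around the current estimate \( b_k \), so each step is a ridge update with weight \( \lambda / (2 \lvert b_k \rvert) \). This is a simplified one-dimensional illustration under the assumption that x has zero mean; the function name, the iteration count and the zero threshold are hypothetical, and the code is not MLpp's implementation of lasso().

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One-feature Lasso slope via iterated ridge updates.
// Minimises sum_i (y_i - b * x_i - b0)^2 + lambda * |b|, assuming x has zero mean
// (so b0 is simply the mean of y and drops out of the slope update).
double lasso_slope_iterated_ridge(const std::vector<double>& x,
                                  const std::vector<double>& y,
                                  double lambda,
                                  int iterations = 100) {
    assert(x.size() == y.size() && !x.empty());
    double mean_y = 0.0;
    for (double v : y) {
        mean_y += v;
    }
    mean_y /= static_cast<double>(y.size());

    double sxx = 0.0;
    double sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sxx += x[i] * x[i];
        sxy += x[i] * (y[i] - mean_y);
    }
    assert(sxx > 0.0);  // sketch-only guard: x must not be identically zero

    double b = sxy / sxx;  // start from the unpenalised OLS slope
    for (int k = 0; k < iterations; ++k) {
        if (std::abs(b) < 1e-12) {
            return 0.0;  // coefficient has shrunk to zero; stop to avoid dividing by 0
        }
        // Ridge step with a weight derived from the current estimate.
        const double ridge_weight = lambda / (2.0 * std::abs(b));
        b = sxy / (sxx + ridge_weight);
    }
    return b;
}
```

In this one-dimensional setting the iteration converges to the soft-thresholded OLS slope: the coefficient is shrunk towards zero by lambda / (2 * sxx), and it becomes exactly zero once lambda / 2 exceeds the absolute value of sxy.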

◆ lasso< true >()

template<>
LassoRegressionResult ml::LinearRegression::lasso< true > ( Eigen::Ref< const Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  y,
double  lambda 
)

Carries out multivariate Lasso regression with intercept, standardising X inputs internally.

See also
lasso().

◆ lasso< false >()

template<>
LassoRegressionResult ml::LinearRegression::lasso< false > ( Eigen::Ref< const Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  y,
double  lambda 
)

Carries out multivariate Lasso regression with intercept, assuming standardised X inputs.

See also
lasso().

◆ lasso() [2/2]

LassoRegressionResult ml::LinearRegression::lasso ( Eigen::Ref< const Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  y,
double  lambda,
bool  do_standardise 
)
inline

Carries out multivariate Lasso regression with intercept, allowing the user to switch internal standardisation of X data on or off.

See also
lasso().

◆ press()

template<class Regression >
double ml::LinearRegression::press ( Eigen::Ref< const Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  y,
Regression  regression 
)

Calculates the PRESS statistic (Predicted Residual Error Sum of Squares).

See https://en.wikipedia.org/wiki/PRESS_statistic for details.

Warning
When calculating PRESS for regularised OLS, regression must standardise the data internally (call ridge() with DoStandardise == true).
Template Parameters
Regression  Functor type implementing a particular regression.
Parameters
[in] X  D x N matrix of X values, with data points in columns.
[in] y  Y vector with length N.
[in] regression  Regression functor. regression(X, y) should return a result object supporting a predict(X) call (e.g. MultivariateOLSResult). Must standardise the data internally if necessary.
Returns
Value of the PRESS statistic.
Exceptions
std::invalid_argument  If X.cols() != y.size().

◆ press_univariate()

template<bool WithIntercept>
double ml::LinearRegression::press_univariate ( Eigen::Ref< const Eigen::VectorXd >  x,
Eigen::Ref< const Eigen::VectorXd >  y 
)

Calculates the PRESS statistic (Predicted Residual Error Sum of Squares) for univariate regression.

See https://en.wikipedia.org/wiki/PRESS_statistic for details.

Parameters
[in] x  X vector with length N.
[in] y  Y vector with same length as x.
Template Parameters
WithIntercept  Whether the regression is with intercept or not.
Returns
Value of the PRESS statistic.
Exceptions
std::invalid_argument  If x.size() != y.size().
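
By definition, PRESS is the sum of squared prediction errors obtained by leaving out each data point in turn, refitting on the remaining points, and predicting the omitted one. The standalone sketch below computes it by brute force for univariate regression with intercept. The function name is hypothetical, the x values are assumed not to be all identical in any leave-one-out subset, and MLpp may well compute the same value by a faster route.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Brute-force leave-one-out PRESS for the model y = slope * x + intercept.
double press_leave_one_out(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size()) {
        throw std::invalid_argument("x and y must have the same size");
    }
    const std::size_t n = x.size();
    assert(n >= 3);  // each refit uses n - 1 points and needs at least 2
    const double m = static_cast<double>(n - 1);

    double press = 0.0;
    for (std::size_t left_out = 0; left_out < n; ++left_out) {
        // Means over all points except the omitted one.
        double mean_x = 0.0;
        double mean_y = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            if (j == left_out) continue;
            mean_x += x[j];
            mean_y += y[j];
        }
        mean_x /= m;
        mean_y /= m;

        double sxx = 0.0;
        double sxy = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            if (j == left_out) continue;
            sxx += (x[j] - mean_x) * (x[j] - mean_x);
            sxy += (x[j] - mean_x) * (y[j] - mean_y);
        }
        const double slope = sxy / sxx;
        const double intercept = mean_y - slope * mean_x;

        // Squared error when predicting the omitted point.
        const double residual = y[left_out] - (slope * x[left_out] + intercept);
        press += residual * residual;
    }
    return press;
}
```

For data lying exactly on a straight line every leave-one-out prediction is exact, so PRESS is zero; any scatter around the line makes it positive.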

◆ add_ones()

Eigen::MatrixXd ml::LinearRegression::add_ones ( Eigen::Ref< const Eigen::MatrixXd >  X)

Adds another row with 1s in every column to X.

Parameters
[in] X  Matrix of independent variables with data points in columns.
Returns
New matrix with a row filled with 1's added at the end.
Exceptions
std::invalid_argument  If X.cols() == 0.

◆ standardise() [1/2]

void ml::LinearRegression::standardise ( Eigen::Ref< Eigen::MatrixXd >  X)

Standardises independent variables.

For each row of X, standardise() subtracts the row's mean and then divides the row by its standard deviation.

Parameters
[in,out] X  Matrix of independent variables with data points in columns.
Exceptions
std::invalid_argument  If any row of X has all values the same, or X is empty.

◆ standardise() [2/2]

void ml::LinearRegression::standardise ( Eigen::Ref< Eigen::MatrixXd >  X,
Eigen::VectorXd &  means,
Eigen::VectorXd &  standard_deviations 
)

Standardises independent variables.

For each row of X, standardise() subtracts the row's mean and then divides the row by its standard deviation.

This version of standardise() saves the original mean and standard deviation of every row in the provided vectors.

Parameters
[in,out] X  D x N matrix of independent variables with data points in columns.
[out] means  At exit has length D and contains means of rows of X.
[out] standard_deviations  At exit has length D and contains standard deviations of rows of X. If means and standard_deviations refer to the same vector, at exit this vector will contain the standard deviations.
Exceptions
std::invalid_argument  If any row of X has all values the same, or X is empty.

◆ unstandardise()

void ml::LinearRegression::unstandardise ( Eigen::Ref< Eigen::MatrixXd >  X,
Eigen::Ref< const Eigen::VectorXd >  means,
Eigen::Ref< const Eigen::VectorXd >  standard_deviations 
)

Reverses the outcome of standardise().

For each row of X, unstandardise() multiplies the row by its standard deviation and then adds its mean.

Parameters
[in,out] X  D x N matrix of standardised independent variables with data points in columns.
[in] means  Means of rows of X.
[in] standard_deviations  Standard deviations of rows of X.
Exceptions
std::invalid_argument  If X.rows() != means.size() or X.rows() != standard_deviations.size().
std::domain_error  If any element of standard_deviations is not positive.
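
The round trip between standardising and unstandardising can be sketched in standalone C++ as follows. The sketch stores the D x N matrix as a vector of rows, so each inner vector holds one variable across all data points. It assumes the population standard deviation (dividing by N); the page does not say whether MLpp divides by N or N - 1. Matrix, standardise_rows and unstandardise_rows are hypothetical names, not the library's functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Matrix[d][n]: row d is variable d, column n is data point n.
using Matrix = std::vector<std::vector<double>>;

// Subtracts each row's mean and divides by its standard deviation, saving both.
void standardise_rows(Matrix& X, std::vector<double>& means, std::vector<double>& sds) {
    if (X.empty() || X[0].empty()) {
        throw std::invalid_argument("X is empty");
    }
    means.assign(X.size(), 0.0);
    sds.assign(X.size(), 0.0);
    for (std::size_t d = 0; d < X.size(); ++d) {
        std::vector<double>& row = X[d];
        assert(row.size() == X[0].size());  // sketch-only: all rows have equal length
        const double n = static_cast<double>(row.size());
        for (double v : row) means[d] += v;
        means[d] /= n;
        double ss = 0.0;
        for (double v : row) ss += (v - means[d]) * (v - means[d]);
        sds[d] = std::sqrt(ss / n);  // assumption: population standard deviation
        if (sds[d] == 0.0) {
            throw std::invalid_argument("a row of X has all values the same");
        }
        for (double& v : row) v = (v - means[d]) / sds[d];
    }
}

// Multiplies each row by its standard deviation and adds its mean back.
void unstandardise_rows(Matrix& X, const std::vector<double>& means,
                        const std::vector<double>& sds) {
    if (X.size() != means.size() || X.size() != sds.size()) {
        throw std::invalid_argument("size mismatch");
    }
    for (std::size_t d = 0; d < X.size(); ++d) {
        if (sds[d] <= 0.0) {
            throw std::domain_error("standard deviations must be positive");
        }
        for (double& v : X[d]) v = v * sds[d] + means[d];
    }
}
```

Calling standardise_rows and then unstandardise_rows with the saved means and standard deviations restores the original matrix up to floating-point rounding.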