Linear regression algorithms. More...
Classes | |
struct | LassoRegressionResult |
Result of a multivariate Lasso regression with intercept. More... | |
struct | MultivariateOLSResult |
Result of multivariate Ordinary Least Squares regression. More... | |
class | RecursiveMultivariateOLS |
Recursive multivariate Ordinary Least Squares. More... | |
struct | RegularisedRegressionResult |
Result of a multivariate regularised regression with intercept. More... | |
struct | Result |
Result of linear regression. More... | |
struct | RidgeRegressionResult |
Result of a multivariate ridge regression with intercept. More... | |
struct | UnivariateOLSResult |
Result of 1D Ordinary Least Squares regression (with or without intercept). More... | |
Functions | |
UnivariateOLSResult | univariate (Eigen::Ref< const Eigen::VectorXd > x, Eigen::Ref< const Eigen::VectorXd > y) |
Carries out univariate (aka simple) linear regression with intercept. More... | |
UnivariateOLSResult | univariate (double x0, double dx, Eigen::Ref< const Eigen::VectorXd > y) |
Carries out univariate (aka simple) linear regression with intercept on regularly spaced points. More... | |
UnivariateOLSResult | univariate_without_intercept (Eigen::Ref< const Eigen::VectorXd > x, Eigen::Ref< const Eigen::VectorXd > y) |
Carries out univariate (aka simple) linear regression without intercept. More... | |
MultivariateOLSResult | multivariate (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y) |
Carries out multivariate linear regression. More... | |
template<bool DoStandardise> | |
RidgeRegressionResult | ridge (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda) |
Carries out multivariate ridge regression with intercept. More... | |
template<> | |
RidgeRegressionResult | ridge< true > (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda) |
Carries out multivariate ridge regression with intercept, standardising X inputs internally. More... | |
template<> | |
RidgeRegressionResult | ridge< false > (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda) |
Carries out multivariate ridge regression with intercept, assuming standardised X inputs. More... | |
RidgeRegressionResult | ridge (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda, bool do_standardise) |
Carries out multivariate ridge regression with intercept, allowing the user switch internal standardisation of X data on or off. More... | |
template<bool DoStandardise> | |
LassoRegressionResult | lasso (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda) |
Carries out multivariate Lasso regression with intercept. More... | |
template<> | |
LassoRegressionResult | lasso< true > (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda) |
Carries out multivariate Lasso regression with intercept, standardising X inputs internally. More... | |
template<> | |
LassoRegressionResult | lasso< false > (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda) |
Carries out multivariate Lasso regression with intercept, assuming standardised X inputs. More... | |
LassoRegressionResult | lasso (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, double lambda, bool do_standardise) |
Carries out multivariate Lasso regression with intercept, allowing the user switch internal standardisation of X data on or off. More... | |
template<class Regression > | |
double | press (Eigen::Ref< const Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > y, Regression regression) |
Calculates the PRESS statistic (Predicted Residual Error Sum of Squares). More... | |
template<bool WithIntercept> | |
double | press_univariate (Eigen::Ref< const Eigen::VectorXd > x, Eigen::Ref< const Eigen::VectorXd > y) |
Calculates the PRESS statistic (Predicted Residual Error Sum of Squares) for univariate regression. More... | |
Eigen::MatrixXd | add_ones (Eigen::Ref< const Eigen::MatrixXd > X) |
Adds another row with 1s in every column to X. More... | |
void | standardise (Eigen::Ref< Eigen::MatrixXd > X) |
Standardises independent variables. More... | |
void | standardise (Eigen::Ref< Eigen::MatrixXd > X, Eigen::VectorXd &means, Eigen::VectorXd &standard_deviations) |
Standardises independent variables. More... | |
void | unstandardise (Eigen::Ref< Eigen::MatrixXd > X, Eigen::Ref< const Eigen::VectorXd > means, Eigen::Ref< const Eigen::VectorXd > standard_deviations) |
Reverses the outcome of standardise(). More... | |
Linear regression algorithms.
For multivariate regression we depart from the textbook convention and assume that independent variables X are laid out columnwise, i.e., data points are in columns.
UnivariateOLSResult ml::LinearRegression::univariate | ( | Eigen::Ref< const Eigen::VectorXd > | x, |
Eigen::Ref< const Eigen::VectorXd > | y | ||
) |
Carries out univariate (aka simple) linear regression with intercept.
[in] | x | X vector. |
[in] | y | Y vector. |
std::invalid_argument | If x and y have different sizes, or if their size is less than 2. |
UnivariateOLSResult ml::LinearRegression::univariate | ( | double | x0, |
double | dx, | ||
Eigen::Ref< const Eigen::VectorXd > | y | ||
) |
Carries out univariate (aka simple) linear regression with intercept on regularly spaced points.
[in] | x0 | First X value. |
[in] | dx | Positive X increment. |
[in] | y | Y vector. |
std::invalid_argument | If y.size() < 2 . |
std::domain_error | If dx <= 0 . |
UnivariateOLSResult ml::LinearRegression::univariate_without_intercept | ( | Eigen::Ref< const Eigen::VectorXd > | x, |
Eigen::Ref< const Eigen::VectorXd > | y | ||
) |
Carries out univariate (aka simple) linear regression without intercept.
[in] | x | X vector. |
[in] | y | Y vector. |
intercept
, var_intercept
and cov_slope_intercept
set to 0. std::invalid_argument | If x and y have different sizes, or if their size is less than 1. |
MultivariateOLSResult ml::LinearRegression::multivariate | ( | Eigen::Ref< const Eigen::MatrixXd > | X, |
Eigen::Ref< const Eigen::VectorXd > | y | ||
) |
Carries out multivariate linear regression.
Given X and y, finds \( \vec{\beta} \) minimising \( \lVert \vec{y} - X^T \vec{\beta} \rVert^2 \).
If fitting with intercept is desired, include a row of 1's in the X values.
[in] | X | D x N matrix of X values, with data points in columns. |
[in] | y | Y vector with length N. |
std::invalid_argument | If y.size() != X.cols() or X.cols() < X.rows() . |
RidgeRegressionResult ml::LinearRegression::ridge | ( | Eigen::Ref< const Eigen::MatrixXd > | X, |
Eigen::Ref< const Eigen::VectorXd > | y, | ||
double | lambda | ||
) |
Carries out multivariate ridge regression with intercept.
Given X and y, finds \( \vec{\beta'} \) and \( \beta_0 \) minimising \( \lVert \vec{y} - X^T \vec{\beta'} - \beta_0 \rVert^2 + \lambda \lVert \vec{\beta'} \rVert^2 \), where \( \vec{\beta'} \) and \( \beta_0 \) are concatenated as RidgeRegressionResult::beta in the returned RidgeRegressionResult object.
The matrix X
is either assumed to be standardised (DoStandardise == false
) or is standardised internally (DoStandardise == true
; requires a matrix copy).
[in] | X | D x N matrix of X values, with data points in columns. Should NOT contain a row with all 1's. |
[in] | y | Y vector with length N. |
[in] | lambda | Regularisation strength. |
DoStandardise | Whether to standardise X internally. |
beta.size() == X.rows() + 1
. If DoStandardise == true
, beta
will be rescaled and shifted to original X
units and origins, and cov
will be transformed accordingly. std::invalid_argument | If y.size() != X.cols() or X.cols() < X.rows() . |
std::domain_error | If lambda < 0 . |
RidgeRegressionResult ml::LinearRegression::ridge< true > | ( | Eigen::Ref< const Eigen::MatrixXd > | X, |
Eigen::Ref< const Eigen::VectorXd > | y, | ||
double | lambda | ||
) |
Carries out multivariate ridge regression with intercept, standardising X
inputs internally.
RidgeRegressionResult ml::LinearRegression::ridge< false > | ( | Eigen::Ref< const Eigen::MatrixXd > | X, |
Eigen::Ref< const Eigen::VectorXd > | y, | ||
double | lambda | ||
) |
Carries out multivariate ridge regression with intercept, assuming standardised X
inputs.
|
inline |
Carries out multivariate ridge regression with intercept, allowing the user switch internal standardisation of X
data on or off.
LassoRegressionResult ml::LinearRegression::lasso | ( | Eigen::Ref< const Eigen::MatrixXd > | X, |
Eigen::Ref< const Eigen::VectorXd > | y, | ||
double | lambda | ||
) |
Carries out multivariate Lasso regression with intercept.
Given X and y, finds \( \vec{\beta'} \) and \( \beta_0 \) minimising \( \lVert \vec{y} - X^T \vec{\beta'} - \beta_0 \rVert^2 + \lambda \lVert \vec{\beta'} \rVert^1 \), where \( \vec{\beta'} \) and \( \beta_0 \) are concatenated as LassoRegressionResult::beta in the returned LassoRegressionResult object.
The matrix X
is either assumed to be standardised (DoStandardise == false
) or is standardised internally (DoStandardise == true
; requires a matrix copy).
Uses the iterated ridge regression method of Fan and Li (2001).
[in] | X | D x N matrix of X values, with data points in columns. Should NOT contain a row with all 1's. |
[in] | y | Y vector with length N. |
[in] | lambda | Regularisation strength. |
DoStandardise | Whether to standardise X internally. |
beta.size() == X.rows() + 1
. If DoStandardise == true
, beta
will be rescaled and shifted to original X
units and origins, and cov
will be transformed accordingly. std::invalid_argument | If y.size() != X.cols() or X.cols() < X.rows() . |
std::domain_error | If lambda < 0 . |
LassoRegressionResult ml::LinearRegression::lasso< true > | ( | Eigen::Ref< const Eigen::MatrixXd > | X, |
Eigen::Ref< const Eigen::VectorXd > | y, | ||
double | lambda | ||
) |
Carries out multivariate Lasso regression with intercept, standardising X
inputs internally.
LassoRegressionResult ml::LinearRegression::lasso< false > | ( | Eigen::Ref< const Eigen::MatrixXd > | X, |
Eigen::Ref< const Eigen::VectorXd > | y, | ||
double | lambda | ||
) |
Carries out multivariate Lasso regression with intercept, assuming standardised X
inputs.
|
inline |
Carries out multivariate Lasso regression with intercept, allowing the user switch internal standardisation of X
data on or off.
double ml::LinearRegression::press | ( | Eigen::Ref< const Eigen::MatrixXd > | X, |
Eigen::Ref< const Eigen::VectorXd > | y, | ||
Regression | regression | ||
) |
Calculates the PRESS statistic (Predicted Residual Error Sum of Squares).
See https://en.wikipedia.org/wiki/PRESS_statistic for details.
regression
must standardise the data internally (call ridge() with DoStandardise == true
).Regression | Functor type implementing particular regression. |
[in] | X | D x N matrix of X values, with data points in columns. |
[in] | y | Y vector with length N. |
[in] | regression | Regression functor. regression(X, y) should return a result object supporting a predict(X) call (e.g. MultivariateOLSResult). Must standardise the data internally if necessary. |
std::invalid_argument | If X.cols() != y.size() . |
double ml::LinearRegression::press_univariate | ( | Eigen::Ref< const Eigen::VectorXd > | x, |
Eigen::Ref< const Eigen::VectorXd > | y | ||
) |
Calculates the PRESS statistic (Predicted Residual Error Sum of Squares) for univariate regression.
See https://en.wikipedia.org/wiki/PRESS_statistic for details.
[in] | x | X vector with length N. |
[in] | y | Y vector with same length as x . |
WithIntercept | Whether the regression is with intercept or not. |
std::invalid_argument | If x.size() != y.size() . |
Eigen::MatrixXd ml::LinearRegression::add_ones | ( | Eigen::Ref< const Eigen::MatrixXd > | X | ) |
Adds another row with 1s in every column to X.
[in] | X | Matrix of independent variables with data points in columns. |
std::invalid_argument | If X.cols() == 0 . |
void ml::LinearRegression::standardise | ( | Eigen::Ref< Eigen::MatrixXd > | X | ) |
Standardises independent variables.
From each row, standardise
subtracts its mean and divides it by its standard deviation.
[in,out] | X | Matrix of independent variables with data points in columns. |
std::invalid_argument | If any row of X has all values the same, or X is empty. |
void ml::LinearRegression::standardise | ( | Eigen::Ref< Eigen::MatrixXd > | X, |
Eigen::VectorXd & | means, | ||
Eigen::VectorXd & | standard_deviations | ||
) |
Standardises independent variables.
From each row, standardise
subtracts its mean and divides it by its standard deviation.
This version of standardise
saves original mean and standard deviation for every row in provided vectors.
[in,out] | X | D x N matrix of independent variables with data points in columns. |
[out] | means | At exit has length D and contains means of rows of X . |
[out] | standard_deviations | At exit has length D and contains standard deviations of rows of X . If means and standard_deviations refer to the same vector, at exit this vector will contain the standard deviations. |
std::invalid_argument | If any row of X has all values the same, or X is empty. |
void ml::LinearRegression::unstandardise | ( | Eigen::Ref< Eigen::MatrixXd > | X, |
Eigen::Ref< const Eigen::VectorXd > | means, | ||
Eigen::Ref< const Eigen::VectorXd > | standard_deviations | ||
) |
Reverses the outcome of standardise().
From each row, standardise
multiplies it by its standard deviation and adds its mean.
[in,out] | X | D x N matrix of standardised independent variables with data points in columns. |
[in] | means | Means of rows of X . |
[in] | standard_deviations | Standard deviations of rows of X . |
std::invalid_argument | If X.rows() != means.size() or X.rows() != standard_deviations.size() . |
std::domain_error | If any element of standard_deviations is not positive. |