Statistical functions.
Functions | |
template<class Iter > | |
std::pair< double, double > | sse_and_mean (const Iter begin, const Iter end) |
Calculates the mean and the sum of squared errors (SSE) of a sample. | |
template<class Iter > | |
double | sse (const Iter begin, const Iter end) |
Calculates the sum of squared errors (SSE) of a sample. | |
template<class Iter > | |
std::pair< double, unsigned int > | gini_index_and_mode (const Iter begin, const Iter end, const unsigned int K) |
Calculates the Gini index and the mode of the sample. | |
template<class Iter > | |
double | gini_index (const Iter begin, const Iter end, const unsigned int K) |
Calculates the Gini index of the sample. | |
template<class Iter > | |
unsigned int | mode (const Iter begin, const Iter end, const unsigned int K) |
Calculates the mode (most frequent value) of a sample. | |
template<class R > | |
R | covariance (const std::vector< R > &xs, const std::vector< R > &ys) |
Calculates sample covariance of two vectors. | |
double | covariance (Eigen::Ref< const Eigen::VectorXd > xs, Eigen::Ref< const Eigen::VectorXd > ys) |
Calculates sample covariance of two vectors. | |
Statistical functions.
std::pair<double, double> ml::Statistics::sse_and_mean (const Iter begin, const Iter end)
Calculates the mean and the sum of squared errors (SSE) of a sample.
Given a range [begin, end) with N values, calculates
\( \mathrm{SSE} = \sum_{i=1}^{N} (x_i - \bar{x})^2 \)
and
\( \bar{x} = N^{-1} \sum_{i=1}^{N} x_i \).
[in] | begin | Iterator pointing to the beginning of the range of sample values. |
[in] | end | Iterator pointing one past the end of the range of sample values. |
Iter | Iterator type. |
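The two formulas above can be sketched as a simple two-pass computation. This is a minimal illustration, not the library's implementation: the name `sse_and_mean_sketch` is hypothetical, the (SSE, mean) ordering of the returned pair is assumed from the function name, and the sketch requires a non-empty range of a multi-pass (forward) iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for illustration only; not the library's code.
// Returns {SSE, mean} for the values in [begin, end).
template <class Iter>
std::pair<double, double> sse_and_mean_sketch(Iter begin, Iter end) {
    assert(begin != end);  // assumption: the sketch needs at least one value

    // First pass: accumulate the sum and count N to obtain the mean.
    double sum = 0.0;
    std::size_t n = 0;
    for (Iter it = begin; it != end; ++it) {
        sum += static_cast<double>(*it);
        ++n;
    }
    const double mean = sum / static_cast<double>(n);

    // Second pass: accumulate squared deviations from the mean.
    double sse = 0.0;
    for (Iter it = begin; it != end; ++it) {
        const double d = static_cast<double>(*it) - mean;
        sse += d * d;
    }
    return {sse, mean};
}
```

For the sample {1, 2, 3, 4} the mean is 2.5 and the squared deviations are 2.25, 0.25, 0.25 and 2.25, so the sketch returns an SSE of 5.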
double ml::Statistics::sse (const Iter begin, const Iter end)
Calculates the sum of squared errors (SSE) of a sample.
Given a range [begin, end) with N values, calculates
\( \mathrm{SSE} = \sum_{i=1}^{N} (x_i - \bar{x})^2 \),
where
\( \bar{x} = N^{-1} \sum_{i=1}^{N} x_i \).
[in] | begin | Iterator pointing to the beginning of the range of sample values. |
[in] | end | Iterator pointing one past the end of the range of sample values. |
Iter | Iterator type. |
std::pair<double, unsigned int> ml::Statistics::gini_index_and_mode (const Iter begin, const Iter end, const unsigned int K)
Calculates the Gini index and the mode of the sample.
The Gini index is defined as
\( \sum_{k=0}^{K-1} \hat{p}_k (1 - \hat{p}_k) \),
where \(\hat{p}_k\) is the relative frequency of class k in the data.
Takes as argument a range [begin, end) of class values from 0 to K - 1.
[in] | begin | Iterator pointing to the beginning of the range of sample values. |
[in] | end | Iterator pointing one past the end of the range of sample values. |
[in] | K | Number of classes, positive. |
Iter | Iterator type. |
If begin == end, mode == K.
double ml::Statistics::gini_index (const Iter begin, const Iter end, const unsigned int K)
Calculates the Gini index of the sample.
The Gini index is defined as
\( \sum_{k=0}^{K-1} \hat{p}_k (1 - \hat{p}_k) \),
where \(\hat{p}_k\) is the relative frequency of class k in the data.
Takes as argument a range [begin, end) of class values from 0 to K - 1.
[in] | begin | Iterator pointing to the beginning of the range of sample values. |
[in] | end | Iterator pointing one past the end of the range of sample values. |
[in] | K | Number of classes, positive. |
Iter | Iterator type. |
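The Gini index definition can be illustrated by counting class occurrences and summing p(1 - p) over the K classes. The sketch below is hypothetical (the name `gini_index_and_mode_sketch` is not part of the library) and makes two assumptions that the documentation does not state: ties for the mode are resolved in favour of the smallest class value, and the Gini index of an empty range is reported as 0. Returning K as the mode for an empty range follows the note in the documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for illustration only; not the library's code.
// Returns {Gini index, mode} for class labels in [begin, end), each in [0, K - 1].
template <class Iter>
std::pair<double, unsigned int> gini_index_and_mode_sketch(Iter begin, Iter end,
                                                           unsigned int K) {
    assert(K > 0);  // K is documented as positive

    // Count how often each class value occurs.
    std::vector<std::size_t> counts(K, 0);
    std::size_t n = 0;
    for (Iter it = begin; it != end; ++it) {
        const unsigned int k = static_cast<unsigned int>(*it);
        assert(k < K);  // labels must lie in [0, K - 1]
        ++counts[k];
        ++n;
    }

    // Empty range: mode == K as documented; a Gini index of 0 is an assumption.
    if (n == 0) {
        return {0.0, K};
    }

    double gini = 0.0;
    unsigned int mode = 0;
    for (unsigned int k = 0; k < K; ++k) {
        const double p = static_cast<double>(counts[k]) / static_cast<double>(n);
        gini += p * (1.0 - p);
        // Strict comparison keeps the smallest class value on ties (assumption).
        if (counts[k] > counts[mode]) {
            mode = k;
        }
    }
    return {gini, mode};
}
```

For the labels {0, 1, 1, 1} with K = 3, the class frequencies are 0.25, 0.75 and 0, so the sketch returns a Gini index of 0.375 and a mode of 1.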
unsigned int ml::Statistics::mode (const Iter begin, const Iter end, const unsigned int K)
Calculates the mode (most frequent value) of a sample.
The sample is assumed to contain values in the [0, K - 1] range.
[in] | begin | Iterator pointing to the beginning of the range of sample values. |
[in] | end | Iterator pointing one past the end of the range of sample values. |
[in] | K | Positive number of distinct values. |
Iter | Iterator type. |
R ml::Statistics::covariance (const std::vector< R > &xs, const std::vector< R > &ys)
Calculates sample covariance of two vectors.
[in] | xs | X values. |
[in] | ys | Y values. |
R | Scalar value type. |
xs.size() < 2.
std::invalid_argument | If xs.size() != ys.size(). |
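As an illustration of what a sample covariance computation with this error contract might look like, here is a minimal sketch. The name `covariance_sketch` is hypothetical. The sketch assumes the usual unbiased estimator with an N - 1 denominator, and it returns NaN when fewer than two values are given; the documentation above does not confirm either choice. Only the std::invalid_argument on mismatched sizes is taken from the documentation.

```cpp
#include <cassert>  // only needed for checks in calling code
#include <cstddef>
#include <limits>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for illustration only; not the library's code.
// Computes sum((x_i - mean_x) * (y_i - mean_y)) / (N - 1).
template <class R>
R covariance_sketch(const std::vector<R>& xs, const std::vector<R>& ys) {
    if (xs.size() != ys.size()) {
        throw std::invalid_argument("covariance_sketch: size mismatch");
    }
    const std::size_t n = xs.size();
    if (n < 2) {
        // Assumption: the behaviour for xs.size() < 2 is not known from the docs.
        return std::numeric_limits<R>::quiet_NaN();
    }

    const R mean_x = std::accumulate(xs.begin(), xs.end(), R(0)) / static_cast<R>(n);
    const R mean_y = std::accumulate(ys.begin(), ys.end(), R(0)) / static_cast<R>(n);

    // Accumulate the products of deviations from the two means.
    R acc = R(0);
    for (std::size_t i = 0; i < n; ++i) {
        acc += (xs[i] - mean_x) * (ys[i] - mean_y);
    }
    return acc / static_cast<R>(n - 1);
}
```

With xs = {1, 2, 3} and ys = {2, 4, 6}, the means are 2 and 4, the deviation products sum to 4, and the sketch returns 4 / 2 = 2.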
double ml::Statistics::covariance (Eigen::Ref< const Eigen::VectorXd > xs, Eigen::Ref< const Eigen::VectorXd > ys)
Calculates sample covariance of two vectors.
xs | X values. |
ys | Y values. |
xs.size() < 2.
std::invalid_argument | If xs.size() != ys.size(). |