Statistical functions.
Functions | |
| template<class Iter > | |
| std::pair< double, double > | sse_and_mean (const Iter begin, const Iter end) |
| Calculates the average and sum of squared error for a sample. | |
| template<class Iter > | |
| double | sse (const Iter begin, const Iter end) |
| Calculates the sum of squared error for a sample. | |
| template<class Iter > | |
| std::pair< double, unsigned int > | gini_index_and_mode (const Iter begin, const Iter end, const unsigned int K) |
| Calculates the Gini index and mode of the sample. | |
| template<class Iter > | |
| double | gini_index (const Iter begin, const Iter end, const unsigned int K) |
| Calculates the Gini index of the sample. | |
| template<class Iter > | |
| unsigned int | mode (const Iter begin, const Iter end, const unsigned int K) |
| Calculates the mode (most frequent value) of a sample. | |
| template<class R > | |
| R | covariance (const std::vector< R > &xs, const std::vector< R > &ys) |
| Calculates sample covariance of two vectors. | |
| double | covariance (Eigen::Ref< const Eigen::VectorXd > xs, Eigen::Ref< const Eigen::VectorXd > ys) |
| Calculates sample covariance of two vectors. | |
Statistical functions.
| std::pair<double, double> ml::Statistics::sse_and_mean | ( | const Iter | begin, |
| const Iter | end | ||
| ) |
Calculates the average and sum of squared error for a sample.
Given a range [begin, end) with N values, calculates
\( \mathrm{SSE} = \sum_{i=1}^{N} (x_i - \bar{x})^2 \)
and
\( \bar{x} = N^{-1} \sum_{i=1}^{N} x_i \).
| [in] | begin | Iterator pointing to the beginning of the range of sample values. |
| [in] | end | Iterator pointing one past the end of the range of sample values. |
| Iter | Iterator type. |
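The two formulas above can be illustrated with a minimal two-pass sketch. This is not the library's implementation: the name `sse_and_mean_sketch`, the non-empty precondition, and the return order (SSE first, mean second, following the function name) are assumptions made for illustration.

```cpp
#include <cassert>
#include <iterator>
#include <utility>

// Hypothetical stand-in for ml::Statistics::sse_and_mean (illustration only).
// Assumed return order: {SSE, mean}.
template <class Iter>
std::pair<double, double> sse_and_mean_sketch(const Iter begin, const Iter end) {
    const auto n = std::distance(begin, end);
    assert(n > 0);  // assumed precondition: the mean needs N >= 1

    // First pass: mean = (1/N) * sum of x_i.
    double sum = 0.0;
    for (Iter it = begin; it != end; ++it) {
        sum += static_cast<double>(*it);
    }
    const double mean = sum / static_cast<double>(n);

    // Second pass: SSE = sum of (x_i - mean)^2.
    double sse = 0.0;
    for (Iter it = begin; it != end; ++it) {
        const double d = static_cast<double>(*it) - mean;
        sse += d * d;
    }
    return {sse, mean};
}
```

For the sample 1, 2, 3, 4 this sketch yields a mean of 2.5 and an SSE of 5.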
| double ml::Statistics::sse | ( | const Iter | begin, |
| const Iter | end | ||
| ) |
Calculates the sum of squared error for a sample.
Given a range [begin, end) with N values, calculates
\( \mathrm{SSE} = \sum_{i=1}^{N} (x_i - \bar{x})^2 \),
where
\( \bar{x} = N^{-1} \sum_{i=1}^{N} x_i \).
| [in] | begin | Iterator pointing to the beginning of the range of sample values. |
| [in] | end | Iterator pointing one past the end of the range of sample values. |
| Iter | Iterator type. |
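As a quick check of the definition, consider the three-value sample 2, 4, 6 (N = 3), chosen here only for illustration:
\( \bar{x} = (2 + 4 + 6) / 3 = 4 \)
and
\( \mathrm{SSE} = (2 - 4)^2 + (4 - 4)^2 + (6 - 4)^2 = 8 \).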
| std::pair<double, unsigned int> ml::Statistics::gini_index_and_mode | ( | const Iter | begin, |
| const Iter | end, | ||
| const unsigned int | K | ||
| ) |
Calculates the Gini index and mode of the sample.
The Gini index is defined as
\( \sum_{k=0}^{K-1} \hat{p}_k (1 - \hat{p}_k) \)
where \(\hat{p}_k\) is the relative frequency of class k in the data.
Takes as argument a range [begin, end) of class values from 0 to K - 1.
| [in] | begin | Iterator pointing to the beginning of the range of sample values. |
| [in] | end | Iterator pointing one past the end of the range of sample values. |
| [in] | K | Number of classes, positive. |
| Iter | Iterator type. |
If begin == end, mode == K.
| double ml::Statistics::gini_index | ( | const Iter | begin, |
| const Iter | end, | ||
| const unsigned int | K | ||
| ) |
Calculates the Gini index of the sample.
The Gini index is defined as
\( \sum_{k=0}^{K-1} \hat{p}_k (1 - \hat{p}_k) \)
where \(\hat{p}_k\) is the relative frequency of class k in the data.
Takes as argument a range [begin, end) of class values from 0 to K - 1.
| [in] | begin | Iterator pointing to the beginning of the range of sample values. |
| [in] | end | Iterator pointing one past the end of the range of sample values. |
| [in] | K | Number of classes, positive. |
| Iter | Iterator type. |
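The Gini index definition can be sketched as a class-count pass followed by a sum over the K classes. The name `gini_index_sketch` is hypothetical, and returning 0 for an empty range is an assumption of this sketch; the library's behavior in that case is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ml::Statistics::gini_index (illustration only).
template <class Iter>
double gini_index_sketch(const Iter begin, const Iter end, const unsigned int K) {
    assert(K > 0);  // documented requirement: K is positive

    // Count how often each class value 0..K-1 occurs.
    std::vector<std::size_t> counts(K, 0);
    std::size_t n = 0;
    for (Iter it = begin; it != end; ++it) {
        const unsigned int k = static_cast<unsigned int>(*it);
        assert(k < K);  // class values must lie in [0, K - 1]
        ++counts[k];
        ++n;
    }
    if (n == 0) {
        return 0.0;  // assumption of this sketch: an empty sample counts as pure
    }

    // Sum p_k * (1 - p_k) over all classes, with p_k = count_k / N.
    double gini = 0.0;
    for (const std::size_t c : counts) {
        const double p = static_cast<double>(c) / static_cast<double>(n);
        gini += p * (1.0 - p);
    }
    return gini;
}
```

For the class values 0, 0, 1, 1 with K = 2, both frequencies are 0.5 and the index is 0.5; a sample containing a single class gives 0.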
| unsigned int ml::Statistics::mode | ( | const Iter | begin, |
| const Iter | end, | ||
| const unsigned int | K | ||
| ) |
Calculates the mode (most frequent value) of a sample.
The sample is assumed to contain values in the [0, K - 1] range.
| [in] | begin | Iterator pointing to the beginning of the range of sample values. |
| [in] | end | Iterator pointing one past the end of the range of sample values. |
| [in] | K | Positive number of distinct values. |
| Iter | Iterator type. |
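Because the values are confined to [0, K - 1], the mode can be found with a count array of size K. The sketch below uses the hypothetical name `mode_sketch`. Its tie-breaking (smallest value wins) and its result of 0 for an empty range are assumptions, not documented behavior of `mode`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for ml::Statistics::mode (illustration only).
template <class Iter>
unsigned int mode_sketch(const Iter begin, const Iter end, const unsigned int K) {
    assert(K > 0);  // documented requirement: K is positive

    // Count occurrences of each value 0..K-1.
    std::vector<std::size_t> counts(K, 0);
    for (Iter it = begin; it != end; ++it) {
        const unsigned int k = static_cast<unsigned int>(*it);
        assert(k < K);  // values must lie in [0, K - 1]
        ++counts[k];
    }

    // std::max_element returns the first maximum, so ties resolve to the
    // smallest value (an assumption of this sketch).
    const auto best = std::max_element(counts.begin(), counts.end());
    return static_cast<unsigned int>(std::distance(counts.begin(), best));
}
```

For the sample 1, 2, 2, 0, 2 with K = 3 the sketch returns 2.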
| R ml::Statistics::covariance | ( | const std::vector< R > & | xs, |
| const std::vector< R > & | ys | ||
| ) |
Calculates sample covariance of two vectors.
| [in] | xs | X values. |
| [in] | ys | Y values. |
| R | Scalar value type. |
xs.size() < 2.
| std::invalid_argument | If xs.size() != ys.size(). |
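A minimal sketch of a sample covariance over two vectors follows, assuming the usual unbiased estimator with an N - 1 denominator. The name `covariance_sketch` is hypothetical. The size-mismatch exception mirrors the documented behavior, while the handling of fewer than two values (a plain assertion here) is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for ml::Statistics::covariance (illustration only).
template <class R>
R covariance_sketch(const std::vector<R>& xs, const std::vector<R>& ys) {
    if (xs.size() != ys.size()) {
        throw std::invalid_argument("covariance_sketch: size mismatch");
    }
    const std::size_t n = xs.size();
    assert(n >= 2);  // assumption: this sketch requires at least two values

    // Means of both samples.
    R mean_x = 0;
    R mean_y = 0;
    for (std::size_t i = 0; i < n; ++i) {
        mean_x += xs[i];
        mean_y += ys[i];
    }
    mean_x /= static_cast<R>(n);
    mean_y /= static_cast<R>(n);

    // Sum of cross-deviations, divided by N - 1 (assumed sample estimator).
    R acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += (xs[i] - mean_x) * (ys[i] - mean_y);
    }
    return acc / static_cast<R>(n - 1);
}
```

With xs = 1, 2, 3 and ys = 2, 4, 6 the means are 2 and 4, the cross-deviations sum to 4, and the sketch returns 4 / 2 = 2.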
| double ml::Statistics::covariance | ( | Eigen::Ref< const Eigen::VectorXd > | xs, |
| Eigen::Ref< const Eigen::VectorXd > | ys | ||
| ) |
Calculates sample covariance of two vectors.
| [in] | xs | X values. |
| [in] | ys | Y values. |
xs.size() < 2.
| std::invalid_argument | If xs.size() != ys.size(). |