Naive K-means clustering method.
#include <KMeans.hpp>


Public Member Functions | |
| KMeans (unsigned int number_clusters) | |
| Constructs a K-means model ready to fit. | |
| bool | fit (Eigen::Ref< const Eigen::MatrixXd > data) override |
| Fits the model. | |
| unsigned int | number_clusters () const override |
| Returns the number of clusters. | |
| const std::vector< unsigned int > & | labels () const override |
| Returns a const reference to the resulting cluster labels, one per data point. The values are meaningful only if fitting converged successfully. | |
| const Eigen::MatrixXd & | centroids () const override |
| Returns a const reference to the matrix of cluster centroids (in columns). | |
| void | set_seed (unsigned int seed) |
| Sets PRNG seed. | |
| void | set_absolute_tolerance (double absolute_tolerance) |
| Sets absolute tolerance for convergence test: || old centroids - new centroids ||^2 < absolute tolerance. | |
| void | set_maximum_steps (unsigned int maximum_steps) |
| Sets maximum number of K-means steps. | |
| void | set_number_initialisations (unsigned int number_initialisations) |
| Sets number of initialisations to try, to find the clusters with lowest inertia. | |
| void | set_centroids_initialiser (std::shared_ptr< const CentroidsInitialiser > centroids_initialiser) |
| Sets centroids initialiser. | |
| void | set_verbose (bool verbose) |
| Switches between verbose and quiet mode. | |
| std::pair< unsigned int, double > | assign_label (Eigen::Ref< const Eigen::VectorXd > x) const |
| Given a data point x, assigns it to its cluster and returns the cluster label and the squared Euclidean distance to the assigned centroid. | |
| double | inertia () const |
| Sum of squared distances to the nearest centroid. | |
| bool | converged () const override |
| Reports whether the model converged. | |
Public Member Functions inherited from ml::Clustering::Model | |
| virtual | ~Model () |
| Virtual destructor. | |
Naive K-means clustering method.
Fitting is considered converged if exactly the same cluster assignments are chosen twice in a row, or if the sum of squared differences between the new and old centroids is lower than the absolute tolerance.
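The two stopping criteria described above can be sketched as a standalone check. This is an illustration only, not the library's implementation: `has_converged` is a hypothetical helper, and centroids are assumed to be stored as flat `std::vector<double>` arrays instead of `Eigen::MatrixXd`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of ml::Clustering::KMeans.
// Both centroid arrays are flat and must have the same length.
bool has_converged(const std::vector<unsigned int>& old_labels,
                   const std::vector<unsigned int>& new_labels,
                   const std::vector<double>& old_centroids,
                   const std::vector<double>& new_centroids,
                   double absolute_tolerance)
{
    assert(old_centroids.size() == new_centroids.size());
    // Criterion 1: the assignment step produced exactly the same labels twice.
    if (old_labels == new_labels) {
        return true;
    }
    // Criterion 2: || old centroids - new centroids ||^2 < absolute tolerance.
    double sum_sq = 0.0;
    for (std::size_t i = 0; i < old_centroids.size(); ++i) {
        const double d = old_centroids[i] - new_centroids[i];
        sum_sq += d * d;
    }
    return sum_sq < absolute_tolerance;
}
```

Either criterion alone is enough to stop: identical labels end the loop even if the tolerance is set to zero.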
| ml::Clustering::KMeans::KMeans | ( | unsigned int | number_clusters | ) |
Constructs a K-means model ready to fit.
| [in] | number_clusters | Number of clusters. |
| std::invalid_argument | If number_clusters == 0. |
| bool ml::Clustering::KMeans::fit | ( | Eigen::Ref< const Eigen::MatrixXd > | data | ) | override virtual |
Fits the model.
| [in] | data | Matrix (column-major order) with a data point in every column. |
Returns true if fitting converged successfully.
| std::invalid_argument | If data has no rows, or if the sample size (number of columns in data) is too low. |
Implements ml::Clustering::Model.
| unsigned int ml::Clustering::KMeans::number_clusters | ( | ) | const | inline override virtual |
Returns the number of clusters.
The value is meaningful only if fitting converged successfully.
Implements ml::Clustering::Model.
| const Eigen::MatrixXd & ml::Clustering::KMeans::centroids | ( | ) | const | inline override virtual |
Returns a const reference to the matrix of cluster centroids (in columns).
A centroid represents the central location of a cluster, e.g. the mean of all points in the cluster. The values are meaningful only if fitting converged successfully.
Implements ml::Clustering::Model.
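To make the "mean of all points in the cluster" description concrete, here is a minimal sketch of a centroid update. `mean_centroids` is a hypothetical helper, not a library function. It assumes data and centroids are flat column-major arrays (one point or centroid per column) and that an empty cluster is simply left at zero; the library's handling of that case is not documented here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper. Point j occupies data[j*dim] .. data[j*dim + dim - 1];
// the returned centroids use the same column-major layout.
std::vector<double> mean_centroids(const std::vector<double>& data,
                                   std::size_t dim,
                                   const std::vector<unsigned int>& labels,
                                   unsigned int number_clusters)
{
    assert(dim > 0 && data.size() == dim * labels.size());
    std::vector<double> centroids(dim * number_clusters, 0.0);
    std::vector<std::size_t> counts(number_clusters, 0);
    // Accumulate coordinate sums and point counts per cluster.
    for (std::size_t j = 0; j < labels.size(); ++j) {
        const unsigned int c = labels[j];
        assert(c < number_clusters);
        ++counts[c];
        for (std::size_t i = 0; i < dim; ++i) {
            centroids[c * dim + i] += data[j * dim + i];
        }
    }
    // Divide each sum by the cluster size to obtain the mean.
    for (unsigned int c = 0; c < number_clusters; ++c) {
        if (counts[c] == 0) {
            continue;  // Assumption of this sketch: empty clusters stay at zero.
        }
        for (std::size_t i = 0; i < dim; ++i) {
            centroids[c * dim + i] /= static_cast<double>(counts[c]);
        }
    }
    return centroids;
}
```

With the 2-D points (0, 0), (2, 2), (10, 0) and labels 0, 0, 1, the sketch yields the centroids (1, 1) and (10, 0).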
| void ml::Clustering::KMeans::set_seed | ( | unsigned int | seed | ) |
Sets PRNG seed.
| [in] | seed | PRNG seed. |
| void ml::Clustering::KMeans::set_absolute_tolerance | ( | double | absolute_tolerance | ) |
Sets absolute tolerance for convergence test: || old centroids - new centroids ||^2 < absolute tolerance.
| [in] | absolute_tolerance | Absolute tolerance. |
| std::domain_error | If absolute_tolerance < 0. |
| void ml::Clustering::KMeans::set_maximum_steps | ( | unsigned int | maximum_steps | ) |
Sets maximum number of K-means steps.
| [in] | maximum_steps | Maximum number of steps. |
| std::invalid_argument | If maximum_steps < 2. |
| void ml::Clustering::KMeans::set_number_initialisations | ( | unsigned int | number_initialisations | ) |
Sets number of initialisations to try, to find the clusters with lowest inertia.
| [in] | number_initialisations | Number of initialisations. |
| std::invalid_argument | If number_initialisations < 1. |
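The idea of trying several initialisations and keeping the lowest-inertia result can be sketched as follows. `RunResult` and `best_of` are hypothetical names introduced for illustration; how the library seeds and runs each initialisation internally is not specified here.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical result of one complete K-means run.
struct RunResult {
    double inertia;
    std::vector<unsigned int> labels;
};

// Hypothetical driver: calls single_run(i) for i = 0 .. n-1
// and keeps the run with the lowest inertia.
RunResult best_of(unsigned int number_initialisations,
                  const std::function<RunResult(unsigned int)>& single_run)
{
    if (number_initialisations < 1) {
        throw std::invalid_argument("number_initialisations must be at least 1");
    }
    RunResult best{std::numeric_limits<double>::infinity(), {}};
    for (unsigned int i = 0; i < number_initialisations; ++i) {
        RunResult candidate = single_run(i);
        if (candidate.inertia < best.inertia) {
            best = std::move(candidate);
        }
    }
    return best;
}
```

Since K-means only finds a local optimum that depends on the starting centroids, more initialisations trade run time for a better chance of a low-inertia clustering.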
| void ml::Clustering::KMeans::set_centroids_initialiser | ( | std::shared_ptr< const CentroidsInitialiser > | centroids_initialiser | ) |
Sets centroids initialiser.
| [in] | centroids_initialiser | Pointer to CentroidsInitialiser implementation. |
| std::invalid_argument | If centroids_initialiser is null. |
| void ml::Clustering::KMeans::set_verbose | ( | bool | verbose | ) | inline |
Switches between verbose and quiet mode.
| [in] | verbose | true if we want verbose output. |
| std::pair<unsigned int, double> ml::Clustering::KMeans::assign_label | ( | Eigen::Ref< const Eigen::VectorXd > | x | ) | const |
Given a data point x, assigns it to its cluster and returns the cluster label and the squared Euclidean distance to the assigned centroid.
| [in] | x | Data point with correct dimension. |
| std::invalid_argument | If x.size() != centroids().rows(). |
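A nearest-centroid lookup of the kind described above might look like the sketch below. `nearest_centroid` is a hypothetical free function, not the member `assign_label`. It assumes centroids are stored as a flat column-major array and that ties go to the lowest label; the library's tie-breaking rule is not documented here.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical analogue of assign_label. `centroids` holds one centroid of
// dimension x.size() per column, in column-major order.
std::pair<unsigned int, double> nearest_centroid(const std::vector<double>& centroids,
                                                 const std::vector<double>& x)
{
    const std::size_t dim = x.size();
    if (dim == 0 || centroids.empty() || centroids.size() % dim != 0) {
        throw std::invalid_argument("x has the wrong dimension");
    }
    const std::size_t number_clusters = centroids.size() / dim;
    unsigned int best_label = 0;
    double best_distance = std::numeric_limits<double>::infinity();
    for (std::size_t c = 0; c < number_clusters; ++c) {
        // Squared Euclidean distance from x to centroid c.
        double distance = 0.0;
        for (std::size_t i = 0; i < dim; ++i) {
            const double d = x[i] - centroids[c * dim + i];
            distance += d * d;
        }
        // Strict comparison: on a tie the lowest label wins (assumption).
        if (distance < best_distance) {
            best_distance = distance;
            best_label = static_cast<unsigned int>(c);
        }
    }
    return {best_label, best_distance};
}
```

With centroids (0, 0) and (4, 0), the point (3, 0) is assigned label 1 with squared distance 1.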
| double ml::Clustering::KMeans::inertia | ( | ) | const | inline |
Sum of squared distances to the nearest centroid.
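As a sketch of what this quantity measures, the function below sums, over all data points, the squared Euclidean distance to the closest centroid. `inertia_of` is a hypothetical helper operating on flat column-major arrays; it does not reflect how the class computes or caches the value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper. `data` and `centroids` hold one vector of dimension
// `dim` per column, in column-major order.
double inertia_of(const std::vector<double>& data,
                  const std::vector<double>& centroids,
                  std::size_t dim)
{
    assert(dim > 0 && !centroids.empty());
    assert(data.size() % dim == 0 && centroids.size() % dim == 0);
    const std::size_t number_points = data.size() / dim;
    const std::size_t number_clusters = centroids.size() / dim;
    double total = 0.0;
    for (std::size_t j = 0; j < number_points; ++j) {
        // Squared distance from point j to its nearest centroid.
        double nearest = std::numeric_limits<double>::infinity();
        for (std::size_t c = 0; c < number_clusters; ++c) {
            double distance = 0.0;
            for (std::size_t i = 0; i < dim; ++i) {
                const double d = data[j * dim + i] - centroids[c * dim + i];
                distance += d * d;
            }
            nearest = std::min(nearest, distance);
        }
        total += nearest;
    }
    return total;
}
```

For the points (0, 0), (1, 0), (5, 0) and centroids (0, 0), (4, 0), the per-point contributions are 0, 1 and 1, giving an inertia of 2.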