#include <k-means-clusterer.h>
This implementation may not always return K clusters. There are two cases where we will return fewer than K clusters:
Both of these situations are generally unlikely, but you should be careful about the assumptions your code makes about the number of returned clusters.
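One defensive pattern is to size any per-cluster structure from centers_size() rather than from the K passed to the constructor. The sketch below is illustrative only: ClusteringResult is a hypothetical stand-in that mimics the accessors documented on this page, and it assumes membership(i) returns a cluster index in the range [0, centers_size()).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in exposing only the accessors documented on this page.
// It is NOT the real KMeansClusterer; it exists so the sketch compiles alone.
struct ClusteringResult {
  int num_centers;               // what centers_size() would report
  std::vector<int> memberships;  // what membership(i) would report

  int centers_size() const { return num_centers; }
  int membership(int index) const { return memberships[index]; }
  int membership_size() const { return static_cast<int>(memberships.size()); }
};

// Counts the points assigned to each cluster. The histogram is sized from
// centers_size(), never from the K that was originally requested.
template <typename Clusterer>
std::vector<int> ClusterSizes(const Clusterer& clusterer) {
  std::vector<int> sizes(clusterer.centers_size(), 0);
  for (int i = 0; i < clusterer.membership_size(); ++i) {
    const int cluster = clusterer.membership(i);
    // Assumption: memberships are zero-based indices into the centers.
    assert(cluster >= 0 && cluster < clusterer.centers_size());
    ++sizes[cluster];
  }
  return sizes;
}
```

Because ClusterSizes is a template, the same function should work with any object that offers these three accessors, whether or not it ended up with K clusters.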
Public Member Functions | |
KMeansClusterer (int num_clusters, int max_iters, const DistanceComputer &distance_computer) | |
void | Cluster (const vector< PointRef > &data) |
Performs the clustering and stores the result internally. | |
const PointSet & | centers () const |
Output the cluster centers. | |
int | centers_size () const |
Return the number of cluster centers. | |
int | membership (int index) const |
Return the membership of the <index>th point. | |
int | membership_size () const |
Return the number of members. Equivalent to the number of points that were clustered. | |
void | WriteToStream (ostream &output_stream) const |
Write the clustering data to a stream. | |
void | WriteToFile (const char *output_filename) const |
Write the clustering data to a file. | |
void | ReadFromStream (istream &input_stream) |
Read clustering data from a stream. | |
void | ReadFromFile (const char *input_filename) |
Read clustering data from a file. | |
Protected Member Functions | |
virtual void | DoClustering (const vector< PointRef > &data) |
Perform K-means. | |
Protected Attributes | |
auto_ptr< PointSet > | cluster_centers_ |
vector< int > | membership_ |
bool | done_ |
KMeansClusterer | ( | int num_clusters, int max_iters, const DistanceComputer & distance_computer | ) |
void DoClustering | ( | const vector< PointRef > & | data | ) | [protected, virtual] |
Perform K-means.
Uses the DistanceComputer it was constructed with to fill cluster_centers_ with K Points representing the K-means cluster centers. K is set by the KMeansClusterer constructor. If there are fewer data points than K, the number of clusters returned is the number of data points, not K.
Implements Clusterer.
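For readers unfamiliar with the algorithm, the following is a minimal, generic sketch of K-means (Lloyd's iteration), not this library's implementation. Every name in it (Point, KMeansResult, SquaredDistance, KMeansSketch) is hypothetical. It assumes squared Euclidean distance in place of a DistanceComputer, seeds the centers with the first points of the input, and leaves a center unchanged if its cluster becomes empty; the real class may differ on all three. It does mirror the documented behavior of capping the cluster count at the number of data points.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Point = std::vector<double>;  // hypothetical stand-in for the library's Point

struct KMeansResult {
  std::vector<Point> centers;
  std::vector<int> membership;  // membership[i] = index of the center of point i
};

// Squared Euclidean distance; stands in for a DistanceComputer (assumption).
double SquaredDistance(const Point& a, const Point& b) {
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d) {
    const double diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

KMeansResult KMeansSketch(const std::vector<Point>& data, int num_clusters,
                          int max_iters) {
  assert(num_clusters > 0);
  KMeansResult result;
  // With fewer points than K, fall back to one cluster per point.
  const std::size_t k =
      std::min<std::size_t>(static_cast<std::size_t>(num_clusters), data.size());
  const std::size_t dim = data.empty() ? 0 : data[0].size();
  // Seed with the first k points (a simplification chosen for determinism).
  result.centers.assign(data.begin(), data.begin() + k);
  result.membership.assign(data.size(), -1);

  for (int iter = 0; iter < max_iters; ++iter) {
    // Assignment step: attach each point to its nearest center.
    bool changed = false;
    for (std::size_t i = 0; i < data.size(); ++i) {
      int best = 0;
      double best_dist = std::numeric_limits<double>::max();
      for (std::size_t c = 0; c < k; ++c) {
        const double dist = SquaredDistance(data[i], result.centers[c]);
        if (dist < best_dist) {
          best_dist = dist;
          best = static_cast<int>(c);
        }
      }
      if (result.membership[i] != best) {
        result.membership[i] = best;
        changed = true;
      }
    }
    if (!changed) break;  // assignments are stable, so the centers are too

    // Update step: move each center to the mean of its assigned points.
    std::vector<Point> sums(k, Point(dim, 0.0));
    std::vector<int> counts(k, 0);
    for (std::size_t i = 0; i < data.size(); ++i) {
      const int c = result.membership[i];
      for (std::size_t d = 0; d < dim; ++d) sums[c][d] += data[i][d];
      ++counts[c];
    }
    for (std::size_t c = 0; c < k; ++c) {
      if (counts[c] == 0) continue;  // empty cluster: keep the old center
      for (std::size_t d = 0; d < dim; ++d) {
        result.centers[c][d] = sums[c][d] / counts[c];
      }
    }
  }
  return result;
}
```

The loop stops either when no point changes cluster or after max_iters passes, whichever comes first, which is the usual role of an iteration cap in K-means.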
void Cluster | ( | const vector< PointRef > & | data | ) | [inherited] |
Performs the clustering and stores the result internally.
To avoid potential memory problems, Clusterers do not operate on PointSetLists or PointSets directly. Rather, they simply shuffle PointRefs around.
const PointSet & centers | ( | ) | const [inherited] |
Output the cluster centers.
This requires Cluster() or ReadFromStream() to have been called first. It returns the set of all cluster centers as Points.
int centers_size | ( | ) | const [inherited] |
Return the number of cluster centers.
This requires Cluster() or ReadFromStream() to have been called first. It returns the number of cluster centers.
int membership | ( | int | index | ) | const [inherited] |
int membership_size | ( | ) | const [inherited] |
Return the number of members. Equivalent to the number of points that were clustered.
void WriteToStream | ( | ostream & | output_stream | ) | const [inherited] |
Write the clustering data to a stream.
Requires Cluster() or ReadFromStream() to have been called first. Output format:
The clustered points themselves are not written to the stream, only the centers and membership data. It is assumed that the caller of Cluster() already has access to those points anyway. This function aborts if the stream is bad.
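Since a bad stream causes an abort, callers may prefer to check the stream state themselves before handing it over. The sketch below shows one way to do that. TryWriteClustering and FakeClusterer are hypothetical names introduced here; the only thing assumed about the real class is the WriteToStream(ostream&) const signature listed above.

```cpp
#include <ostream>
#include <sstream>

// Calls WriteToStream() only if the stream is usable, so the caller can
// recover from a bad stream instead of hitting the abort.
// Returns true if the stream was good both before and after the write.
template <typename Clusterer>
bool TryWriteClustering(const Clusterer& clusterer, std::ostream& output_stream) {
  if (!output_stream.good()) {
    return false;  // nothing is written; the caller decides what to do next
  }
  clusterer.WriteToStream(output_stream);
  return output_stream.good();
}

// Hypothetical stand-in used only to exercise the guard. Its output is a
// placeholder and says nothing about the real output format.
struct FakeClusterer {
  void WriteToStream(std::ostream& output_stream) const {
    output_stream << "clustering-data\n";
  }
};
```

The same guard could wrap ReadFromStream(), which is documented to abort on a bad stream as well.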
void WriteToFile | ( | const char * | output_filename | ) | const [inherited] |
Write the clustering data to a file.
void ReadFromStream | ( | istream & | input_stream | ) | [inherited] |
Read clustering data from a stream.
Can be called in lieu of Cluster(). If this is called after Cluster(), all of the previous data is cleared before reading the new data. For the file format, see WriteToStream. This function aborts if the stream is bad.
void ReadFromFile | ( | const char * | input_filename | ) | [inherited] |
Read clustering data from a file.
auto_ptr<PointSet> cluster_centers_ [protected, inherited] |
vector<int> membership_ [protected, inherited] |
bool done_ [protected, inherited] |