#include <clusterer.h>
A Clusterer can obtain its data in two ways: (1) by performing an actual clustering computation, or (2) by reading pre-clustered data from a stream. This lets you cluster some data, save the results to a file, and later read them back for further processing.
To create your own clusterer, all you need to do is implement the DoClustering() method. The I/O is handled automatically.
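A minimal sketch of that extension point follows. It assumes nothing about the real headers: SketchPoint, SketchClusterer, and TwoCenterClusterer are hypothetical stand-ins, not the library's classes, and centers are simplified to plain doubles instead of a PointSet. The stand-in base mirrors only the documented shape (a public Cluster() that delegates to a protected pure virtual DoClustering()), so the block compiles on its own. Setting done_ after the call is an assumption about what that flag means.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the library's point type: one scalar feature.
struct SketchPoint {
  double value;
};

// Hypothetical stand-in for Clusterer. It mirrors only the documented shape:
// a public Cluster() that delegates to a protected pure virtual DoClustering().
class SketchClusterer {
 public:
  SketchClusterer() : done_(false) {}
  virtual ~SketchClusterer() {}

  // Clears any previous result, runs the subclass hook, then marks completion.
  void Cluster(const std::vector<SketchPoint>& data) {
    centers_.clear();
    membership_.clear();
    DoClustering(data);
    done_ = true;  // Assumption: done_ records that a result is available.
  }

  int centers_size() const { return static_cast<int>(centers_.size()); }
  int membership(int index) const { return membership_[index]; }
  int membership_size() const { return static_cast<int>(membership_.size()); }
  bool done() const { return done_; }

 protected:
  // The only method a subclass has to provide.
  virtual void DoClustering(const std::vector<SketchPoint>& data) = 0;

  std::vector<double> centers_;  // Simplified: the real class holds a PointSet.
  std::vector<int> membership_;
  bool done_;
};

// Toy subclass: two fixed centers (0 and 10); each point joins the nearer one.
class TwoCenterClusterer : public SketchClusterer {
 protected:
  virtual void DoClustering(const std::vector<SketchPoint>& data) {
    centers_.push_back(0.0);
    centers_.push_back(10.0);
    for (std::size_t i = 0; i < data.size(); ++i) {
      membership_.push_back(data[i].value < 5.0 ? 0 : 1);
    }
  }
};
```

A real subclass would derive from Clusterer itself and fill the protected attributes documented below; the toy rule above only shows where the clustering logic would go.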
Public Member Functions
Clusterer ()
virtual ~Clusterer ()
void Cluster (const vector< PointRef > &data)
    Performs the clustering and stores the result internally.
const PointSet & centers () const
    Output the cluster centers.
int centers_size () const
    Return the number of cluster centers.
int membership (int index) const
    Return the membership of the <index>th point.
int membership_size () const
    Return the number of members. Equivalent to the number of points that were clustered.
void WriteToStream (ostream &output_stream) const
    Write the clustering data to a stream.
void WriteToFile (const char *output_filename) const
    Write the clustering data to a file.
void ReadFromStream (istream &input_stream)
    Read clustering data from a stream.
void ReadFromFile (const char *input_filename)
    Read clustering data from a file.
Protected Member Functions
virtual void DoClustering (const vector< PointRef > &data)=0
    Performs the actual clustering.
Protected Attributes
auto_ptr< PointSet > cluster_centers_
vector< int > membership_
bool done_
Clusterer()
virtual ~Clusterer() [inline, virtual]
void Cluster(const vector<PointRef>& data)
Performs the clustering and stores the result internally.
To avoid potential memory problems, Clusterers do not operate on PointSetLists or PointSets directly. Rather, they simply shuffle PointRefs around.
const PointSet& centers() const
Output the cluster centers.
This requires Cluster() or ReadFromStream() to have been called first. It returns the set of all cluster centers as Points.
int centers_size() const
Return the number of cluster centers.
This requires Cluster() or ReadFromStream() to have been called first. It returns the number of cluster centers.
int membership(int index) const
int membership_size() const
Return the number of members. Equivalent to the number of points that were clustered.
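As an illustration of how membership() and membership_size() might be consumed together, the sketch below groups point indices by cluster. GroupByCluster is a hypothetical helper, not part of the library, and it assumes that each membership value is the index of a cluster center in the range [0, centers_size()). It takes a plain vector so that it compiles without the library headers; with a real Clusterer, the vector would be filled by calling membership(i) for every i below membership_size().

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library): builds, for each cluster,
// the list of point indices assigned to it. Membership values outside
// [0, num_centers) are skipped rather than trusted.
std::vector<std::vector<int> > GroupByCluster(
    const std::vector<int>& membership, int num_centers) {
  std::vector<std::vector<int> > groups(num_centers > 0 ? num_centers : 0);
  for (std::size_t i = 0; i < membership.size(); ++i) {
    const int cluster = membership[i];
    if (cluster >= 0 && cluster < num_centers) {
      groups[cluster].push_back(static_cast<int>(i));
    }
  }
  return groups;
}
```

Because the result holds only indices, the caller can look up the actual points in the same vector that was passed to Cluster().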
void WriteToStream(ostream& output_stream) const
Write the clustering data to a stream.
Requires Cluster() or ReadFromStream() to have been called first. Output format:
The clustered points themselves are not written to the stream, only the centers and membership data. It is assumed that the caller of Cluster() already has access to those points anyway. This function aborts if the stream is bad.
void WriteToFile(const char* output_filename) const
Write the clustering data to a file.
void ReadFromStream(istream& input_stream)
Read clustering data from a stream.
Can be called in lieu of Cluster(). If this is called after Cluster(), all of the previous data is cleared before reading the new data. For the file format, see WriteToStream. This function aborts if the stream is bad.
void ReadFromFile(const char* input_filename)
Read clustering data from a file.
virtual void DoClustering(const vector<PointRef>& data) [protected, pure virtual]
Performs the actual clustering.
DoClustering is responsible for three things:
Implemented in KMeansClusterer.
auto_ptr<PointSet> cluster_centers_ [protected]
vector<int> membership_ [protected]
bool done_ [protected]