Clusterer Class Reference

#include <clusterer.h>

Inherited by KMeansClusterer.

Detailed Description

Abstract interface for a flat clusterer.

A Clusterer can get its data in two ways: by (1) performing an actual clustering computation, or (2) reading pre-clustered data from a stream. This lets you cluster some data, save the results to a file, and later read them back for further processing.

To create your own clusterer, all you need to do is implement the DoClustering() method. The I/O is handled automatically.


Public Member Functions

 Clusterer ()
virtual ~Clusterer ()
void Cluster (const vector< PointRef > &data)
 Performs the clustering and stores the result internally.
const PointSet & centers () const
 Output the cluster centers.
int centers_size () const
 Return the number of cluster centers.
int membership (int index) const
 Return the membership of the <index>th point.
int membership_size () const
 Return the number of members. Equivalent to the number of points that were clustered.
void WriteToStream (ostream &output_stream) const
 Write the clustering data to a stream.
void WriteToFile (const char *output_filename) const
 Write the clustering data to a file.
void ReadFromStream (istream &input_stream)
 Read clustering data from a stream.
void ReadFromFile (const char *input_filename)
 Read clustering data from a file.

Protected Member Functions

virtual void DoClustering (const vector< PointRef > &data)=0
 Performs the actual clustering.

Protected Attributes

auto_ptr< PointSet > cluster_centers_
vector< int > membership_
bool done_


Constructor & Destructor Documentation

Clusterer (  ) 

virtual ~Clusterer (  )  [inline, virtual]


Member Function Documentation

void Cluster ( const vector< PointRef > &  data  ) 

Performs the clustering and stores the result internally.

To avoid potential memory problems, Clusterers do not operate on PointSetLists or PointSets directly. Rather, they simply shuffle PointRefs around.

See also:
PointSetList::GetPointRefs()

const PointSet & centers (  )  const

Output the cluster centers.

This requires Cluster() or ReadFromStream() to have been called first. It returns the set of all cluster centers as Points.

int centers_size (  )  const

Return the number of cluster centers.

This requires Cluster() or ReadFromStream() to have been called first. It returns the number of cluster centers.

int membership ( int  index  )  const

Return the membership of the <index>th point.

The return value gives the ID of the cluster that this point belongs to. "ID" in this sense means an index into the PointSet returned by centers().
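Because each membership value is an index into centers(), the points can be grouped per cluster using only the accessors documented on this page. The following is a minimal sketch, not part of libpmk: GroupByCluster and FakeClustering are hypothetical names, and FakeClustering is only a stand-in so the example compiles without the library.

```cpp
#include <cassert>
#include <vector>

// Groups point indices by cluster ID. Works with any type that exposes
// centers_size(), membership_size() and membership(int), as Clusterer does.
template <typename ClustererT>
std::vector<std::vector<int> > GroupByCluster(const ClustererT& clusterer) {
  std::vector<std::vector<int> > groups(clusterer.centers_size());
  for (int i = 0; i < clusterer.membership_size(); ++i) {
    const int id = clusterer.membership(i);  // index into centers()
    assert(id >= 0 && id < clusterer.centers_size());
    groups[id].push_back(i);
  }
  return groups;
}

// Hypothetical stand-in exposing the same three accessors (illustration only).
struct FakeClustering {
  int num_centers;
  std::vector<int> ids;
  int centers_size() const { return num_centers; }
  int membership_size() const { return static_cast<int>(ids.size()); }
  int membership(int index) const { return ids[index]; }
};
```

With a real Clusterer, groups[k] would hold the indices of all points whose nearest stored center is centers()[k], assuming the clusterer assigns IDs as described above.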

int membership_size (  )  const

Return the number of members. Equivalent to the number of points that were clustered.

void WriteToStream ( ostream &  output_stream  )  const

Write the clustering data to a stream.

Requires Cluster() or ReadFromStream() to have been called first. Output format:

void WriteToFile ( const char *  output_filename  )  const

Write the clustering data to a file.

void ReadFromStream ( istream &  input_stream  ) 

Read clustering data from a stream.

Can be called in lieu of Cluster(). If this is called after Cluster(), all of the previous data is cleared before reading the new data. For the file format, see WriteToStream. This function aborts if the stream is bad.

See also:
WriteToStream.
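The write-then-read workflow can be sketched with an in-memory stream. CopyThroughStream below calls only WriteToStream() and ReadFromStream() as documented here. ToyClustering is a hypothetical stand-in so the sketch compiles without libpmk, and its text layout is invented for the example; it is not the libpmk file format.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Serializes an already-populated clusterer and loads the result into another.
template <typename ClustererT>
void CopyThroughStream(const ClustererT& source, ClustererT* restored) {
  std::stringstream buffer;
  source.WriteToStream(buffer);
  restored->ReadFromStream(buffer);
}

// Hypothetical stand-in. Made-up layout: a count, then one cluster ID per point.
struct ToyClustering {
  std::vector<int> ids;
  int membership_size() const { return static_cast<int>(ids.size()); }
  int membership(int index) const { return ids[index]; }
  void WriteToStream(std::ostream& out) const {
    out << ids.size() << '\n';
    for (std::size_t i = 0; i < ids.size(); ++i) out << ids[i] << '\n';
  }
  void ReadFromStream(std::istream& in) {
    ids.clear();  // previous data is cleared before reading, as documented
    std::size_t count = 0;
    in >> count;
    for (std::size_t i = 0; i < count; ++i) {
      int id = 0;
      in >> id;
      ids.push_back(id);
    }
  }
};
```

Any data the destination held before the call is gone afterwards, which mirrors the clearing behavior described for ReadFromStream().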

void ReadFromFile ( const char *  input_filename  ) 

Read clustering data from a file.

virtual void DoClustering ( const vector< PointRef > &  data  )  [protected, pure virtual]

Performs the actual clustering.

DoClustering is responsible for three things:

  1. Filling up cluster_centers_
  2. Filling up membership_ with a number of elements equal to data.size()
  3. Setting done_ to true.

Implemented in KMeansClusterer.
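The three responsibilities can be illustrated with a toy subclass. This is a sketch under stated assumptions, not libpmk code: Point, PointRef, PointSet and the reduced Clusterer base are stand-ins written for the example, std::unique_ptr replaces the auto_ptr member so the code builds with current compilers, and the state reset inside Cluster() is an assumption about what the real base class does. SignClusterer is a hypothetical clusterer that splits 1-D points by sign.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in types for illustration only; the real libpmk types differ.
struct Point { double x; };
typedef const Point* PointRef;
typedef std::vector<Point> PointSet;

// Reduced stand-in for the Clusterer base class (I/O methods omitted).
class Clusterer {
 public:
  Clusterer() : done_(false) {}
  virtual ~Clusterer() {}
  void Cluster(const std::vector<PointRef>& data) {
    cluster_centers_.reset(new PointSet());  // assumed reset, not confirmed
    membership_.clear();
    done_ = false;
    DoClustering(data);
    assert(done_ && membership_.size() == data.size());
  }
  const PointSet& centers() const { return *cluster_centers_; }
  int centers_size() const { return static_cast<int>(cluster_centers_->size()); }
  int membership(int index) const { return membership_[index]; }
  int membership_size() const { return static_cast<int>(membership_.size()); }

 protected:
  virtual void DoClustering(const std::vector<PointRef>& data) = 0;
  std::unique_ptr<PointSet> cluster_centers_;  // auto_ptr in the original
  std::vector<int> membership_;
  bool done_;
};

// Hypothetical clusterer: cluster 0 holds negative x, cluster 1 the rest.
class SignClusterer : public Clusterer {
 protected:
  void DoClustering(const std::vector<PointRef>& data) override {
    double sum[2] = {0.0, 0.0};
    int count[2] = {0, 0};
    // 2. Fill membership_ with exactly data.size() entries.
    for (std::size_t i = 0; i < data.size(); ++i) {
      const int id = data[i]->x < 0.0 ? 0 : 1;
      membership_.push_back(id);
      sum[id] += data[i]->x;
      ++count[id];
    }
    // 1. Fill cluster_centers_ with the mean of each group (0 if empty).
    for (int id = 0; id < 2; ++id) {
      Point center;
      center.x = count[id] > 0 ? sum[id] / count[id] : 0.0;
      cluster_centers_->push_back(center);
    }
    // 3. Mark the clustering as complete.
    done_ = true;
  }
};
```

Callers only ever invoke Cluster(); the subclass supplies the algorithm and leaves the accessors to the base class.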


Member Data Documentation

auto_ptr<PointSet> cluster_centers_ [protected]

vector<int> membership_ [protected]

bool done_ [protected]


Generated on Fri Sep 21 11:39:05 2007 for libpmk2 by  doxygen 1.5.1