HierarchicalClusterer Class Reference

#include <hierarchical-clusterer.h>



Detailed Description

Hierarchical K-Means clusterer.

This clusterer does not use the Clusterer interface because its output involves multiple levels. However, like Clusterer, HierarchicalClusterer can obtain its data either by running hierarchical K-means or by reading already-processed data. The difference is that the cluster centers are returned as a tree (see centers()), where each level of the tree gives the centers for one level of the hierarchy.


Public Member Functions

 HierarchicalClusterer ()
const Tree< PointTreeNode > & centers () const
 Get the cluster centers.
int membership (int index) const
 Report which leaf the <index>th point belongs to.
int membership_size () const
 Return the number of points that were clustered.
void IdentifyMemberIDPath (int index, list< int > *out) const
 Get the set of node IDs in the hierarchy that a point belongs to.
void IdentifyMemberTreePath (int index, list< int > *out) const
 Get a path down the hierarchy that a point belongs to.
void Cluster (int num_levels, int branch_factor, const vector< PointRef > &points, const DistanceComputer &distance_computer)
 Performs the actual clustering and stores the result internally.
void WriteToStream (ostream &output_stream) const
 Write the clustered data to a stream.
void WriteToFile (const char *output_filename) const
 Write the clustered data to a file.
void ReadFromStream (istream &input_stream)
 Read clustered data from a stream.
void ReadFromFile (const char *input_filename)
 Read the clustered data from a file.

Static Public Attributes

static const int MAX_ITERATIONS
 Default 100.


Constructor & Destructor Documentation

HierarchicalClusterer (  ) 


Member Function Documentation

const Tree< PointTreeNode > & centers (  )  const

Get the cluster centers.

Must be called after Cluster() or ReadFromStream(). Returns a const reference to the tree of cluster centers. Each PointTreeNode holds a Point, which represents the cluster center.

See also:
Tree

int membership ( int  index  )  const

Report which leaf the <index>th point belongs to.

Returns the node ID of the leaf that this point belongs to. This ID can be used for lookups in the centers() tree.

See also:
IdentifyMemberIDPath()

IdentifyMemberTreePath()

int membership_size (  )  const [inline]

Return the number of points that were clustered.

void IdentifyMemberIDPath ( int  index,
list< int > *  out 
) const

Get the set of node IDs in the hierarchy that a point belongs to.

Outputs a list with size equal to the depth of the tree (or, if the tree is not balanced, a size equal to the depth of the leaf node associated with the point). The elements of the list are the IDs of the nodes this point belongs to at each level, starting with the root and ending at the leaf. The first element is therefore the ID of the root, which currently will always be 0.

See also:
IdentifyMemberTreePath()
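
To make the root-first ordering concrete, here is a minimal standalone sketch. It does not use libpmk: ToyNode and ToyIDPath are hypothetical names, and the parent-pointer layout is an assumption made only for illustration.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical stand-in for a tree node (not libpmk's PointTreeNode).
// Only the parent ID matters here; a parent of -1 marks the root.
struct ToyNode {
  int parent;
};

// Walk from a leaf up to the root, pushing each ID at the front so the
// resulting list reads root first, leaf last.
std::list<int> ToyIDPath(const std::vector<ToyNode>& nodes, int leaf_id) {
  assert(leaf_id >= 0 && leaf_id < static_cast<int>(nodes.size()));
  std::list<int> path;
  for (int id = leaf_id; id != -1; id = nodes[id].parent) {
    path.push_front(id);
  }
  return path;
}
```

For a toy tree where node 0 is the root, nodes 1 and 2 are its children, and node 4 is a child of node 1, ToyIDPath(nodes, 4) yields the list 0, 1, 4.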

void IdentifyMemberTreePath ( int  index,
list< int > *  out 
) const

Get a path down the hierarchy that a point belongs to.

Outputs a list representing a path down the tree. If the leaf is at depth n, then the output list has size (n-1). The first element of the list specifies which child of the root (by index) to go down. The second element specifies which child to go to after that, and so on. Each value is the index of the child link to traverse, so it is bounded by B (the branch factor), since each node has at most B children.

This method is useful for converting to sparse representations, such as Pyramids.

See also:
IdentifyMemberIDPath()
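
The relationship between the two kinds of path can be sketched as follows. This is a standalone illustration, not libpmk code: ToyChildren and ToyTreePath are hypothetical names, and 0-based child indices are an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical toy tree: children[id] lists the child IDs of node `id`
// in link order. This is not libpmk's Tree class.
using ToyChildren = std::vector<std::vector<int>>;

// Convert a root-first ID path into child-link indices (assumed 0-based).
// An ID path with n entries produces n-1 link indices.
std::list<int> ToyTreePath(const ToyChildren& children,
                           const std::list<int>& id_path) {
  std::list<int> out;
  if (id_path.empty()) return out;
  std::list<int>::const_iterator it = id_path.begin();
  int current = *it;
  for (++it; it != id_path.end(); ++it) {
    const std::vector<int>& kids = children[current];
    int link = -1;
    for (std::size_t i = 0; i < kids.size(); ++i) {
      if (kids[i] == *it) link = static_cast<int>(i);
    }
    assert(link >= 0);  // each ID must be a child of the previous one
    out.push_back(link);
    current = *it;
  }
  return out;
}
```

With children = {{1, 2}, {3, 4}, {5}, {}, {}, {}}, the ID path 0, 1, 4 becomes the link path 0, 1: take child 0 of the root, then child 1 of that node.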

void Cluster ( int  num_levels,
int  branch_factor,
const vector< PointRef > &  points,
const DistanceComputer &  distance_computer 
)

Performs the actual clustering and stores the result internally.
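
As a rough illustration of what hierarchical K-means involves, the sketch below clusters 1-D values recursively. It is not libpmk's implementation. All names (ToyCenter, ToyResult, ToyCluster, kToyMaxIterations) are hypothetical, the evenly spaced seeding is an arbitrary choice, and it assumes that num_levels counts the root level and that the iteration cap plays the role of MAX_ITERATIONS.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct ToyCenter { double value; int parent; };  // parent == -1 marks the root
struct ToyResult {
  std::vector<ToyCenter> tree;  // node ID == index into this vector
  std::vector<int> membership;  // leaf node ID for each input point
};
const int kToyMaxIterations = 100;  // assumed analogue of MAX_ITERATIONS

static double ToyMean(const std::vector<double>& pts, const std::vector<int>& idx) {
  double sum = 0.0;
  for (int i : idx) sum += pts[i];
  return sum / static_cast<double>(idx.size());
}

// Flat k-means over the points selected by idx; returns k groups of indices.
static std::vector<std::vector<int>> ToyKMeans(const std::vector<double>& pts,
                                               const std::vector<int>& idx, int k) {
  std::vector<double> centers(k);
  for (int c = 0; c < k; ++c) centers[c] = pts[idx[c * idx.size() / k]];  // spaced seeds
  std::vector<int> assign(idx.size(), -1);
  for (int iter = 0; iter < kToyMaxIterations; ++iter) {
    bool changed = false;
    for (std::size_t m = 0; m < idx.size(); ++m) {  // assignment step
      int best = 0;
      for (int c = 1; c < k; ++c) {
        if (std::fabs(pts[idx[m]] - centers[c]) < std::fabs(pts[idx[m]] - centers[best])) best = c;
      }
      if (best != assign[m]) { assign[m] = best; changed = true; }
    }
    if (!changed) break;  // converged
    for (int c = 0; c < k; ++c) {  // update step; empty clusters keep their center
      double sum = 0.0;
      int n = 0;
      for (std::size_t m = 0; m < idx.size(); ++m) {
        if (assign[m] == c) { sum += pts[idx[m]]; ++n; }
      }
      if (n > 0) centers[c] = sum / n;
    }
  }
  std::vector<std::vector<int>> groups(k);
  for (std::size_t m = 0; m < idx.size(); ++m) groups[assign[m]].push_back(idx[m]);
  return groups;
}

// Split one node into at most `branch` children, then recurse on each child.
static void ToySplit(const std::vector<double>& pts, const std::vector<int>& idx,
                     int node_id, int levels_left, int branch, ToyResult* out) {
  if (levels_left == 0 || idx.size() <= 1) {  // this node is a leaf
    for (int i : idx) out->membership[i] = node_id;
    return;
  }
  int k = std::min(branch, static_cast<int>(idx.size()));
  for (const std::vector<int>& group : ToyKMeans(pts, idx, k)) {
    if (group.empty()) continue;
    out->tree.push_back({ToyMean(pts, group), node_id});
    ToySplit(pts, group, static_cast<int>(out->tree.size()) - 1, levels_left - 1, branch, out);
  }
}

ToyResult ToyCluster(int num_levels, int branch_factor, const std::vector<double>& pts) {
  assert(num_levels >= 1 && branch_factor >= 1 && !pts.empty());
  ToyResult out;
  out.membership.assign(pts.size(), 0);
  std::vector<int> all(pts.size());
  for (std::size_t i = 0; i < pts.size(); ++i) all[i] = static_cast<int>(i);
  out.tree.push_back({ToyMean(pts, all), -1});  // root: mean of all points, ID 0
  ToySplit(pts, all, 0, num_levels - 1, branch_factor, &out);
  return out;
}
```

Calling ToyCluster(2, 2, {0.0, 0.1, 10.0, 10.1}) produces a root plus two leaves, with the first two points sharing one leaf and the last two sharing the other.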

void WriteToStream ( ostream &  output_stream  )  const

Write the clustered data to a stream.

Must be called after Cluster() or ReadFromStream(). File format:

This function will abort if the stream is bad.

void WriteToFile ( const char *  output_filename  )  const

Write the clustered data to a file.

void ReadFromStream ( istream &  input_stream  ) 

Read clustered data from a stream.

Can be called in lieu of Cluster() to load preprocessed data. See WriteToStream() for the format. This function will abort if the stream is bad.

See also:
WriteToStream()

void ReadFromFile ( const char *  input_filename  ) 

Read the clustered data from a file.


Member Data Documentation

const int MAX_ITERATIONS [static]

Default 100.


The documentation for this class was generated from the following files:
Generated on Fri Sep 21 11:39:05 2007 for libpmk2 by  doxygen 1.5.1