#include <hierarchical-clusterer.h>
This clusterer does not use the Clusterer interface because its output involves multiple levels. However, like Clusterer, HierarchicalClusterer can obtain its data either by actually performing hierarchical K-means or by reading already-processed data. The difference is that the cluster centers are returned as a tree (a Tree< PointTreeNode >; see centers()) rather than a flat set, with each level of the tree holding the centers for that level of the hierarchy.
Public Member Functions
HierarchicalClusterer ()
const Tree< PointTreeNode > & centers () const
Get the cluster centers.
int membership (int index) const
Report which leaf the <index>th point belongs to.
int membership_size () const
Return the number of points that were clustered.
void IdentifyMemberIDPath (int index, list< int > *out) const
Get the set of node IDs in the hierarchy that a point belongs to.
void IdentifyMemberTreePath (int index, list< int > *out) const
Get a path down the hierarchy that a point belongs to.
void Cluster (int num_levels, int branch_factor, const vector< PointRef > &points, const DistanceComputer &distance_computer)
Performs the actual clustering and stores the result internally.
void WriteToStream (ostream &output_stream) const
Write the clustered data to a stream.
void WriteToFile (const char *output_filename) const
Write the clustered data to a file.
void ReadFromStream (istream &input_stream)
Read clustered data from a stream.
void ReadFromFile (const char *input_filename)
Read the clustered data from a file.
Static Public Attributes
static const int MAX_ITERATIONS
Default 100.
const Tree< PointTreeNode > & centers () const
Get the cluster centers.
Must be called after Cluster() or ReadFromStream(). Returns a const reference to the tree of cluster centers. Each PointTreeNode holds a Point, which represents the cluster center.
int membership (int index) const
Report which leaf the <index>th point belongs to.
Returns the node ID of the leaf that this point belongs to. This ID can be used for lookups in the centers() tree.
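Because membership() maps each point to a leaf ID, one common use is to tally how many points fall into each leaf. The sketch below illustrates that idea on a plain vector of leaf IDs; the vector and the function name CountPointsPerLeaf are hypothetical stand-ins for calling membership(i) for each i below membership_size(), not part of this class.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in: membership[i] plays the role of membership(i),
// i.e. the leaf node ID that point i was assigned to.
std::map<int, int> CountPointsPerLeaf(const std::vector<int>& membership) {
  std::map<int, int> counts;  // leaf node ID -> number of points in it
  for (int leaf_id : membership) {
    assert(leaf_id >= 0);  // Node IDs are assumed non-negative here.
    ++counts[leaf_id];
  }
  return counts;
}
```

Only leaves that actually contain points appear in the resulting map, which keeps the representation sparse.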
int membership_size () const [inline]
Return the number of points that were clustered.
void IdentifyMemberIDPath (int index, list< int > *out) const
Get the set of node IDs in the hierarchy that a point belongs to.
Outputs a list whose size equals the depth of the tree (or, if the tree is not balanced, the depth of the leaf node associated with the point). The elements are the IDs of the nodes that the point belongs to at each level, starting with the root and ending at the leaf. The first element is the ID of the root, which is currently always 0.
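To make the root-to-leaf ordering concrete, here is a minimal sketch on a toy hierarchy. It assumes a hypothetical parent array (parent[i] is the ID of node i's parent, with -1 for the root) instead of the library's Tree type, and the function name IdPathToLeaf is invented for illustration.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Hypothetical toy hierarchy: parent[i] is the ID of node i's parent,
// and the root (ID 0) has parent -1. This is not the library's Tree type.
std::list<int> IdPathToLeaf(const std::vector<int>& parent, int leaf_id) {
  std::list<int> path;
  for (int id = leaf_id; id != -1; id = parent[id]) {
    assert(id >= 0 && id < static_cast<int>(parent.size()));
    path.push_front(id);  // Prepend so the root ends up first.
  }
  return path;
}
```

For a seven-node binary tree with parent = {-1, 0, 0, 1, 1, 2, 2}, the path for leaf 4 is 0, 1, 4: one ID per level, root first.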
void IdentifyMemberTreePath (int index, list< int > *out) const
Get a path down the hierarchy that a point belongs to.
Outputs a list representing a path down the tree. If the leaf is at depth n, the list has size (n-1). The first element is the index of the root's child to descend into, the second element is the index of that node's child to descend into next, and so on. Each element is an index into a node's child links and is bounded by B (the branch factor), since each node has at most B children.
This method is useful for converting to sparse representations, such as Pyramids.
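The relationship between a path of node IDs and a path of child indices can be sketched as follows. The children table (children[id] lists the child IDs of node id), the zero-based child indexing, and the function name TreePathFromIdPath are all assumptions made for this illustration; the class itself may store and number things differently.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical toy hierarchy: children[id] lists the child IDs of node id.
// Converts a root-to-leaf list of node IDs into a list of child indices,
// so an ID path of size n yields a tree path of size n-1.
std::list<int> TreePathFromIdPath(
    const std::vector<std::vector<int>>& children,
    const std::list<int>& id_path) {
  std::list<int> tree_path;
  if (id_path.empty()) return tree_path;
  auto parent = id_path.begin();
  for (auto child = std::next(parent); child != id_path.end();
       ++parent, ++child) {
    const std::vector<int>& kids = children[*parent];
    auto it = std::find(kids.begin(), kids.end(), *child);
    assert(it != kids.end());  // Each ID must be a child of the previous one.
    // Zero-based position of the child link (an assumption of this sketch).
    tree_path.push_back(static_cast<int>(it - kids.begin()));
  }
  return tree_path;
}
```

With children = {{1, 2}, {3, 4}, {5, 6}, {}, {}, {}, {}} and the ID path 0, 1, 4, the result is 0, 1: take the root's first child, then that node's second child.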
void Cluster (int num_levels, int branch_factor, const vector< PointRef > &points, const DistanceComputer &distance_computer)
Performs the actual clustering and stores the result internally.
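As a rough illustration of what hierarchical K-means does with num_levels and branch_factor, here is a self-contained sketch on 1-D values. It is not the library's implementation: ToyNode, KMeans1D, Split and BuildToyHierarchy are hypothetical names, the root is assumed to count as the first level, centers are seeded from the first values in each subset, and the cap of 100 iterations is only presumed to correspond to MAX_ITERATIONS.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical node type for this sketch; not the library's PointTreeNode.
struct ToyNode {
  double center;
  std::vector<int> children;  // IDs of child nodes.
};

// Plain Lloyd's k-means on 1-D values; returns each value's cluster index.
// Centers are seeded from the first k values (an arbitrary choice here).
std::vector<int> KMeans1D(const std::vector<double>& values, int k,
                          std::vector<double>* centers, int max_iterations) {
  assert(k >= 1 && k <= static_cast<int>(values.size()));
  centers->assign(values.begin(), values.begin() + k);
  std::vector<int> label(values.size(), -1);
  for (int iter = 0; iter < max_iterations; ++iter) {
    bool changed = false;
    for (std::size_t i = 0; i < values.size(); ++i) {
      int best = 0;  // Ties go to the lower cluster index.
      for (int c = 1; c < k; ++c) {
        if (std::fabs(values[i] - (*centers)[c]) <
            std::fabs(values[i] - (*centers)[best])) {
          best = c;
        }
      }
      if (label[i] != best) { label[i] = best; changed = true; }
    }
    if (!changed) break;  // Assignments are stable, so we have converged.
    std::vector<double> sum(k, 0.0);
    std::vector<int> count(k, 0);
    for (std::size_t i = 0; i < values.size(); ++i) {
      sum[label[i]] += values[i];
      ++count[label[i]];
    }
    for (int c = 0; c < k; ++c) {
      if (count[c] > 0) (*centers)[c] = sum[c] / count[c];
    }
  }
  return label;
}

// Splits the values under node_id into up to branch_factor children,
// then recurses until no levels remain or too few values are left.
void Split(const std::vector<double>& values, int node_id, int levels_left,
           int branch_factor, std::vector<ToyNode>* tree) {
  if (levels_left == 0 ||
      static_cast<int>(values.size()) < branch_factor) return;
  std::vector<double> centers;
  std::vector<int> label = KMeans1D(values, branch_factor, &centers, 100);
  for (int c = 0; c < branch_factor; ++c) {
    std::vector<double> subset;
    for (std::size_t i = 0; i < values.size(); ++i) {
      if (label[i] == c) subset.push_back(values[i]);
    }
    if (subset.empty()) continue;
    int child_id = static_cast<int>(tree->size());
    tree->push_back(ToyNode{centers[c], {}});
    (*tree)[node_id].children.push_back(child_id);
    Split(subset, child_id, levels_left - 1, branch_factor, tree);
  }
}

// Builds the hierarchy; node 0 is the root, whose center is the overall mean.
std::vector<ToyNode> BuildToyHierarchy(const std::vector<double>& values,
                                       int num_levels, int branch_factor) {
  assert(!values.empty() && num_levels >= 1 && branch_factor >= 1);
  double sum = 0.0;
  for (double v : values) sum += v;
  std::vector<ToyNode> tree;
  tree.push_back(ToyNode{sum / values.size(), {}});
  Split(values, 0, num_levels - 1, branch_factor, &tree);
  return tree;
}
```

Running K-means again inside each cluster, rather than once with a large K, is what produces one set of centers per level.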
void WriteToStream (ostream &output_stream) const
Write the clustered data to a stream.
Must be called after Cluster() or ReadFromStream(). File format:
This function will abort if the stream is bad.
void WriteToFile (const char *output_filename) const
Write the clustered data to a file.
void ReadFromStream (istream &input_stream)
Read clustered data from a stream.
Can be called in lieu of Cluster() to load preprocessed data. See WriteToStream() for the format. This function will abort if the stream is bad.
void ReadFromFile (const char *input_filename)
Read the clustered data from a file.
const int MAX_ITERATIONS [static]
Default 100.