libpmk Documentation

Contents

  1. Unpacking and Compiling the Code
  2. Quick Start: Overview of Packaged Tools
  3. Using libpmk
    1. Point sets
    2. Clustering
    3. Creating pyramids
    4. Matching pyramids
  4. Useful Extra Features
    1. Kernel Matrices
    2. Selectors and Experiments
    3. Tree structures
      1. Sparse trees
      2. Dense trees
  5. Frequently Asked Questions
    1. How can I improve performance with uniform bins?
    2. What is the difference between UniformPyramidMaker and VGPyramidMaker? When should I use which?
    3. Why is UniformPyramidMaker slow compared to the previously reported speeds?
    4. Does this work on Windows? What about Cygwin?
    5. I am having trouble generating my own PointSetList files.

1. Unpacking and Compiling the Code

The following sequence of commands will unpack and compile libpmk.
$ tar zxvf libpmk-1.0.tar.gz
$ cd libpmk-1.0
$ make libpmk
$ make libpmk_util
$ make tools

There are two libraries to compile. The first, libpmk.o, contains the core PMK functionality. The second, libpmk_util.o, contains code that makes it easier to write experiments. To compile them, run 'make libpmk' and 'make libpmk_util', respectively.

There are also a number of executables in this source package, all in the "tools" directory. These are general-purpose tools, mostly meant to give users a starting point for writing their own code, but they can also be used as-is to run useful tests. To compile them, run 'make tools'.


2. Quick Start: Overview of Packaged Tools

This section gives a step-by-step example of generating a vocabulary-guided Pyramid Match Kernel matrix from some given data. There are two required inputs for this chain of commands: (1) a file we will call "training.psl" which contains features for the images we will build a vocabulary tree over, and (2) a file called "data.psl" which contains features for the images that the kernel matrix will be generated for. For information on the format of these files, see the description of PointSetList in the class reference.

  1. Run hierarchical k-means to generate a vocabulary tree
    $ ./hierarchical-cluster-point-set.out training.psl clusters.hc 4 10
    

    This command will cluster the features in training.psl. It will create a vocabulary tree of depth 4 and branch factor 10 (the root node counts as a level, i.e., the bottom level of the tree will have 10^3 nodes). It will store the result of the clustering in clusters.hc.

  2. Use the vocabulary tree to create a pyramid for each image
    $ ./clusters-to-pyramids.out data.psl clusters.hc pyramids.mrh
    

    The clusters.hc file from the previous step sets up the parameters for the pyramids we are about to make. The output, pyramids.mrh, will contain one pyramid for each image in data.psl.

  3. Compute the kernel values from the pyramids
    $ ./pyramid-match-kernel.out pyramids.mrh output.kern
    

    This command will take each pyramid and compute the kernel values pairwise, writing the result to output.kern.


3. Using libpmk

3.1. Point sets

Most of the time, the input data we will be working with will be organized in a PointSetList (the format of this file is specified in the class reference). As the name suggests, it is simply a list of PointSet objects, which, in turn, are simply lists of Point objects. PointSet and PointSetList require that all of the Points contained within them have the same dimension.

PointSetList is an abstract class. libpmk provides two implementations: MutablePointSetList and OnDiskPointSetList. A MutablePointSetList resides entirely in memory, and the user is free to add and remove PointSets and Points. The following code segment shows how to load data into a MutablePointSetList and print out the number of points there are in each point set.

// Load the point set list
MutablePointSetList psl;
psl.ReadFromFile("file_to_load.psl");

for (int ii = 0; ii < psl.size(); ++ii) {
   printf("Point set %d has %d points\n", ii, psl.point_set(ii).size());
}

The OnDiskPointSetList is read-only and is suitable for loading data that is too large to fit in memory. For performance, OnDiskPointSetList operates using two caches: an LRU cache (of sorts), which remembers the most recently used PointSets, and what we call an "area" cache, which holds a contiguous block of PointSets. If a particular application is known to access data sequentially, it is generally better to make the LRU cache small (size 1) and the area cache large. If the application mostly uses random access, it is better to have a large LRU cache and a very small area cache.

string input_file = "file_to_load.psl";

// We are printing things sequentially, so make the LRU cache small (size 1)
OnDiskPointSetList psl(input_file, 1, 100);

for (int ii = 0; ii < psl.size(); ++ii) {
   printf("Point set %d has %d points\n", ii, psl.point_set(ii).size());
}

3.2. Clustering

The clustering implementation given by this library is not specific to libpmk, but it does provide the functionality required to create the pyramids and may be useful for other related tasks. libpmk currently only provides k-means clustering, but it is easy to implement your own clustering method in this framework by subclassing the abstract Clusterer class.

All of the clustering code operates on lists of PointRef objects (rather than PointSets or PointSetLists). This is because we may want to cluster more than just the points in one PointSetList, or we may only want to run the clustering on a subset of the points. A PointRef can be thought of as simply a pointer referencing a Point. The PointSetList class provides functionality to generate PointRefs for you.

This example shows the use of PointSetList::ReadFromStream(), which gives some flexibility in where the data comes from (it does not necessarily have to be stored in a file, although in this case we use a file stream).

// Load a point set list (may also use OnDiskPointSetList if you want)
MutablePointSetList psl;
ifstream input_stream(input_file.c_str(), ios::binary);
psl.ReadFromStream(input_stream);
input_stream.close();

// Get a list of PointRefs to all of the points in psl
vector<PointRef> point_refs;
psl.GetPointRefs(&point_refs);

// Initialize the distance function we're using for the clustering
L2DistanceComputer dist_computer;

// Create a clusterer which will terminate after 500 iterations
KMeansClusterer clusterer(NUM_CLUSTERS, 500, dist_computer);
clusterer.Cluster(point_refs);

In the above example, we used L2DistanceComputer, which is built into libpmk. You may want to substitute your own distance computer for this. To do so, simply subclass the DistanceComputer (abstract) class and implement the ComputeDistance() function, which computes the distance between two Point objects.

Once the clustering is finished, the Clusterer object can be queried to find out which points belong in which cluster. You may also choose to save the data:

string output_file = "k-means-output.cluster";

// Instead of the following lines, you may also use WriteToFile().
ofstream output_stream(output_file.c_str(),
                       ios::binary | ios::out | ios::trunc);
clusterer.WriteToStream(output_stream);
output_stream.close();

This way, the next time you want to access the data, rather than performing the clustering computation again, you can simply load it from the file:

ifstream input_stream("k-means-output.cluster", ios::binary);
clusterer.ReadFromStream(input_stream); // instead of clusterer.Cluster()
input_stream.close();

This scheme works for HierarchicalClusterer as well (which internally uses KMeansClusterer to perform each level of clustering). The following code fragment runs hierarchical clustering on a PointSetList and saves the output to output_file.

vector<PointRef> point_refs;
psl.GetPointRefs(&point_refs);

HierarchicalClusterer clusterer;
L2DistanceComputer dist_computer;
clusterer.Cluster(num_levels, branch_factor, point_refs, dist_computer);

ofstream output_stream(output_file.c_str(),
                       ios::binary | ios::out | ios::trunc);
clusterer.WriteToStream(output_stream);
output_stream.close();

3.3. Creating pyramids

libpmk provides three different methods of turning PointSets into pyramids (MultiResolutionHistograms). The three methods all share a common interface for creating new pyramids from PointSets, but they are prepared in different ways. For the following examples, suppose we have loaded a PointSetList called psl.
  1. Uniform bin sizes
    UniformPyramidMaker upm;
    upm.Preprocess(psl, finest_side_length, side_length_factor,
                   discretize_order, true, true);
    

    UniformPyramidMaker requires a number of parameters. finest_side_length is the length of one side of the smallest bin. side_length_factor is the factor by which the side length of a bin increases from one level to the next. The discretize_order parameter is particularly important because it can greatly affect the shape of the resulting pyramid: it multiplies all of the feature values by 10^discretize_order. We typically set it so that the resulting features are on the order of 10^2. The final two boolean parameters specify whether to use random shifts, and whether to apply the same shifts to each level, respectively.

    Just as with Clusterer, it is possible to avoid redundant computations: UniformPyramidMaker supports ReadFromStream() and WriteToStream() functions, and you may call ReadFromStream() on already-preprocessed data in lieu of calling Preprocess().

  2. Vocabulary-guided: global weights
    HierarchicalClusterer clusterer;
    L2DistanceComputer dist;
    clusterer.Cluster(...);
    
    GlobalVGPyramidMaker vgpm(clusterer, dist);
    vgpm.Preprocess(psl);
    

    The vocabulary-guided pyramids require data from HierarchicalClusterer. The above code will cluster a PointSetList and initialize a PyramidMaker with global weights. Again, since GlobalVGPyramidMaker has a ReadFromStream and WriteToStream function, you can save the data to a file and read it later to avoid doing redundant computations.

  3. Vocabulary-guided: input-specific weights
    HierarchicalClusterer clusterer;
    L2DistanceComputer dist;
    clusterer.Cluster(...);
    InputSpecificVGPyramidMaker vgpm(clusterer, dist);
    
    With input-specific weights, no preprocessing is needed, since the weights for each bin depend on the PointSet we are converting.

Once we have set up a PyramidMaker, we can start creating pyramids from PointSets:

MultiResolutionHistogram* pyramid = pyramid_maker.MakePyramid(point_set);

Or, given a PointSetList, we can create them all at once:

vector<MultiResolutionHistogram*> vec = pyramid_maker.MakePyramids(psl);

N.B.: PyramidMaker allocates memory for these pyramids and returns pointers to them. It is the caller's responsibility to free that memory!

3.4. Matching pyramids

The pyramid matching framework is simple to use. Suppose we have two MultiResolutionHistogram objects mrh1 and mrh2 (read from disk, computed by PyramidMaker, or by any other means):

double value = PyramidMatcher::GetPyramidMatchCost(mrh1, mrh2, BIN_WEIGHT_GLOBAL);

The third argument should reflect the method used to create the pyramids: use BIN_WEIGHT_INPUT_SPECIFIC for input-specific weights and BIN_WEIGHT_GLOBAL for global or uniform pyramids. GetPyramidMatchCost() takes the weight of a bin to be its size.

To compute the similarity instead, the weight of a bin is taken to be the reciprocal of its size. This is also built in:

double value = PyramidMatcher::GetPyramidMatchSimilarity(mrh1, mrh2, BIN_WEIGHT_GLOBAL);

4. Useful Extra Features

4.1. Kernel matrices

libpmk_util provides a simple KernelMatrix class, which is a resizable symmetric square matrix (KernelMatrix enforces the symmetry). Elements are accessed with the at() method:

km.at(i, j) = 3;
cout << km.at(j, i) << endl; // will output 3

KernelMatrix provides two normalization options. The first, Normalize(), will divide the element k[r][c] by sqrt(k[r][r] * k[c][c]).

The second normalization option, NormalizeByMinCardinality(), takes information about a PointSetList (namely, the number of points in each PointSet), and divides k[r][c] by the minimum of the size of the rth point set and the size of the cth point set. PointSetList provides an easy way of getting the set cardinalities:

km.NormalizeByMinCardinality(psl.GetCardinalities());

4.2. Selectors and Experiments

To facilitate running experiments, you can make use of the ExampleSelector and Experiment classes. ExampleSelector is an abstract class which chooses training and testing examples. Experiment trains a model and makes predictions on the test examples. The actual implementations of ExampleSelector and Experiment will vary according to the experiment, but these classes provide a general framework in which to run them. In the following example, we will use RandomSelector, which chooses a random sample of the data to use as training examples, and SVMExperiment, which wraps around LIBSVM.

Suppose we have a database of n labeled images. Let labels be a vector of ints of size n, where each element in the vector is the label of that image. We use RandomSelector as follows:

RandomSelector selector(labels, num_training_examples_per_class);
vector<LabeledIndex> train_examples(selector.GetTrainingExamples());
vector<LabeledIndex> test_examples(selector.GetTestingExamples());

If your needs are different, it is easy to create your own subclass of ExampleSelector. The only function you need to implement is SelectExamples(); see the class reference for details on how to do this. It is also instructive to look at the implementation of RandomSelector.

Once the training and test examples are selected, we can feed them into SVMExperiment as follows:

SVMExperiment experiment(train_examples, test_examples, kernel, C);
experiment.Train();
int num_correct = experiment.Test();

Test() returns the total number of test examples that were correctly classified. For more detailed information, such as what each individual prediction was, we can query the SVMExperiment object.

4.3. Tree structures

4.3.1 Sparse trees

libpmk provides a SparseTree class template, which encapsulates a linked-list sparse tree representation. The tree is set up so that you can quickly traverse parent/child and sibling relationships. A SparseTree provides two ways of accessing its data: (1) by giving it a path down the tree (specified by a LargeIndex, which is a list of integers), or (2) by starting at the root node (or any node you have a pointer to) and traversing the parent, child, or sibling pointers. SparseTree automatically manages the parent/child/sibling pointers.

The nodes of sparse trees are SparseTreeNode objects. Each SparseTreeNode inherently stores some bookkeeping data (a LargeIndex specifying a path, as well as parent/child/sibling pointers). You extend the SparseTreeNode class to augment it with whatever data you wish to store in the tree. There are three functions you need to implement: (1) serializing your data, (2) deserializing your data, and (3) combining data. The first two are needed so that a SparseTree can write its contents to disk. Combining is needed when two nodes collide: for instance, suppose there is a node in the tree with index X, and we insert a new node which also has index X. SparseTree will merge them into one node, and it is up to you to tell it how.

N.B.: When you insert a node into a SparseTree, the tree makes a copy of it. Be aware of this, especially if your SparseTreeNode contains pointers or other complex data; you need to implement a copy constructor to handle this case. Also remember that SparseTree needs to be able to serialize itself, so if your data is so complex that a copy constructor is hard to write, it will probably be just as hard to serialize.

The implementation of Bin gives a straightforward example of how to extend and implement SparseTreeNode. A MultiResolutionHistogram is implemented as SparseTree<Bin>.

If you have build problems, remember that since SparseTree is a class template, you will run into linker errors when using your own SparseTreeNodes unless you explicitly instantiate SparseTree<YourTreeNode>. To deal with this, you will need to include the SparseTree source file:

#include "tree/sparse-tree.cc"   // note: sparse-tree.cc, not .h

And in your code, before you start using the tree:

template class SparseTree<YourTreeNode>;

You can use built-in iterators to traverse the tree: PreorderIterator, PostorderIterator, and BreadthFirstIterator. The following code gives an example implementation of a tree with an integer at each node and shows how to use a PreorderIterator to traverse it. Note that these iterators are not (yet) compatible with the STL iterator base class, but their interfaces look the same.

#include "tree/sparse-tree-node.h"
#include "tree/sparse-tree.cc"

class MyNode : public SparseTreeNode {
 public:
   MyNode() : SparseTreeNode(), data_(0) {}
   MyNode(const LargeIndex& index) : SparseTreeNode(index), data_(0) {}
   MyNode(int data, const LargeIndex& index) : SparseTreeNode(index), data_(data) {}
   virtual ~MyNode() {}

   // Combining two nodes will just mean adding the numbers.
   virtual void Combine(const SparseTreeNode& other) {
      data_ += (static_cast<const MyNode&>(other)).data_;
   }
   int GetData() const {
      return data_;
   }

 protected:
   virtual void ReadData(istream& input_stream) {
      input_stream.read((char *)&data_, sizeof(int));
   }

   virtual void WriteData(ostream& output_stream) const {
      output_stream.write((char *)&data_, sizeof(int));
   }

 private:
   int data_;
};


template class SparseTree<MyNode>;

// Manually creates a tree that looks like:
/*    5
     / \
    3   2
   /\
  1  9       */
void Test() {
   LargeIndex index;
   MyNode five(5, index);

   index.push_back(0);      // index is now [0]
   MyNode three(3, index);

   index[0] = 1;            // index is now [1]
   MyNode two(2, index);

   index[0] = 0;
   index.push_back(0);      // index is now [0 0]
   MyNode one(1, index);

   index[1] = 1;            // index is now [0 1]
   MyNode nine(9, index);

   SparseTree<MyNode> tree1;
   tree1.add_node(five);  tree1.add_node(three);  tree1.add_node(two);
   tree1.add_node(one);   tree1.add_node(nine);

   // Print out the nodes, depth-first:
   SparseTree<MyNode>::PreorderIterator iter = tree1.BeginPreorder();
   while (iter != tree1.EndPreorder()) {
      printf("%d\n", iter->GetData());
      ++iter;
   }
}
Running Test() in this example will output the following:
5
3
1
9
2

4.3.2 Dense trees

libpmk-2.0 introduces dense trees, encapsulated by the Tree class template. As with SparseTree, a Tree contains nodes (TreeNode objects) and is implemented in much the same way. The same warnings about the copy constructor and the use of Combine() apply here. Dense trees also support the same iterators and have very similar interfaces.

The primary difference is that nodes no longer need to be identified by a LargeIndex specifying the path down a tree; each node can now be uniquely identified with a single integer ID. These IDs are assigned by Tree automatically. The tree can, in O(1) time, give a pointer to a TreeNode given an ID; that TreeNode can then be asked for the IDs of its parent and children. See the reference page for details.


5. Frequently Asked Questions

5.1. How can I improve performance with uniform bins?

5.2. What is the difference between UniformPyramidMaker and VGPyramidMaker? When should I use which?

Using uniform bins has the advantage of being faster (no clustering is needed), but the results are accurate only for low-dimensional features (up to about 10 dimensions). For higher-dimensional data, it is better to use vocabulary-guided bins.

5.3. Why is UniformPyramidMaker slow compared to the previously reported speeds?

The implementation of UniformPyramidMaker is not exactly the same as the one presented by Grauman and Darrell. For flexibility, the uniform-bin and vocabulary-guided pyramids are compatible with one another: the code that actually computes the pyramid match score is independent of the process that generated the pyramids (whether uniform-bin or vocabulary-guided). To achieve this, an explicit tree structure is created even for uniform-bin pyramids, although one is not necessary for a uniform-bin match computation, so libpmk's implementation runs slower than the reported times (this may be addressed in future releases of libpmk).

5.4. Does this work on Windows? What about Cygwin?

libpmk has only been tested on Linux, so there are no guarantees that it will work (or even compile) on Windows, although for the most part the libpmk code uses standard C++ libraries. There have been reports of getting it to work on Cygwin, however, after two hacks.

5.5. I am having trouble generating my own PointSetList files.

libpmk will include feature extraction code in a future release. Until then, you will need to generate your own .psl files for your data sets. The easiest way to do so is to instantiate a MutablePointSetList and add data to it using AddPointSet(). Important note: when you create new Point objects, everything, including the weight, is initialized to 0. You must call Point::SetWeight(), or the feature vector will not count toward the match (it will have 0 weight). For details, please see the Point class reference.