[ back to dt ]

feature extraction

rubine's feature set

As described earlier, the classifiers evaluated in this project used Rubine's 13 features to construct feature vectors. See Figure 6 of "Specifying Gestures by Example" for an illustration of the features described below.

dt feature extractors and corresponding rubine features
dt extractor   rubine feature(s)   description
distfv         f8                  Total Euclidean distance traversed by a stroke
bboxfv2        f3                  Dimensions of the bounding box of a stroke
f4fv           f4                  Aspect ratio of the bounding box of a stroke
startendfv     f5                  Distance between the starting and ending points of a stroke
f1fv           f1, f2, f6, f7      Sine and cosine of the start and end angles of a stroke
f9fv           f9                  Sum of angles traversed by a stroke
f10fv          f10                 Sum of absolute angles traversed by a stroke ("curviness")
f11fv          f11                 Sum of squared angles traversed by a stroke ("jaggedness")
f12fv          f12                 Maximum instantaneous velocity within a stroke
f13fv          f13                 Total time duration of a stroke
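As a concrete illustration, a few of the scalar features in the table can be computed directly from a stroke's sampled points. This is a minimal sketch, not dt's actual extractor code; it assumes a stroke is represented as a list of (x, y, t) samples:

```python
import math

def rubine_features(stroke):
    """Compute a few of Rubine's scalar features for one stroke.

    `stroke` is a list of (x, y, t) samples -- an assumed
    representation, not necessarily the one dt uses internally.
    """
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    # Bounding box dimensions (basis for the f3/f4-style features)
    bbox = (max(xs) - min(xs), max(ys) - min(ys))
    # f5: distance between the starting and ending points
    f5 = math.hypot(xs[-1] - xs[0], ys[-1] - ys[0])
    # f8: total Euclidean distance traversed along the path
    f8 = sum(math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
             for i in range(len(stroke) - 1))
    # f13: total time duration of the stroke
    f13 = stroke[-1][2] - stroke[0][2]
    return {"f5": f5, "f8": f8, "f13": f13, "bbox": bbox}
```

The angular features (f9 through f11) follow the same pattern, accumulating the turning angle between successive point-to-point segments instead of segment lengths.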

generalizing to multistroke gestures

Each of the Rubine extractors above operated on a stroke-by-stroke basis, summarizing a geometric or velocity property of the stroke in a single scalar value. To build a feature vector out of these features for a unistroke classifier, one can choose an arbitrary but fixed ordering for these scalar values and assemble them into a single feature vector. Generalizing this to multistroke gestures, however, is more complicated.
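The unistroke case can be sketched as follows. The feature names and ordering here are assumptions for illustration, not dt's actual identifiers; all that matters is that the ordering is fixed across examples:

```python
# Arbitrary but fixed ordering of the per-stroke scalar features.
# The names are hypothetical labels for Rubine's 13 features.
FEATURE_ORDER = ["f1", "f2", "f3", "f4", "f5", "f6", "f7",
                 "f8", "f9", "f10", "f11", "f12", "f13"]

def unistroke_vector(features):
    """Assemble a dict of named scalar features into a feature vector.

    Because FEATURE_ORDER is fixed, dimension i of the vector means
    the same thing for every example, which is what distance-based
    and linear classifiers require.
    """
    return [features[name] for name in FEATURE_ORDER]
```

For example, `unistroke_vector({"f1": 0.5, "f2": 0.87, ...})` always places f1 in dimension 0 and f13 in dimension 12, regardless of the order in which the extractors ran.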

feature-aligned sparse feature vectors

For techniques that require fixed-size feature vectors, such as k-nearest neighbors, Fisher linear discriminants, neural nets, and support vector machines (as opposed to sequential models that operate on a stroke-by-stroke basis, such as Markov chains), the approach taken in this project is to first determine the size of the feature vector needed to characterize all the strokes of the longest example across all the classes. For gesture examples with that many strokes, features are populated for the strokes in the same order that the strokes were performed. For gesture examples with fewer strokes, the procedure is the same, except that the remaining dimensions are left empty, so that each dimension of the vector corresponds to the same feature of the same stroke position in every example.
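A minimal sketch of this alignment step, assuming each example has already been reduced to a list of equal-length per-stroke feature vectors (the function name and the zero padding value are assumptions for illustration):

```python
def aligned_vectors(examples, n_features, pad=0.0):
    """Build feature-aligned, fixed-size vectors for multistroke gestures.

    `examples` is a list of gesture examples, each a list of per-stroke
    feature vectors of length `n_features`, in performance order.
    Dimension k * n_features + j of every output vector holds feature j
    of stroke k; examples with fewer strokes are padded, which is what
    makes the resulting vectors sparse.
    """
    max_strokes = max(len(ex) for ex in examples)
    out = []
    for ex in examples:
        vec = []
        for stroke_feats in ex:
            vec.extend(stroke_feats)
        # Pad missing trailing strokes so every vector has equal length.
        vec.extend([pad] * (n_features * (max_strokes - len(ex))))
        out.append(vec)
    return out
```

For instance, with 2 features per stroke, a one-stroke example and a two-stroke example yield vectors of length 4, with the one-stroke example padded in its last two dimensions.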

A potential drawback of this approach is that if comparatively few examples have many strokes, the feature vectors of the shorter examples will turn out sparse. As the dimensionality increases, the space also becomes increasingly sparse relative to the number of examples. We hoped to determine whether this would present problems for scalability or for distinguishing classes with any of the classifiers. In practice, however, as described in results, most of the classifiers were still able to distinguish among classes well, despite this disadvantage.

[ back to dt ]