[ back to dt ]

feature extraction

rubine's feature set

As described earlier, the classifiers evaluated in this project used Rubine's 13 features to construct feature vectors. See Figure 6 of "Specifying Gestures by Example" for an illustration of the features described below.

dt feature extractors and corresponding rubine features
dt extractor   rubine feature(s)   description
distfv         f8                  Total Euclidean distance traversed by a stroke
bboxfv2        f3                  Dimensions of the bounding box of a stroke
f4fv           f4                  Aspect ratio of the bounding box of a stroke
startendfv     f5                  Distance between the starting and ending points of a stroke
f1fv           f1, f2, f6, f7      Sine and cosine of the start and end angles of a stroke
f9fv           f9                  Sum of angles traversed by a stroke
f10fv          f10                 Sum of absolute angles traversed by a stroke ("curviness")
f11fv          f11                 Sum of squared angles traversed by a stroke ("jaggedness")
f12fv          f12                 Maximum instantaneous velocity within a stroke
f13fv          f13                 Total time duration of a stroke
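As a concrete illustration, a few of the scalar features in the table can be computed directly from a stroke's sampled points. This is a minimal sketch, not dt's actual extractor code; it assumes a stroke is represented as a list of (x, y, t) samples:

```python
import math

def rubine_features(stroke):
    """Compute a few of Rubine's scalar features for one stroke.

    `stroke` is a list of (x, y, t) samples -- an assumed
    representation, not necessarily the one dt uses internally.
    """
    xs = [p[0] for p in stroke]
    ys = [p[1] for p in stroke]
    # Bounding box dimensions (basis for the f3/f4-style features)
    bbox = (max(xs) - min(xs), max(ys) - min(ys))
    # f5: distance between the starting and ending points
    f5 = math.hypot(xs[-1] - xs[0], ys[-1] - ys[0])
    # f8: total Euclidean distance traversed along the path
    f8 = sum(math.hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i])
             for i in range(len(stroke) - 1))
    # f13: total time duration of the stroke
    f13 = stroke[-1][2] - stroke[0][2]
    return {"f5": f5, "f8": f8, "f13": f13, "bbox": bbox}
```

The angular features (f9 through f11) follow the same pattern, accumulating the turning angle between successive point-to-point segments instead of segment lengths.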

generalizing to multistroke gestures

Each of the Rubine extractors above operated on a stroke-by-stroke basis, summarizing a geometric or velocity property of the stroke in a single scalar value. To build a feature vector out of these features for a unistroke classifier, one can choose an arbitrary but fixed ordering for these scalar values and assemble them into a single feature vector. Generalizing this to multistroke gestures, however, is more complicated.
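The unistroke case can be sketched as follows. The feature names and ordering here are assumptions for illustration, not dt's actual identifiers; all that matters is that the ordering is fixed across examples:

```python
# Arbitrary but fixed ordering of the per-stroke scalar features.
# The names are hypothetical labels for Rubine's 13 features.
FEATURE_ORDER = ["f1", "f2", "f3", "f4", "f5", "f6", "f7",
                 "f8", "f9", "f10", "f11", "f12", "f13"]

def unistroke_vector(features):
    """Assemble a dict of named scalar features into a feature vector.

    Because FEATURE_ORDER is fixed, dimension i of the vector means
    the same thing for every example, which is what distance-based
    and linear classifiers require.
    """
    return [features[name] for name in FEATURE_ORDER]
```

For example, `unistroke_vector({"f1": 0.5, "f2": 0.87, ...})` always places f1 in dimension 0 and f13 in dimension 12, regardless of the order in which the extractors ran.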

feature-aligned sparse feature vectors

For techniques that require fixed-size feature vectors, such as k-nearest neighbors, Fisher linear discriminants, neural nets, and support vector machines (as opposed to sequential models that operate on a stroke-by-stroke basis, such as Markov chains), the approach taken in this project is to first determine the size of the feature vector needed to characterize all the strokes of the longest example across all the classes. For gesture examples with that many strokes, features are populated for the strokes in the same order that the strokes were performed. For gesture examples with fewer strokes, the procedure is the same, except that the remaining dimensions are left empty, so that each dimension of the vector corresponds to the same feature of the same stroke position in every example.
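A minimal sketch of this alignment step, assuming each example has already been reduced to a list of equal-length per-stroke feature vectors (the function name and the zero padding value are assumptions for illustration):

```python
def aligned_vectors(examples, n_features, pad=0.0):
    """Build feature-aligned, fixed-size vectors for multistroke gestures.

    `examples` is a list of gesture examples, each a list of per-stroke
    feature vectors of length `n_features`, in performance order.
    Dimension k * n_features + j of every output vector holds feature j
    of stroke k; examples with fewer strokes are padded, which is what
    makes the resulting vectors sparse.
    """
    max_strokes = max(len(ex) for ex in examples)
    out = []
    for ex in examples:
        vec = []
        for stroke_feats in ex:
            vec.extend(stroke_feats)
        # Pad missing trailing strokes so every vector has equal length.
        vec.extend([pad] * (n_features * (max_strokes - len(ex))))
        out.append(vec)
    return out
```

For instance, with 2 features per stroke, a one-stroke example and a two-stroke example yield vectors of length 4, with the one-stroke example padded in its last two dimensions.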

A potential drawback of this approach is that if comparatively few examples have many strokes, the feature vectors of the shorter examples will turn out sparse. As the dimensionality increases, the space also becomes increasingly sparse relative to the number of examples. We hoped to determine whether this would present problems for scalability or for distinguishing classes with any of the classifiers. In practice, however, as described in results, most of the classifiers were still able to distinguish among classes well, despite this disadvantage.

[ back to dt ]