[ back to dt ]

training set

The training set consisted of 10 examples of each of 45 passdoodle classes. For the purposes of this experiment, I chose hiragana, the basic Japanese phonetic character set. Each passdoodle comprised between one and five strokes and was captured with a time limit of 5 seconds. A custom application called CaptureApp was written in Java to listen for mouse events, capture them, and write them to the file format described below.

Data was generated manually by the author using a 17-inch 3M MicroTouch USB touchscreen with a resolution of 1280x1024 pixels. Because neither Java nor the underlying operating system is real-time, the sampling rate varied, but it averaged 85 Hz.
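
As an illustration of how an average sampling rate like this can be estimated from captured timestamps, here is a minimal Python sketch. It is not the calculation used in the project: the function name `mean_sampling_rate_hz` is hypothetical, and it assumes that the pen-up gaps between strokes should be left out, so only intervals inside a stroke are counted.

```python
def mean_sampling_rate_hz(strokes):
    """Estimate the average sampling rate in Hz.

    strokes: list of strokes, each a list of timestamps in milliseconds.
    Assumption: gaps between strokes (pen lifted) are not sampling
    intervals, so only time elapsed within each stroke is counted.
    """
    intervals = 0
    elapsed_ms = 0
    for timestamps in strokes:
        if len(timestamps) < 2:
            continue  # a single sample gives no interval to measure
        intervals += len(timestamps) - 1
        elapsed_ms += timestamps[-1] - timestamps[0]
    if elapsed_ms == 0:
        raise ValueError("not enough timing information")
    return 1000.0 * intervals / elapsed_ms


# Made-up timestamps, for illustration only: two strokes of three samples.
rate = mean_sampling_rate_hz([[0, 10, 20], [500, 510, 530]])
```

Because the timestamps are whole milliseconds from a non-realtime clock, an estimate over a short stroke can be far from the long-run average; it is more meaningful over many strokes.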

This character set had a number of properties that made it convenient to use. First, as with all Japanese character sets, the correct stroke order is well defined, which matches the assumption made at the beginning of the project that stroke order would be preserved. Second, the number, length, and complexity of the strokes vary over roughly the range that can be drawn in under five seconds. One limitation, however, is that few characters in the hiragana set have sharp edges, possibly because the set was designed to be easily written with brush and ink. My original intent was to include the modern katakana set as well, which has characters with sharper edges, but I ran short on time.

data format

[classname] [x-resolution] [y-resolution]
[stroke-number] [data-element-number] [x-coord] [y-coord] [time-in-milliseconds]

for example:

hira-chi 1280.0 977.0
0 0 350 388 1101872336430
0 1 357 388 1101872336450
0 2 360 388 1101872336450
1 0 549 207 1101872336860
1 1 549 215 1101872336870
1 2 549 220 1101872336870
1 3 548 235 1101872336880
1 4 548 237 1101872336890
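
A file in this format can be read with a few lines of code. The following Python sketch is only an illustration, not part of CaptureApp: the names `Doodle` and `parse_doodle` are hypothetical, and it assumes that each file holds exactly one example, with the header on the first non-blank line and whitespace-separated fields.

```python
from dataclasses import dataclass, field


@dataclass
class Doodle:
    classname: str
    x_res: float
    y_res: float
    # Maps stroke number -> list of (x, y, time_ms) samples in file order.
    strokes: dict = field(default_factory=dict)


def parse_doodle(text):
    """Parse one example: a header line followed by sample lines."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    classname, x_res, y_res = rows[0]
    doodle = Doodle(classname, float(x_res), float(y_res))
    for stroke, _element, x, y, time_ms in rows[1:]:
        # The data-element number is implied by list position, so it is dropped.
        samples = doodle.strokes.setdefault(int(stroke), [])
        samples.append((int(x), int(y), int(time_ms)))
    return doodle


sample = """hira-chi 1280.0 977.0
0 0 350 388 1101872336430
0 1 357 388 1101872336450
0 2 360 388 1101872336450
1 0 549 207 1101872336860
1 1 549 215 1101872336870
1 2 549 220 1101872336870
1 3 548 235 1101872336880
1 4 548 237 1101872336890
"""

doodle = parse_doodle(sample)
```

Grouping the samples by stroke number keeps the stroke order available to later processing. Note that consecutive samples can share a timestamp, as in the lines above, so code that divides by a time difference should guard against zero.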

download

[ download training set (450 examples) ] [ download test set (225 examples) ]

plots of training set
