\documentclass[times, 10pt,twocolumn]{article} 
\usepackage{latex8}
\usepackage{graphicx}
\usepackage{amsmath}
\usepackage{times}

%\documentstyle[times,art10,twocolumn,latex8]{article}

%------------------------------------------------------------------------- 
% take the % away on next line to produce the final camera-ready version 
\pagestyle{empty}

%------------------------------------------------------------------------- 
\begin{document}

\title{A Recognizer for Free-Hand Graph Drawings}

\author{\small Hamdi Dibeklioglu\\
Bogazici University\\ 
Dept. of Computer Eng.\\
%Dept. of Comp. Eng. \\
Istanbul, Turkey\\
%Hamdi.Dibeklioglu@cmpe.boun.edu.tr\\
hamdimail@gmail.com \\
\and
Tevfik Metin Sezgin\\
University of Cambridge\\ 
Computer Laboratory\\
Cambridge, UK \\
Metin.Sezgin@cl.cam.ac.uk\\
\and
Ender Ozcan\\
Yeditepe University\\ 
Dept. of Computer Eng.\\
%Dept. of Comp. Eng. \\
Istanbul, Turkey\\
eozcan@cse.yeditepe.edu.tr\\
}

\maketitle
\thispagestyle{empty}

\begin{abstract}
Interactive multimedia such as computer simulations and animations received
increased attention over the years as supplementary teaching tools and have
now become integral components of most engineering and science curriculums.
We believe one way to boost the utility of such simulations and animations
is to make them easier to use. In this paper, we describe a pen-based 
interface for constructing weighted and unweighted graphs which functions 
as a front-end to a shortest path algorithm simulator that we have developed.
We compare the usability of this pen-based interface to that of a WIMP-based 
interface and a hybrid interface that uses a mixture of pen-based and 
WIMP-based input methods.  We report evaluation results on the 
usability of the three types of interfaces that we have developed. Our results
show that a purely pen-based approach to graph creation may actually suffer 
from the relatively higher misrecognition rates of the harder-to-recognize
symbols and inhibit usability. We conclude that a hybrid approach which 
combines a soft-keyboard with graph recognition outperforms interfaces that
are purely pen-based or WIMP-based.

% We report on our findings on the 
% We allow the graph structure and the weights to be entered using 
% traditional WIMP-based interfaces or through ink-recognition. 




\end{abstract}



%------------------------------------------------------------------------- 
\Section{Introduction}
Interactive multimedia such as computer simulations and animations received
increased attention over the years as supplementary teaching tools and have
now become integral components of most engineering and science curriculums.

Although there is a wealth of resources for interactive animations,
these applications usually employ traditional WIMP-based interfaces 
regardless of the appropriateness of such interfaces for the task at hand.
For example, a quick web search for shortest path graph algorithms returns
over a hundred applets/applications with WIMP-based interfaces despite the
fact that drawing a graph -- the first step of interaction for all these 
applications -- could naturally be done using a pen-based interface due to 
its diagrammatic nature. 

Pen-based interfaces are not only better suited 
for this task, but also make it possible to interact with the application in question
in a relatively standard diagrammatic language unlike their WIMP counterparts. 
For example, quick examination of the WIMP-based shortest-path animation
applications reveals that they show considerable variation on how the
user is expected to construct the graphs. For adding nodes to the graph 
some applications use buttons with images, text, or both. Some use pull-down 
menus, some use a drag-and-drop interface while others use a combination of 
choice menus and buttons. Textual labels associated with the GUI elements also
show variation. For example, the words ``node/vertex,'' and
``edge,'' ``link,'' ``arc,'' ``line'' and ``connector'' are used 
interchangeably although they conceptually refer to the same thing and have
standard graphical/visual representations. 

The above observations give us reasonable support to believe that an 
intelligent pen-based interface which interprets free-hand pen input would be 
preferred over traditional WIMP-based interfaces for constructing graphs in 
our target domain of shortest path graphs. The usefulness of such a system will
depend on many factors including the degree to which we can correctly interpret
free-hand ink, the perceived ease-of-use for the users, the efficiency of the 
input method and the overall satisfaction of the users. To answer these
questions, we have built three interfaces for constructing graphs:

\begin{enumerate}
\item{{\em a pen-based interface} where the graph structure and weights are entered
using free-hand ink}
\item{{\em a WIMP-based PowerPoint-like interface} for graph construction and a 
soft-keyboard for inputting edge weights}
\item{{\em a hybrid interface} using pen input for graph construction and a 
soft-keyboard for inputting edge weights}
\end{enumerate}

All of these three interfaces function as front-ends to a shortest path 
algorithm simulator that we have developed. 
In the rest of this paper, we describe the architecture of our application 
including a summary of each front-end interface in detail. % We also 
%report quantitative results on the accuracy of our ink-recognition methods. 
We also report evaluation results on the usability of each 
interface. Our results show that the benefits of
a purely pen-based approach to graph creation may actually be undermined
by the relatively poor recognition rates of the harder-to-recognize symbols 
and inhibit usability. We conclude with a discussion of the related and 
future work.
%There is evidence that animations are useful teaching tools in general, and 
%for algorithms in particular. There are all there applets on the web for 
%simulating graph algorithms. They all have their own interface, and their own 
%vocabulary, but we already know how to draw graphs.
%------------------------------------------------------------------------- 


\section{Application Architecture}
We have designed our application to have a clear separation between the graph 
construction module and the algorithm animation module. We use a graph 
adjacency matrix representation to interface the front-end and the 
animation modules. This makes it easy to extend the system by adding a new 
input method or a new animation module.

\SubSection{Graph Construction Front-Ends}
\subsubsection{Ink recognition method}
The ink-based graph construction method has three components that respectively
support graph structure recognition, digit recognition, and editing. The control
flow chart in Fig.~\ref{control-flow} shows how the graph structure recognition
and the editing modules fit together. The digit recognition component is
a separate module.

\begin{figure}
\centerline{
\hbox{
	\includegraphics[width=3.35in]{figures/rec_architecture2}
}}
\vspace{-.15in}
\caption{Flowchart for the ink recognition process.}
\label{control-flow}
\end{figure}

\vspace{.1in}
\noindent
{\em \bf Recognizing the graph structure}

\noindent
As seen in Fig.~\ref{control-flow}, the graph structure is constructed by
recognizing circles and lines (for nodes and edges) and points (for entering
the edge-weight editing mode). We only support single stroke objects. So we 
assume that the users draw nodes and links using single strokes, and also 
write digits in single strokes.  

\vspace{.1in}
\noindent{\em \bf Node and edge recognition}

\noindent
In our pen-based interface, nodes and edges are recognized by fitting circles
and lines to the input stroke, and choosing the interpretation that has a 
lower least squares error. 

To find a line fit, we use the first and last points in the stroke,
$(x_1,y_1)$ and $(x_n,y_n)$, to derive the line equation $ax+by+c=0$ using 
the following equation of a line with two known points:

$$(y-y_1)(x_n-x_1)=(y_n-y_1)(x-x_1)$$

To fit a circle, we find the rectangular bounding box with corner positions
$(x_{min}, y_{min})$ and $(x_{max},y_{max})$ for the stroke at hand where
$x_{min}, y_{min}, x_{max}$ and $y_{max}$ are the minimum and maximum 
values for the $x$ and $y$ positions of the points in the stroke. Then 
the equation $(x-h)^2+(y-k)^2=r^2$ gives us the circular fit to the stroke 
where $(h,k)$ is the center of the circle and $r$ is the radius and

$$h = \frac{x_{min}+x_{max}}{2},$$
$$k = \frac{y_{min}+y_{max}}{2}.$$

After the line and circle fits are computed, we compute the distance of each 
point $(p_x,p_y)$ in the stroke to the line and circle fits using the 
following error functions: 

$$Error_{line}(p_x,p_y)= \left|\frac{a p_x + b p_y + c}{\sqrt{a^2+b^2}}\right|,$$
$$Error_{circle}(p_x,p_y)= \sqrt{(h-p_x)^2+(k-p_y)^2} - r.$$

We compute the average fitting error for circle and line fits using all 
points in the stroke and classify it to have the label of the fit with the lowest
average error. Circles indicate that the user has drawn a node, and lines 
indicate an edge. If the end points of the stroke lie on previously drawn 
nodes, then those nodes become the source and destination nodes. If the user
is drawing a directed graph, the direction of the stroke also determines
the direction of the edge. If either of the stroke's end-points don't lie 
inside a node, then the stroke is ignored.

If the recognized shape is a circle that surrounds a set of nodes, then it 
is interpreted as a selection operation and passed onto the graph editing 
module. 

\vspace{.1in}
\noindent
{\em \bf Handwritten digit recognition}

\noindent
Digit recognition allows the edge weights to be specified by pen input.
We have implemented two digit recognition methods. The first one 
uses Kohonen networks \cite{kohonen}. The other recognition method is based
on the iterative closest point \cite{ICP_algorithm} and parallel sampling
algorithms, and we describe it in detail in \cite{hamdi_thesis}. The 
method based on Kohonen networks has outperformed the latter algorithm and
it was our algorithm of choice for the version of the system used in 
the evaluation. 

\begin{figure}
\centerline{
\begin{tabular}{c}
	\includegraphics[scale=.5]{figures/digit_delete} \\
	(a)\\
	\includegraphics[scale=.5]{figures/digit_ok}\\
	(b)
\end{tabular}}
\caption{Gestures for indicating deletion (a), and the completion of the
edge-weight specification.}
\label{gestures}
\end{figure}

In addition to digits, we have trained the system to recognize two simple 
gestures as shown in Fig.~\ref{gestures}. A leftward dash deletes the last 
digit that was recognized, and a checkmark signals that the user is done 
entering the current weight. 

The users can edit their graphs using the pen interface. Nodes are 
selected by drawing a closed loop around them. Once the nodes are selected,
the selection can be dragged around to move the nodes. Drawing a line 
across the selection deletes the selected nodes and any edges connected to
them. Tapping the weight of an edge with the stylus puts the system into 
edge-weight editing mode where the user can update the weight value using 
the pen interface.

\subsubsection{Hybrid Method}
The hybrid method uses the graph recognition method described above for
specifying the graph structure. We use a soft-keyboard for entering the
edge weights. Fig.~\ref{soft-keyboard} shows the layout of our soft-keyboard.
It works much like the soft-keyboard included with the Windows XP Tablet PC
version. The user taps in the keys with a stylus or another pointing device
to enter the edge weights. Editing in the hybrid method works the same way
as in the ink recognition method, except the edge weights are entered 
using the soft keyboard.

\begin{figure}
\centerline{
	\includegraphics[scale=.5]{figures/soft-keyboard}
}
\caption{The soft-keyboard interface used for entering edge weights in the
hybrid and WIMP-based interfaces.}
\label{soft-keyboard}
\end{figure}


\subsubsection{WIMP-based method}
Fig.~\ref{WIMP} shows our WIMP based interface.
The WIMP based method uses image-buttons with textual labels for constructing
the graph structure. The image-buttons form a toolbar, and the user selects
the operation that needs to be performed by clicking on the appropriate button
in the toolbar. This interface is representative of most WIMP-based tools 
such as MS PowerPoint. As it was the case for the hybrid method, the edge
weights are entered using the soft-keyboard interface. The graph editing 
operations are selected using the toolbar.

\begin{figure}
\centerline{
\includegraphics[width=3.5in]{figures/WIMP}
}
\caption{The WIMP-based interface.}
\label{WIMP}
\end{figure}

\subsection{Graph Animation Module}
All three interfaces have a "RUN" button that the user can click or tap
to launch the graph algorithm simulation module once the graph construction
step is completed. Fig.~\ref{screenshot} shows a snapshot of the 
graph animation module simulating Dijkstra's shortest path algorithm on 
a problem specified by the user. As seen in the figure, the graph is 
color coded, and there is an English annotation of the algorithm's current
status displayed in the bottom of the window. We support all combinations of
directed/undirected and weighted/unweighted graphs and have implemented 
simulation modules for Dijkstra's and Kruskall's shortest path algorithms.

\begin{figure}
\centerline{
\includegraphics[width=3.5in]{figures/screenshot}
}
\caption{The graph animation module showing the simulation of the Dijkstra's
algorithm on a problem specified by the user.}
\label{screenshot}
\end{figure}

\Section{Evaluation} To find out how each of the three interfaces compare, 
we have conducted a user study. Among the goals of the user study were 
to determine which interface would best suit users who were somewhat 
knowledgeable in graph theory. 

For the user study, we recruited 10 students who had recently taken an 
undergraduate algorithms course. One subject was left-handed, while the rest 
were right-handed. 

The study consisted of three parts. In part 1, the subjects were shown a
weighted directed graph with five nodes and 10 edges. They were asked to:
\begin{itemize}
\item{construct the graph}\vspace{-.1in}
\item{remove three edges} \vspace{-.1in}
\item{remove a node and all the edges connected to that node}\vspace{-.1in}
\item{edit the weights of three edges}\vspace{-.1in}
\item{move two of the nodes}
\end{itemize}

In part 2, the subjects were shown a weighted undirected graph with nine
nodes and 13 edges. They were asked to:
\begin{itemize}
\item{construct the graph}\vspace{-.1in}
\item{remove three nodes and all connected edges} \vspace{-.1in}
\item{move the nodes around to obtain a specified layout}\vspace{-.1in}
\end{itemize}

In part 3, we asked the subjects to draw an undirected graph depicting 
the relative positions of five cities that the subjects were familiar with.
They were then asked to add at least seven edges to their graph labeled 
by the pairwise distance between the source and destination cities to the
best of their knowledge. The goal of part 3 was to have the subjects draw
a graph from memory, as opposed to copying one that they were shown. 

The subjects were asked to complete all three parts using the pen-based,
hybrid and the WIMP-based interfaces in an arbitrary order. For each task in 
the three parts, we also measured the time spent for the task as well as
the number of recognition/input errors and the number of corrections 
the users had perform. At the end of the study, the subjects have completed 
a survey. 

We have observed that 8 of the participants drew the weighted graphs starting 
with the vertices first. After drawing all vertices, they filled in the
edges. On average, all parts were completed in an hour. Based on the 
average time spent, the hybrid method ranked first and the pen-based 
method ranked last.


\begin{figure*}
 \begin{center}
   \begin{tabular}{l||c|c|c||}
     \cline{2-4}
              & \multicolumn{3}{|c|}{Input Method}\\
     \cline{2-4}
                                                       &\scriptsize Pen-based     &\scriptsize Hybrid     &\scriptsize WIMP \\
     \hline
     \multicolumn{1}{|l||}{\small 1.	The functions I expected to complete the tasks were available}              &\scriptsize 2.40 &\scriptsize 1.60 &\scriptsize 2.40 \\
     \hline
     \multicolumn{1}{|l||}{\small 2.	The interface was intuitive} 	                         &\scriptsize2.40	 &\scriptsize1.50	 &\scriptsize 2.70 \\
     \hline
     \multicolumn{1}{|l||}{\small 3.	I was satisfied with how the interface worked} 	 &\scriptsize2.40	 &\scriptsize1.30	 &\scriptsize 2.20\\
     \hline
     \multicolumn{1}{|l||}{\small 4.	The interface was simple to use} 	         &\scriptsize 2.50	 &\scriptsize 1.30	 &\scriptsize 2.30\\
     \hline
     \multicolumn{1}{|l||}{\small 5.	I could effectively complete the tasks} 	 &\scriptsize2.60	 &\scriptsize1.90	 &\scriptsize2.60 \\
     \hline
     \multicolumn{1}{|l||}{\small 6.	I could complete the tasks quickly} 	 &\scriptsize3.10	 &\scriptsize1.60	 &\scriptsize2.70 \\
     \hline
     \multicolumn{1}{|l||}{\small 7.	I could complete the tasks efficiently} 	 &\scriptsize2.90	 &\scriptsize1.70	 &\scriptsize2.60 \\
     \hline
     \multicolumn{1}{|l||}{\small 8.	I thought the "look" of interface was pleasant} 	         &\scriptsize2.20	 &\scriptsize1.50	 &\scriptsize2.90\\
     \hline
     \multicolumn{1}{|l||}{\small 9.	Drawing the nodes and the edges was easy}	 &\scriptsize1.80	 &\scriptsize1.00	 &\scriptsize2.10\\
     \hline
     \multicolumn{1}{|l||}{\small 10.	Entering the weights was easy}	         &\scriptsize4.90	 &\scriptsize2.00	 &\scriptsize2.20\\
     \hline
     \multicolumn{1}{|l||}{\small 11.	I could easily recover from errors} 	                 &\scriptsize2.40	 &\scriptsize1.70	 &\scriptsize1.80\\
     \hline
     \multicolumn{1}{|l||}{\small 12.	I really liked using the GUI.}	                         &\scriptsize3.00	 &\scriptsize1.30	 &\scriptsize2.80\\
     \hline
     \multicolumn{1}{|l||}{\small average score}                                                      &\scriptsize2.72	 &\scriptsize1.53	 &\scriptsize2.44\\
     \hline
   \end{tabular}
 \end{center}
 \vspace{-.15in}
 \caption{Aggregated results for the exit survey given to the users. The first column lists a number of statements. The users were asked to assign scores to each interface based on how much they agreed with the statements.}
\label{results-table}
\end{figure*}

Fig.~\ref{results-table} lists the questions that the subjects answered
at the end of the study. They were asked to assign a score 1-7 for each 
question, based on how much they agreed with each statement. The scores
were defined as: 1 {strongly agree}, 2  {somewhat agree}, 
3 {slightly agree}, 4 {neural}, 5 {slightly disagree}, 
6 {disagree}, 7 {strongly disagree}. We applied the Friedman's non-parametric
test with a degree of freedom 2 to evaluate the consistency of the ratings 
for the GUIs. The results show that the changes obtained for the statements 
\#6, \#7, \#10 and \#12 were significant within a confidence 
interval of 95\%. In summary, the participants preferred the hybrid interfaces
over the other two and they felt it allowed them to complete the tasks more efficiently.
The task completion times that we have measured, which are summarized in 
Fig.~\ref{statistics-table}, agree with this statement. 
Furthermore, the users found it easier to enter the weights with 
the hybrid method. Again, the figures summarized in Fig.~\ref{statistics-table}
support this. The table also shows that the pen-based interface had
the most number of errors during drawing and weight entry. 

Finally, although the ratings for statement \#9 in Fig.~\ref{results-table} 
were not consistent across users, the average rating is better for the hybrid
and pen-based interfaces compared to the WIMP-based interface. It 
seems that with a better digit recognizer, a purely pen-based approach 
would have been preferred. 

\begin{figure*}
 \begin{center}
   \begin{tabular}{l||c|c|c|c|c|c|c|c|c||}
     \cline{2-10}
              & \multicolumn{3}{|c|}{Part 1}& \multicolumn{3}{|c|}{Part 2}& \multicolumn{3}{|c|}{Part 3}\\
     \cline{1-10}
\multicolumn{1}{|l||}{Input Method}	&{\scriptsize Pen-based}&{\scriptsize Hybrid}&{\scriptsize WIMP}&{\scriptsize Pen-based}&{\scriptsize Hybrid}&{\scriptsize WIMP}&{\scriptsize Pen-based}&{\scriptsize Hybrid}&{\scriptsize WIMP}\\
     \hline
\multicolumn{1}{|l||}{$\mu$}   		&0.8	&0.9	&0.6	&0.4	&0.7	&0.3	&0.3	&0.2	&0.5 \\
     \hline
\multicolumn{1}{|l||}{$\sigma$}   	&0.63	&0.99	&0.84	&0.70	&0.67	&0.67	&0.67	&0.42	&0.53 \\
     \hline
   \end{tabular}\\
	Statistics for the number of errors made during graph structure specification.
 \end{center}
 \begin{center}
   \begin{tabular}{l||c|c|c|c|c|c|c|c|c||}
     \cline{2-10}
              & \multicolumn{3}{|c|}{Part 1}& \multicolumn{3}{|c|}{Part 2}& \multicolumn{3}{|c|}{Part 3}\\
     \cline{1-10}
\multicolumn{1}{|l||}{Input Method}	&{\scriptsize Pen-based}&{\scriptsize Hybrid}&{\scriptsize WIMP}&{\scriptsize Pen-based}&{\scriptsize Hybrid}&{\scriptsize WIMP}&{\scriptsize Pen-based}&{\scriptsize Hybrid}&{\scriptsize WIMP}\\
     \hline
\multicolumn{1}{|l||}{$\mu$}   		&2.90	&1.00	&1.20	&3.40	&1.10	&1.00	&3.00	&0.40	&0.70 \\
     \hline
\multicolumn{1}{|l||}{$\sigma$}   	&1.29	&1.41	&1.14	&1.51	&1.10	&1.15	&0.94	&0.52	&1.25 \\
     \hline
   \end{tabular}\\
	Statistics for the number of errors made during weight specification.
 \end{center}
 \begin{center}
   \begin{tabular}{l||c|c|c|c|c|c|c|c|c||}
     \cline{2-10}
              & \multicolumn{3}{|c|}{Part 1}& \multicolumn{3}{|c|}{Part 2}& \multicolumn{3}{|c|}{Part 3}\\
     \cline{1-10}
\multicolumn{1}{|l||}{Input Method}	&{\scriptsize Pen-based}&{\scriptsize Hybrid}&{\scriptsize WIMP}&{\scriptsize Pen-based}&{\scriptsize Hybrid}&{\scriptsize WIMP}&{\scriptsize Pen-based}&{\scriptsize Hybrid}&{\scriptsize WIMP}\\
     \hline
\multicolumn{1}{|l||}{$\mu$}   		&268.4	&217.3	&252.7	&276.1	&191.4	&250.1	&230.1	&140.3	&183.1\\
     \hline
\multicolumn{1}{|l||}{$\sigma$}   	&101.71	&56.88	&54.70	&77.67	&40.37	&86.07	&69.82	&54.63	&67.23\\
     \hline
   \end{tabular}\\
	Statistics for the amount of time spent on each of the three parts in seconds.
 \end{center}
 \caption{Mean ($\mu$) and standard error ($\sigma$) for the number of errors made during graph structure specification,
weight entry and the amount of time spent for each part by the subjects.}
\label{statistics-table}
\end{figure*}



\Section{Related Work}
Previous research in interpreting free-hand pen input can be categorized as
online methods that interpret ink as it is laid on the drawing surface, and
offline systems that interpret previously recorded drawings. Our system 
uses online recognition.

SILK and DENIM \cite{sis} are amongst the earliest tools to demonstrate the 
use of intelligent processing of free-hand input in enhancing user interfaces.
Work in \cite{kara_sr,saund_pccw,heloise,szummer,zeleznik-music-notepad,viola_shilman,sezgin_cga,tracy_uml,gross} and \cite{christine_uist} describe sketch 
recognition systems for a number of domains including electrical circuit diagrams, 
UML diagrams, graphs and family trees. Among these systems, some put more
emphasis on the usability and user interface issues of their application, while 
others aim to push forward the state of the art in sketch recognition. 
From this perspective, our work is closer to the first group of work, because
our main contribution is our user study which compares three different
user interfaces of varying degrees of intelligent ink processing to achieve 
the same task. Rather than focusing on developing the best recognition techniques,
we have explored the trade-off between the added convenience of doing intelligent
ink-processing and the inconvenience created by misrecognitions. We believe we 
have identified a mixture of pen-based and WIMP-based input methods that 
significantly outperform either method alone. 

\Section{Discussion and Future Work}
We have presented a pen-based interface for constructing weighted and 
unweighted graphs which functions as a front-end to a shortest path 
algorithm simulator that we have developed. We compare the usability 
of this pen-based interface to that of a WIMP-based interface and a 
hybrid interface that uses a mixture of pen-based and WIMP-based input 
methods.  Our evaluation of the three types of interfaces has revealed
that the hybrid interface is preferred by users over only pen-based
and only WIMP-based input methods. We believe the main reason for this 
is the high error rates for digit recognition as indicated by values
in Fig.~\ref{statistics-table}. It
is very likely that a combination of the relatively low resolution 
sampling rates for pen-based input and the kinesthetic properties 
of writing on a capturing device as opposed to paper is responsible 
for the low recognition rates. This suggests that building robust 
user independent digit recognizers as a critical research problem despite
the high recognition rates obtained for scanned images of digits. 



%------------------------------------------------------------------------- 
\bibliographystyle{latex8}
\bibliography{latex8}

\end{document}
