|This image shows a scene from a simulated environment, where the agent has just moved one step forward from its initial starting position.|
This second image, superimposed on the first, shows the displacement caused by the motion of the agent for a set of selected points (a dense grid of evenly spaced pixels) in the image plane. The flow vectors (also called "needles") are color coded so that their special cases can be visualized easily; the meaning of the colors is described after the algorithm is explained.
The method we use to obtain the optical flow field is as follows.
First, a Gaussian filter is applied to the raw input images. This is a low-pass filter and has a blurring (smoothing) effect on the image. Then a Laplacian filter is applied to obtain second-derivative information from the images. In theory both of these filtering operations are 2-D convolutions, but in practice we implement them as two 1-D convolutions and one 2-D convolution. The effect of applying a 2-D Gaussian filter (an NxN square matrix) is obtained in two 1-D steps, which reduces the per-pixel cost from N² operations to 2N, and the Laplacian is then applied as usual. The combined effect of these two filters is referred to as a LoG (Laplacian of Gaussian) filter. The result of this operation is the detection of the edges in the images: after the LoG filter is applied to an image, the zero crossings of the intensity values mark the positions of the edges, so it suffices to look at the sign changes to detect them. Using this sign information, we produce binary sign-of-Laplacian-of-Gaussian (SLOG) images. The following figure shows two successive frames from an image sequence, the effects of the filters, and the binary SLOG images obtained at the end of the described process.
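The filtering steps above can be sketched as follows. This is only a minimal illustration of a separable Gaussian blur, a 3x3 Laplacian, and the resulting sign map; the kernel sizes and the use of plain Python lists are our own illustrative choices, not the original implementation.

```python
# Sketch of the SLOG pipeline: separable Gaussian blur (two 1-D passes),
# a 3x3 Laplacian applied as a 2-D convolution, then the sign map.

def convolve1d(row, kernel):
    """1-D convolution with zero padding (output has the same length)."""
    k = len(kernel) // 2
    out = []
    for i in range(len(row)):
        s = 0.0
        for j, w in enumerate(kernel):
            idx = i + j - k
            if 0 <= idx < len(row):
                s += w * row[idx]
        out.append(s)
    return out

def gaussian_blur(img, kernel=(0.25, 0.5, 0.25)):
    """Separable Gaussian: one horizontal pass, then one vertical pass.
    Per pixel this costs 2N multiplies instead of N*N for a 2-D kernel."""
    rows = [convolve1d(r, kernel) for r in img]
    cols = [convolve1d(c, kernel) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back

def laplacian(img):
    """3x3 Laplacian (second derivative), borders left at zero."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (img[y-1][x] + img[y+1][x] + img[y][x-1]
                         + img[y][x+1] - 4 * img[y][x])
    return out

def slog(img):
    """Binary sign-of-LoG image: 1 where LoG >= 0, else 0."""
    log = laplacian(gaussian_blur(img))
    return [[1 if v >= 0 else 0 for v in row] for row in log]
```

An edge in the input produces a transition between 1 and 0 regions in the SLOG image, which is exactly the sign change that marks a zero crossing.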
Once we have two successive binary SLOG images, we find the displacements of features with a procedure called patch matching. For each needle of the flow field, a patch centered at the origin of that needle is taken from the first image. This patch is then compared with all patches of the same size whose centers lie within a search area in the second image. The search area is a rectangle whose center has the same coordinates as the needle's origin, and whose dimensions can be adjusted on the fly. If the number of matching pixels in two patches is above some (percentage) threshold, the two patches are considered to match. The vector from the needle's origin to the center of the best-matching patch is the displacement we were trying to find.
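The patch-matching step can be sketched as below. The patch half-width, search radius, and threshold values here are hypothetical defaults; as noted above, the real program adjusts the search area on the fly.

```python
# Sketch of patch matching on binary SLOG images (assumed parameter values).

def match_score(a, b, ay, ax, by, bx, half):
    """Fraction of equal pixels between two (2*half+1)^2 patches."""
    same = total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += 1
            if a[ay + dy][ax + dx] == b[by + dy][bx + dx]:
                same += 1
    return same / total

def find_needle(img1, img2, y, x, half=2, search=3, threshold=0.8):
    """Return the displacement (dy, dx) of the best-matching patch in img2
    for the patch centered at (y, x) in img1, or None for a non-match."""
    best, best_disp = threshold, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            by, bx = y + dy, x + dx
            # keep the candidate patch fully inside img2
            if half <= by < len(img2) - half and half <= bx < len(img2[0]) - half:
                score = match_score(img1, img2, y, x, by, bx, half)
                if score > best:
                    best, best_disp = score, (dy, dx)
    return best_disp
```

If no candidate patch exceeds the threshold, the function returns None, which corresponds to the non-match special case recorded by our program.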
There are also some special cases that may arise during the matching process, and we keep a record of them for possible future use. These cases, and the color codes used to visualize them in the images, are the following:
Note that almost all of the needles on the ground are non-match cases. This is because the texture of the ground is too fine-grained and is lost in the Gaussian filtering.
If the agent wants to avoid obstacles, it should turn away from the side that shows more motion, since larger flow indicates a possible approach to a stationary obstacle. An example of this case is shown in the figure below:
|In this case, the sum of the magnitudes of the flow vectors on the left side is greater than the sum on the right side, so the agent should turn right to avoid the obstacle.|
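This balance strategy can be sketched in a few lines. The needle format `(x, y, dx, dy)` and the decision function's name are our own assumptions for illustration:

```python
import math

# Sketch of the balance strategy: compare total flow magnitude on the
# left and right halves of the image and turn away from the larger side.

def balance_turn(needles, image_width):
    """needles: list of (x, y, dx, dy) flow vectors.
    Returns 'left' or 'right' (the direction to turn), or 'straight'."""
    left = sum(math.hypot(dx, dy) for x, y, dx, dy in needles
               if x < image_width / 2)
    right = sum(math.hypot(dx, dy) for x, y, dx, dy in needles
                if x >= image_width / 2)
    if left > right:
        return 'right'   # more motion on the left: likely obstacle there
    if right > left:
        return 'left'
    return 'straight'
```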
A simple modification of this strategy can be used to escape from enemies: the agent moves backward while keeping the enemy in view, as when trailing a target.
|This is a snapshot of our program in action, computing time-to-contact estimates. The image sequence it uses was prerecorded by our robot while it approached, at a speed of 8 cm/sec, a wall covered with newspapers (the newspapers provide a texture more suitable for patch matching than the white, textureless wall). There are 76 frames in this sequence.|
To reduce the effect of input noise and of the errors caused by the integer nature of the needle lengths, our program averages a large number of estimates to produce a single overall result.
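The averaging step might be sketched as below. The per-needle formula `tau = r / |v|` (distance from the focus of expansion divided by the needle length, giving a value in frames) and the choice of the image center as the focus of expansion are common textbook assumptions, not necessarily the exact method our program uses.

```python
import math

# Sketch: per-needle time-to-contact estimates, averaged to suppress noise
# and the quantization error of integer needle lengths.

def time_to_contact(needles, foe=(160, 120)):
    """needles: list of (x, y, dx, dy). Returns the averaged TTC in frames,
    or None if no needle yields a usable estimate."""
    estimates = []
    for x, y, dx, dy in needles:
        r = math.hypot(x - foe[0], y - foe[1])   # distance from the FOE
        v = math.hypot(dx, dy)                   # needle length (pixels/frame)
        if v > 0:
            estimates.append(r / v)
    if not estimates:
        return None
    return sum(estimates) / len(estimates)
```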
|This graph shows each of the time-to-contact estimates made by the above run of our program for all successive frame pairs in the input stream.|
|We use Crystal Space, a free 3-D engine developed in C++, as the basis for our simulated environments. For more information about Crystal Space, click on the logo or follow the link below:|
|Our robot, Erik, is an indoor robot (RWI - B21r) equipped with a digital 1/2" CCD color camera with digital signal processing capabilities. You can learn more about Erik by clicking on his images or by clicking the following link:|
|For more information about our camera, click on the image or follow the link below:|
|Erik finds its way between chairs in our laboratory, using the balance strategy for obstacle avoidance.||Erik runs down the corridor, again using the balance strategy. Note that this video was produced at a frame rate of 3 frames per second, so please adjust your MPEG player accordingly to see the motion at its real speed.|
|Virtual Erik wanders in a simulated environment created with the Crystal Space engine.||Another simulated environment: Virtual Erik passes through a door and follows a corridor.|