Visual Object Instance Recognition

Ali Rahimi, Shen-Hui Lee
20 Aug 2007

In the not too distant future, every object will be instrumented with an RFID tag, and most, if not all, of computer vision will become obsolete. In the meantime, computer vision scientists can still make a living with object recognition.

Here are a few videos showcasing a real-time vision-based object recognition system we are developing at the Intel Research Seattle lablet. The system runs at about 15 frames per second and can be trained to recognize dozens of objects.

(Many thanks to Anthony LaMarca's daughter for lending me the puppets.)

The eventual goal of this project is to recognize hundreds of thousands of objects on handheld devices. At these scales, our devices will be able to recognize every object we are likely to encounter (an average person deals with only about 500 objects in their household; interesting building facades, faces of acquaintances, posters, restaurant menus, and the variety of other objects in our non-domestic environment add a few orders of magnitude). The system is trained by giving it examples of each object under various poses and lighting conditions.

Prototypes of various objects
The bulldog
The frog
Ali

The algorithm identifies objects by measuring the similarity between an incoming image and images in a large collection of training images that are labeled with the objects they represent. To compute the similarity between two images, a bipartite matching algorithm identifies descriptive patches that correspond across the images. These similarities are then passed to a classifier, which labels the incoming image. The classification engine is based on a simple trick for training support vector machines using similarity functions that are not positive definite.
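
To make the pipeline above concrete, here is a minimal sketch in Python. It is not our implementation, and every name in it is hypothetical. It assumes each image has already been reduced to a list of patch descriptors (short numeric tuples), uses a greedy one-to-one matching as a simple stand-in for the bipartite matching step, and scores similarity as the fraction of patches that found a partner. A similarity built this way is not guaranteed to be positive definite. The final step uses a nearest-neighbor rule purely as a placeholder for the classifier; it does not reproduce the support vector machine trick mentioned above.

```python
import math


def match_patches(patches_a, patches_b, max_dist=0.5):
    """Greedy one-to-one matching between two lists of patch descriptors.

    Assumption: descriptors are equal-length numeric tuples compared with
    Euclidean distance, and max_dist is an illustrative threshold.
    Greedy matching is a simplification of an optimal bipartite matching.
    """
    # Every candidate pair, closest first.
    candidates = sorted(
        (math.dist(pa, pb), i, j)
        for i, pa in enumerate(patches_a)
        for j, pb in enumerate(patches_b)
    )
    used_a, used_b, matches = set(), set(), []
    for dist, i, j in candidates:
        if dist > max_dist:
            break  # all remaining pairs are even farther apart
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            matches.append((i, j, dist))
    return matches


def image_similarity(patches_a, patches_b, max_dist=0.5):
    """Fraction of patches that correspond across the two images, in [0, 1]."""
    if not patches_a or not patches_b:
        return 0.0
    matches = match_patches(patches_a, patches_b, max_dist)
    return len(matches) / max(len(patches_a), len(patches_b))


def similarity_features(query, training_images):
    """Similarity of the query to every labeled training image."""
    return [image_similarity(query, patches) for patches, _label in training_images]


def classify(query, training_images):
    """Placeholder classifier: return the label of the most similar training image."""
    sims = similarity_features(query, training_images)
    best = max(range(len(sims)), key=sims.__getitem__)
    return training_images[best][1]


# Toy data: two training images with made-up 2-D descriptors.
training = [
    ([(0.0, 0.0), (1.0, 1.0)], "frog"),
    ([(5.0, 5.0), (6.0, 6.0)], "bulldog"),
]
query = [(0.1, 0.0), (1.0, 0.9)]
print(classify(query, training))
```

In this toy setup both query patches lie close to the "frog" descriptors and far from the "bulldog" ones, so the query is labeled as the frog. A fuller version would replace the greedy loop with an optimal assignment solver and feed the similarity vector to a trained classifier rather than taking the maximum.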