AIM-407

Annotated Production Systems: A Model for Skill Acquisition

Author[s]: Ira P. Goldstein and Eric Grimson

Date: February 1977

PS Download: ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-407.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-407.pdf

Abstract: Annotated Production Systems provide a procedural model for skill acquisition by augmenting a production model of the skill with formal commentary describing plans, bugs, and interraltionships between various productions. This commentary supports processes of efficient interpretation, self- debugging and self-improvement. The theory of annotated productions is developed by analyzing the skill of attitude instrument flying. An annotated production interpreter has been written that executes skill models which control a flight simulator. Preliminary evidence indicates that annotated productions effectively model certain bugs and certain learning behaviors characteristic of student pilots.


AIM-510

Differential Geometry, Surface Patches and Convergence Methods

Author[s]: W.E.L. Grimson

Date: February 1979

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-510.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-510.pdf

Abstract: The problem of constructing a surface from the information provided by the Marr-Poggio theory of human stereo vision is investigated. It is argued that not only does this theory provide explicit boundary conditions at certain points in the image, but that the imaging process also provides implicit conditions on all other points in the image. This argument is used to derive conditions on possible algorithms for computing the surface. Additional constraining principles are applied to the problem; specifically that the process be performable by a local-support parallel network. Some mathematical tools, differential geometry, Coons surface patches and iterative methods of convergence, relevant to the problem of constructing the surface are outlined. Specific methods for actually computing the surface are examined.


AIM-565

A Computer Implementation of a Theory of Human Stereo Vision

Author[s]: W.E.L. Grimson

Date: January 1980

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-565.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-565.pdf

Abstract: Recently, Marr and Poggio (1979) presented a theory of human stereo vision. An implementation of that theory is presented and consists of five steps: (1) The left and right images are each filtered with masks of four sizes that increase with eccentricity; the shape of these masks is given by $ abla^{2}G$, the laplacian of a gaussian function. (2) Zero-crossing in the filtered images are found along horizontal scan lines. (3) For each mask size, matching takes place between zero-crossings of the same sign and roughly the same orientation in the two images, for a range of disparities up to about the width of the mask's central region. Within this disparity range, Marr and Poggio showed that false targets pose only a simple problem. (4) The output of the wide masks can control vergence movements, thus causing small masks to come into low resolution to dealing with small disparities at a high resolution. (5) When a correspondence is achieved, it is stored in a dynamic buffer, called the 2 1/2 dimensional sketch. To support the sufficiency of the Marr- Poggio model of human stereo vision, the implementation was tested on a wide range of stereograms from the human stereopsis literature. The performance of the implementation is illustrated and compared with human perception. As well, statistical assumptions made by Marr and Poggio are supported by comparison with statistics found in practice. Finally, the process of implementing the theory has led to the clarification and refinement of a number of details within the theory; these are discussed in detail.


AIM-613

A Computational Theory of Visual Surface Interpolation

Author[s]: W.E.L. Grimson

Date: June 1981

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-613.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-613.pdf

Abstract: Computational theories of structure from motion [Ulman, 1979] and stereo vision [Marr and Poggio, 1979] only specify the computation of three-dimensional surface information at special points in the image. Yet, the visual perception is clearly of complete surfaces. In order to account for this, a computational theory of the interpolation of surfaces from visual information is presented.


AIM-663

The Implicit Constraints of the Primal Sketch

Author[s]: W.E.L Grimson

Date: October 1981

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-663.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-663.pdf

Abstract: Computational theories of structure-from- motion and stereo vision only specify the computation of three-dimensional surface information at points in the image at which the irradiance changes. Yet, the visual perception is clearly of complete surfaces, and this perception is consistent for different observers. Since mathematically the class of surfaces which could pass through the known boundary points provided by the stereo system is infinite and contains widely varying surfaces, the visual system must incorporate some additional constraints besides the known points in order to compute the complete surface. Using the image irradiance equation, we derive the surface consistency constraint, informally referred to as no news is good news. The constraint implies that the surface must agree with the information from stereo or motion correspondence, and not vary radically between these points. An explicit form of this surface consistency constraint is derived, by relating the probability of a zero- crossing in a region of the image to the variation in the local surface orientation of the surface, provided that the surface albedo and the illumination are roughly constant. The surface consistency constraint can be used to derive an algorithm for reconstructing the surface that “best” fits the surface information provided by stereo or motion correspondence.


AIM-666

The Perception of Subjective Surfaces

Author[s]: Michael Brady and W. Eric L. Grimson

Date: November 1981

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-666.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-665.pdf

Abstract: It is proposed that subjective contours are an artifact of the perception of natural three- dimensional surfaces. A recent theory of surface interpolation implies that “subjective surfaces” are constructed in the visual system by interpolation between three-dimensional values arising from interpretation of a variety of surface cues. We show that subjective surfaces can take any form, including singly and doubly curved surfaces, as well as the commonly discussed fronto-parallel planes. In addition, it is necessary in the context of computational vision to make explicit the discontinuities, both in depth and in surface orientation, in the surfaces constructed by interpolation. It is proposed that subjective surfaces and subjective contours are demonstrated. The role played by figure completion and enhanced brightness contrast in the determination of subjective surfaces is discussed. All considerations of surface perception apply equally to subjective surfaces.


AIM-697

Binocular Shading and Visual Surface Reconstruction

Author[s]: W.E.L. Grimson

Date: August 1982

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-697.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-697.pdf

Abstract: Zero-crossing or feature-point based stereo algorithms can, by definition, determine explicit depth information only at particular points on the image. To compute a complete surface description, this sparse depth map must be interpolated. A computational theory of this interpolation or reconstruction process, based on a surface consistency constraint, has previously been proposed. In order to provide stronger boundary conditions for the interpolation process, other visual cues to surface shape are examined in this paper. In particular, it is shown that, in principle, shading information from the two views can be used to determine the orientation of the surface normal along the feature-point contours, as well as the parameters of the reflective properties of the surface material. The numerical stability of the resulting equations is also examined.


AIM-738

Model-Based Recognition and Localization from Sparse Range or Tactile Data

Author[s]: W. Eric L. Grimson and Tomas Lozano-Perez

Date: August 1983

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-738.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-738.pdf

Abstract: This paper discusses how local measurements of three-dimensional positions and surface normals (recorded by a set of tactile sensors, or by three-dimensional range sensors), may be used to identify and locate objects, from among a set of known objects. The objects are modeled as polyhedra having up to six degrees of freedom relative to the sensors. We show that inconsistent hypotheses about pairings between sensed points and object surfaces can be discarded efficiently by using local constraints on: distances between faces, angles between face normals, and angles (relative to the surface normals) of vectors between sensed points. We show by simulation and by mathematical bounds that the number of hypotheses consistent with these constraints is small. We also show how to recover the position and orientation of the object from the sense data. The algorithm’s performance on data obtained from a triangulation range sensor is illustrated.


AIM-744

Constructing a Depth Map from Images

Author[s]: Katsushi Ikeuchi

Date: August 1983

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-744.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-744.pdf

Abstract: This paper describes two methods for constructing a depth map from images. Each method has two stages. First, one or more needle maps are determined using a pair of images. This process employs either the Marr-Poggio-Grimson stereo and shape-from- shading, or, instead, photometric stereo. Secondly, a depth map is constructed from the needle map or needle maps computed by the first stage. Both methods make use of an iterative relaxation method to obtain the final depth map.


AIM-762

Computational Experiments with a Feature Based Stereo Algorithm

Author[s]: W. Eric L. Grimson

Date: January 1984

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-762.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-762.pdf

Abstract: Computational models of the human stereo system can provide insight into general information processing constraints that apply to any stereo system, either artificial or biological. In 1977, Marr and Poggio proposed one such computational model, that was characterized as matching certain feature points in difference-of-Gaussian filtered images, and using the information obtained by matching coarser resolution of representations to restrict the search space for matching finer resolution representations. An implementation of the algorithm and its testing on a range of images was reported in 1980. Since then a number psychophysical experiments have suggested possible refinements to the model and modifications to the algorithm. As well, recent computational experiments applying the algorithm to a variety of natural images, especially aerial photographs, have led to a number of modifications. In this article, we present a version of the Marr-Poggio-Grimson algorithm that embodies these modifications and illustrate its performance on a series of natural images.


AIM-763A

The Combinatorics of Local Constraints in Model-Based Recognition and Localization from Sparse Data

Author[s]: W. Eric L. Grimson

Date: March 1986

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-763a.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-763a.pdf

Abstract: The problem of recognizing what objects are where in the workspace of a robot can be cast as one of searching for a consistent matching between sensory data elements and equivalent model elements. In principle, this search space is enormous and to control the potential combinatorial explosion, constraints between the data and model elements are needed. We derive a set of constraints for sparse sensory data that are applicable to a wide variety of sensors and examine their characteristics. We then use known bounds on the complexity of constraint satisfaction problems together with explicit estimates of the effectiveness of the constraints derived for the case of sparse, noisy three-dimensional sensory data to obtain general theoretical bounds on the number of interpretations expected to be consistent with the data. We show that these bounds are consistent with empirical results reported previously. The results are used to demonstrate the graceful degradation of the recognition technique with the presence of noise in the data, and to predict the number of data points needed in general to uniquely determine the object being sensed.


AIM-841

Recognition and Localization of Overlapping Parts from Sparse Data

Author[s]: W. Eric L. Grimson and Tomas Lozano-Perez

Date: June 1985

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-841.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-841.pdf

Abstract: This paper discusses how sparse local measurements of positions and surface normals may be used to identify and locate overlapping objects. The objects are modeled as polyhedra (or polygons) having up to six degreed of positional freedom relative to the sensors. The approach operated by examining all hypotheses about pairings between sensed data and object surfaces and efficiently discarding inconsistent ones by using local constraints on: distances between faces, angles between face normals, and angles (relative to the surface normals) of vectors between sensed points. The method described here is an extension of a method for recognition and localization of non-overlapping parts previously described in [Grimson and Lozano- Perez 84] and [Gaston and Lozano-Perez 84].


AIM-855

Sensing Strategies for Disambiguating Among Multiple Objects in Known Poses

Author[s]: W. Eric L. Grimson

Date: August 1985

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-855.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-855.pdf

Abstract: The need for intelligent interaction of a robot with its environment frequently requires sensing of the environment. Further, the need for rapid execution requires that the interaction between sensing and action take place using as little sensory data as possible, while still being reliable. Previous work has developed a technique for rapidly determining the feasible poses of an object from sparse, noisy, occluded sensory data. In this paper, we examine techniques for acquiring position and surface orientation data about points on the surfaces of objects, with the intent of selecting sensory points that will force a unique interpretation of the pose of the object with as few data points as possible. Under some simple assumptions about the sensing geometry, we derive a technique for predicting optimal sensing positions. The technique has been implemented and tested. To fully specify the algorithm, we need estimates of the error in estimating the position and orientation of the object, and we derive analytic expressions for such error for the case of one particular approach to object recognition.


AIM-983

On the Recognition of Curved Objects

Author[s]: W. Eric L. Grimson

Date: July 1987

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-983.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-983.pdf

Abstract: Determining the identity and pose of occluded objects from noisy data is a critical part of a system's intelligent interaction with an unstructured environment. Previous work has shown that local measurements of the position and surface orientation of small patches of an object's surface may be used in a constrained search process to solve this problem for the case of rigid polygonal objects using two-dimensional sensory data, or rigid polyhedral objects using three-dimensional data. This note extends the recognition system to deal with the problem of recognizing and locating curved objects. The extension is done in two dimensions, and applies to the recognition of two-dimensional objects from two-dimensional data, or to the recognition of three-dimensional objects in stable positions from two- dimensional data.


AIM-985

On the Recognition of Parameterized Objects

Author[s]: W. Eric L. Grimson

Date: October 1987

PS Download: ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-985.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-985.pdf

Abstract: Determining the identity and pose of occluded objects from noisy data is a critical step in interacting intelligently with an unstructured environment. Previous work has shown that local measurements of position and surface orientation may be used in a constrained search process to solve this problem, for the case of rigid objects, either two-dimensional or three-dimensional. This paper considers the more general problem of recognizing and locating objects that can vary in parameterized ways. We consider objects with rotational, translational, or scaling degrees of freedom, and objects that undergo stretching transformations. We show that the constrained search method can be extended to handle the recognition and localization of such generalized classes of object families.


AIM-1019

The Combinatorics of Object Recognition in Cluttered Environments Using Constrained Search

Author[s]: W. Eric L. Grimson

Date: February 1988

PS Download: ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1019.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1019.pdf

Abstract: When clustering techniques such as the Hough transform are used to isolate likely subspaces of the search space, empirical performance in cluttered scenes improves considerably. In this paper we establish formal bounds on the combinatorics of this approach. Under some simple assumptions, we show that the expected complexity of recognizing isolated objects is quadratic in the number of model and sensory fragments, but that the expected complexity of recognizing objects in cluttered environments is exponential in the size of the correct interpretation. We also provide formal bounds on the efficacy of using the Hough transform to preselect likely subspaces, showing that the problem remains exponential, but that in practical terms, the size of the problem is significantly decreased.


AIM-1044

On the Sensitivity of the Hough Transform for Object Recognition

Author[s]: W. Eric L. Grimson and David Huttenlocher

Date: May 1988

PS Download: ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1044.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1044.pdf

Abstract: A common method for finding an object's pose is the generalized Hough transform, which accumulates evidence for possible coordinate transformations in a parameter space and takes large clusters of similar transformations as evidence of a correct solution. We analyze this approach by deriving theoretical bounds on the set of transformations consistent with each data- model feature pairing, and by deriving bounds on the likelihood of false peaks in the parameter space, as a function of noise, occlusion, and tessellation effects. We argue that blithely applying such methods to complex recognition tasks is a risky proposition, as the probability of false positives can be very high.


AIM-1110

On the Verification of Hypothesized Matches in Model-Based Recognition

Author[s]: W. Eric L. Grimson and Daniel P. Huttenlocher

Date: May 1989

PS Download: ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1110.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1110.pdf

Abstract: In model-based recognition, ad hoc techniques are used to decide if a match of data to model is correct. Generally an empirically determined threshold is placed on the fraction of model features that must be matched. We rigorously derive conditions under which to accept a match, relating the probability of a random match to the fraction of model features accounted for, as a function of the number of model features, number of image features and the sensor noise. We analyze some existing recognition systems and show that our method yields results comparable with experimental data.


AIM-1111

The Combinatorics of Heuristic Search Termination for Object Recognition in Cluttered Environments

Author[s]: W. Eric L. Grimson

Date: May 1989

PS Download: ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1111.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1111.pdf

Abstract: Many recognition systems use constrained search to locate objects in cluttered environments. Earlier analysis showed that the expected search is quadratic in the number of model and data features, if all the data comes from one object, but is exponential when spurious data is included. To overcome this, many methods terminate search once an interpretation that is "good enough" is found. We formally examine the combinatorics of this, showing that correct termination procedures dramatically reduce search. We provide conditions on the object model and the scene clutter such that the expected search is quartic. These results are shown to agree with empirical data for cluttered object recognition.


AIM-1226

The Effect of Indexing on the Complexity of Object Recognition

Author[s]: W. Eric L. Grimson

Date: April 1990

PS Download: ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1226.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1226.pdf

Abstract: Many current recognition systems use constrained search to locate objects in cluttered environments. Previous formal analysis has shown that the expected amount of search is quadratic in the number of model and data features, if all the data is known to come from a sinlge object, but is exponential when spurious data is included. If one can group the data into subsets likely to have come from a single object, then terminating the search once a "good enough" interpretation is found reduces the expected search to cubic. Without successful grouping, terminated search is still exponential. These results apply to finding instances of a known object in the data. In this paper, we turn to the problem of selecting models from a library, and examine the combinatorics of determining that a candidate object is not present in the data. We show that the expected search is again exponential, implying that naïve approaches to indexing are likely to carry an expensive overhead, since an exponential amount of work is needed to week out each of the incorrect models. The analytic results are shown to be in agreement with empirical data for cluttered object recognition.


AIM-1250

Affine Matching with Bounded Sensor Error: A Study of Geometric Hashing and Alignment

Author[s]: W. Eric L. Grimson, Daniel P. Huttenlocher and David W. Jacobs

Date: August 1991

PS Download: ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1250.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1250.pdf

Abstract: Affine transformations are often used in recognition systems, to approximate the effects of perspective projection. The underlying mathematics is for exact feature data, with no positional uncertainty. In practice, heuristics are added to handle uncertainty. We provide a precise analysis of affine point matching, obtaining an expression for the range of affine-invariant values consistent with bounded uncertainty. This analysis reveals that the range of affine- invariant values depends on the actual $x$- $y$-positions of the features, i.e. with uncertainty, affine representations are not invariant with respect to the Cartesian coordinate system. We analyze the effect of this on geometric hashing and alignment recognition methods.


AIM-1362

Recognizing 3D Ojbects of 2D Images: An Error Analysis

Author[s]: W. Eric Grimson, Daniel P. Huttenlocher and T. D. Alter

Date: July 1992

PS Download: ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1362.ps.Z

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1362.pdf

Abstract: Many object recognition systems use a small number of pairings of data and model features to compute the 3D transformation from a model coordinate frame into the sensor coordinate system. With perfect image data, these systems work well. With uncertain image data, however, their performance is less clear. We examine the effects of 2D sensor uncertainty on the computation of 3D model transformations. We use this analysis to bound the uncertainty in the transformation parameters, and the uncertainty associated with transforming other model features into the image. We also examine the impact of the such transformation uncertainty on recognition methods.


AIM-1435

Why Stereo Vision is Not Always About 3D Reconstruction

Author[s]: W. Eric L. Grimson

Date: July 1993

PS Download: ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1435.ps.Z

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1435.pdf

Abstract: It is commonly assumed that the goal of stereovision is computing explicit 3D scene reconstructions. We show that very accurate camera calibration is needed to support this, and that such accurate calibration is difficult to achieve and maintain. We argue that for tasks like recognition, figure/ground separation is more important than 3D depth reconstruction, and demonstrate a stereo algorithm that supports figure/ground separation without 3D reconstruction.


AIM-1463

Object Recognition By Alignment Using Invariant Projections of Planar Surfaces

Author[s]: Kanji Nagao and W. Eric L. Grimson

Date: February 1994

PS Download: ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1463.ps.Z

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1463.pdf

Abstract: In order to recognize an object in an image, we must determine the best transformation from object model to the image. In this paper, we show that for features from coplanar surfaces which undergo linear transformations in space, there exist projections invariant to the surface motions up to rotations in the image field. To use this property, we propose a new alignment approach to object recognition based on centroid alignment of corresponding feature groups. This method uses only a single pair of 2D model and data. Experimental results show the robustness of the proposed method against perturbations of feature positions.


AIM-1523

Recognizing 3D Object Using Photometric Invariant

Author[s]: Kenji Nagao and Eric Grimson

Date: April 22, 1995

PS Download: ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1523.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1523.pdf

Abstract: In this paper we describe a new efficient algorithm for recognizing 3D objects by combining photometric and geometric invariants. Some photometric properties are derived, that are invariant to the changes of illumination and to relative object motion with respect to the camera and/or the lighting source in 3D space. We argue that conventional color constancy algorithms can not be used in the recognition of 3D objects. Further we show recognition does not require a full constancy of colors, rather, it only needs something that remains unchanged under the varying light conditions sand poses of the objects. Combining the derived color invariants and the spatial constraints on the object surfaces, we identify corresponding positions in the model and the data space coordinates, using centroid invariance of corresponding groups of feature positions. Tests are given to show the stability and efficiency of our approach to 3D object recognition.


AIM-1662

Co-dimension 2 Geodesic Active Contours for MRA Segmentation

Author[s]: Liana M. Lorigo, Olivier Faugeras, W.E.L. Grimson, Renaud Keriven, Ron Kikinis, Carl-Fredrik Westin

Date: August 11, 1999

PS Download: ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1662.ps

PDF Download: ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1662.pdf

Abstract: Automatic and semi-automatic magnetic resonance angiography (MRA)s segmentation techniques can potentially save radiologists larges amounts of time required for manual segmentation and cans facilitate further data analysis. The proposed MRAs segmentation method uses a mathematical modeling technique whichs is well-suited to the complicated curve-like structure of bloods vessels. We define the segmentation task as ans energy minimization over all 3D curves and use a level set methods to search for a solution. Ours approach is an extension of previous level set segmentations techniques to higher co-dimension.