A.I. Memos (scanned PDF, image on text) (*)

AIM-160 “Focusing”

Author[s]: Berthold K.P. Horn

Date: May 1968

Abstract: This memo describes a method of automatically focusing the new vidisector (TVC). The same method can be used for distance measuring. Included are instructions describing the use of a special LISP and the required LISP-functions. The use of the vidisectors, as well as estimates of their physical characteristics, is also included, since a collection of such data has not previously been available.

AIM-164A “The Text-Justifier TJ6”

Author[s]: R. Greenblatt, B.K.P. Horn and L.J. Krakauer

Date: June 1970

Abstract: This memo describes the TJ6 type justifying program, which can be used in the production of memos, such as this one. In addition, Appendices 1, 2, and 3 of this memo contain related information about TECO, the "Selectric" and the type 37 teletype, thus gathering most of the information needed for producing write-ups into one location. A sample of input to TJ6 is given in section IV and is in fact the very input used to produce this page of output. The output from TJ6 may be either justified text, with the right margin exactly aligned, or it may be "filled" text, as in this introduction, with the right margin only approximately aligned. The remainder of this memo will be justified.

AIM-178 “The Image Dissector ‘Eyes’”

Author[s]: B.K.P. Horn

Date: August 1969

Abstract: This is a collection of data on the construction, operation and performance of the two image dissector cameras. Some of this data is useful in deciding whether certain shortcomings are significant for a given application and, if so, how to compensate for them.

AIM-179 “The Arithmetic-Statement Pseudo-Ops: .I and .F”

Author[s]: B.K.P. Horn

Date: August 1969

Abstract: This is a feature of MIDAS which facilitates the rapid writing and debugging of programs involving much numerical calculation. The statements used are ALGOL-like and easy to interpret.

AITR 232 “Shape from Shading: A Method for Obtaining the Shape of a Smooth Opaque Object From One View”

Author[s]: Berthold K. P. Horn

Date: November 1970

Abstract: A method will be described for finding the shape of a smooth opaque object from a monocular image, given a knowledge of the surface photometry, the position of the light source and certain auxiliary information to resolve ambiguities. This method is complementary to the use of stereoscopy which relies on matching up sharp detail and will fail on smooth objects. Until now the image processing of single views has been restricted to objects which can meaningfully be considered two-dimensional or bounded by plane surfaces.

It is possible to derive a first-order non-linear partial differential equation in two unknowns relating the intensity at the image points to the shape of the objects. This equation can be solved by means of an equivalent set of five ordinary differential equations. A curve traced out by solving this set of equations for one set of starting values is called a characteristic strip. Starting one of these strips from each point on some initial curve will produce the whole solution surface. The initial curves can usually be constructed around so-called singular points.

A number of applications of this method will be discussed including one to lunar topography and one to the scanning electron microscope. In both of these cases great simplifications occur in the equations. A note on polyhedra follows and a quantitative theory of facial make-up is touched upon. An implementation of some of these ideas on the PDP-6 computer with its attached image-dissector camera at the Artificial Intelligence Laboratory will be described, and also a nose-recognition program.

Published as: Horn, B.K.P., “Determining Shape from Shading,” Chapter 4 in The Psychology of Computer Vision, Winston, P. H. (Ed.), McGraw-Hill, New York, April 1975, pp. 115–155.

AIM-285 “The Binford-Horn LINE-FINDER”

Author[s]: Berthold K.P. Horn

Date: December 1973

Abstract: This paper briefly describes the processing performed in the course of producing a line drawing from an image obtained through an image dissector camera. The edge-marking phase uses a non-linear parallel line-follower. Complicated statistical measures are not used. The line and vertex generating phases use a number of heuristics to guide the transition from edge-fragments to cleaned up line-drawing. Higher-level understanding of the blocks-world is not used. Sample line-drawings produced by the program are included.

AIM-295 “On Lightness”

Author[s]: Berthold K.P. Horn

Date: October 1973

Abstract: The intensity at a point in an image is the product of the reflectance at the corresponding object point and the intensity of illumination at that point. We are able to perceive lightness, a quantity closely correlated with reflectance. How then do we eliminate the component due to illumination from the image on our retina? The two components of image intensity differ in their spatial distribution. A method is presented here which takes advantage of this to compute lightness from image intensity in a layered, parallel fashion.

Published as: Horn, B.K.P., “Determining Lightness from an Image,” Computer Graphics and Image Processing, Vol. 3, No. 1, December 1974, pp. 277–299.
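The product model in the AIM-295 abstract can be illustrated in one dimension. The following is a minimal sketch, not the memo's layered two-dimensional method: it assumes that illumination varies slowly while reflectance changes in sharp steps, so that small differences of log intensity can be discarded. The function name `lightness_1d` and the `threshold` parameter are hypothetical.

```python
import math


def lightness_1d(intensity, threshold=0.1):
    """Estimate relative reflectance along a scan line.

    Assumption (for illustration only): illumination is smooth, so its
    contribution to neighbouring log-intensity differences is below
    `threshold`, while reflectance edges produce larger differences.
    """
    # The product reflectance * illumination becomes a sum in the log domain.
    logs = [math.log(v) for v in intensity]
    # Differences between neighbouring samples.
    diffs = [b - a for a, b in zip(logs, logs[1:])]
    # Discard small differences, attributed here to the illumination gradient.
    kept = [d if abs(d) > threshold else 0.0 for d in diffs]
    # Sum the remaining differences back up; the first sample is the reference.
    total = [0.0]
    for d in kept:
        total.append(total[-1] + d)
    # Undo the logarithm; the result is known only up to a constant factor.
    return [math.exp(v) for v in total]


# Two patches (reflectance 0.2 and 0.8) under a slowly increasing illumination.
reflectance = [0.2] * 4 + [0.8] * 4
illumination = [1.02 ** i for i in range(8)]
intensity = [r * e for r, e in zip(reflectance, illumination)]
estimate = lightness_1d(intensity)
```

The estimate is flat within each patch, and the ratio between patches is close to the true reflectance ratio; the small residual comes from the illumination change across the edge itself.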

AIM-299 “Proposal to ARPA for Research on Intelligent Automata and Micro-Automation”

Author[s]: P. Winston, B.K.P. Horn, G.J. Sussman, et al.

Date: September 1973

Abstract: The results of a decade of work in Artificial Intelligence have brought us to the threshold of a new phase of knowledge-based programming -- in which we can design computer systems that (1) react reasonably to significantly complicated situations and (2) perhaps more important for the future -- interact intelligently with their operators when they encounter limitations, bugs or insufficient information.

This proposal lays out programmes for bringing several such systems near to the point of useful application. These include: A physical "micro-automation" system for maintenance and repair of electronic circuits. A related "expert" problem-solving program for diagnosis and modification of electronic circuits. A set of advanced "Automatic Programming" techniques and systems for aid in developing and debugging large computer programs. Some Advanced Natural Language application methods and systems for use with these and other interactive projects. A series of specific "expert" problem solvers, including Chess analysis. Steps toward a new generation of more intelligent Information Retrieval and Management Assistance systems.

AIM-323 “Orienting Silicon Integrated Circuit Chips for Lead Bonding”

Author[s]: Berthold K. P. Horn

Date: January 1975

Abstract: Will computers that see and understand what they see revolutionize industry by automating the part orientation and part inspection processes? There are two obstacles: the expense of computing and our feeble understanding of images. We believe these obstacles are fast ending. To illustrate what can be done we describe a working program that visually determines the position and orientation of silicon chips used in integrated circuits.

Published as: Horn, B.K.P., “A Problem in Computer Vision: Orienting Silicon Integrated Circuit Chips for Lead Bonding,” Computer Graphics and Image Processing, Vol. 4, No. 1, September 1975, pp. 294–303.

AIM-335 “Image Intensity Understanding”

Author[s]: Berthold K.P. Horn

Date: August 1975

Abstract: Image intensities have been processed traditionally without much regard to how they arise. Typically they are used only to segment an image into regions or to find edge-fragments. Image intensities do carry a great deal of useful information about three-dimensional aspects of objects and some initial attempts are made here to exploit this. An understanding of how images are formed and what determines the amount of light reflected from a point on an object to the viewer is vital to such a development. The gradient-space, popularized by Huffman and Mackworth, is a helpful tool in this regard.

Published as: Horn, B.K.P., “Understanding Image Intensities,” Artificial Intelligence, Vol. 8, No. 2, April 1977, pp. 201–231.

AIM-365 “A Laboratory Environment for Applications Oriented Vision and Manipulation”

Author[s]: Berthold K.P. Horn and Patrick H. Winston

Date: May 1976

Abstract: This report is a brief summary guide to work done in the M.I.T. Artificial Intelligence Laboratory directed at the production of tools for productivity technology research. For detailed coverage of the work, readers should use this summary as an introduction to the reports and papers listed in the bibliography.

AIM-437 “Using Synthetic Images to Register Real Images with Surface Models”

Author[s]: Berthold K.P. Horn and Brett L. Bachman

Date: August 1977

Abstract: A number of image analysis tasks can benefit from registration of the image with a model of the surface being imaged. Automatic navigation using visible light or radar images requires exact alignment of such images with digital terrain models. In addition, automatic classification of terrain, using satellite imagery, requires such alignment to deal correctly with the effects of varying sun angle and surface slope. Even inspection techniques for certain industrial parts may be improved by this means.

Published as: Horn, B.K.P. & B.L. Bachman, “Using Synthetic Images to Register Real Images with Surface Models,” Communications of the A.C.M., Vol. 21, No. 11, November 1978, pp. 914–924.

AIM-440 “Density Reconstruction Using Arbitrary Ray Sampling Schemes”

Author[s]: Berthold K.P. Horn

Date: September 1977

Abstract: Methods for calculating the distribution of absorption densities in a cross section through an object from density integrals along rays in the plane of the cross section are well known, but are restricted to particular geometries of data collection. So-called convolutional-backprojection-summation methods, used now for parallel ray data, have recently been extended to special cases of the fan-beam reconstruction problem by the addition of pre- and post-multiplication steps. In this paper, I present a technique for deriving reconstruction algorithms for arbitrary ray-sampling schemes: the resulting algorithms entail the use of a general linear operator, but require little more computation than the convolutional methods, which represent special cases.

Published as: Horn, B.K.P., “Density Reconstruction Using Arbitrary Ray Sampling Schemes,” Proceedings of the IEEE, Vol. 66, No. 5, May 1978, pp. 551–562.

AIM-448 “Fan-beam Reconstruction Methods”

Author[s]: Berthold K. P. Horn

Date: November 1977

Abstract: In a previous paper a technique was developed for finding reconstruction algorithms for arbitrary ray-sampling schemes. The resulting algorithms use a general linear operator, the kernel of which depends on the details of the scanning geometry. Here this method is applied to the problem of reconstructing density distributions from arbitrary fan-beam data.

The general fan-beam method is then specialized to a number of scanning geometries of practical importance. Included are two cases where the kernel of the general linear operator can be factored and rewritten as a function of the difference of coordinates only and the superposition integral consequently simplifies into a convolution integral. Algorithms for these special cases of the fan-beam problem have been developed previously by others. In the general case, however, Fourier transforms and convolutions do not apply, and linear space-variant operators must be used. As a demonstration, details of a fan-beam method for data obtained with uniform ray-sampling density are developed.

Published as: Horn, B.K.P., “Fan-Beam Reconstruction Methods,” Proceedings of the IEEE, Vol. 67, No. 12, December 1979, pp. 1616–1623.

AIM-458 “Configuration Space Control”

Author[s]: Berthold K.P. Horn and Marc H. Raibert

Date: December 1977

Abstract: Complicated systems with non-linear time-varying behavior are difficult to control using classical linear feedback methods applied separately to individual degrees of freedom. At present, mechanical manipulators, for example, are limited in their rate of movement by the inability of traditional feedback systems to deal with time-varying inertia, torque coupling effects between links and Coriolis forces. Analysis of the dynamics of such systems, however, provides the basic information needed to achieve adequate control.

Published as: Raibert, M.H. & B.K.P. Horn, “Manipulator Control using the Configuration Space Method,” Industrial Robot, Vol. 4, No. 2, June 1978, pp. 69–73.

AIM-465 “LANDSAT MSS Coordinate Transformations”

Author[s]: Berthold K.P. Horn and Robert J. Woodham

Date: February 1978

Abstract: A number of image analysis tasks require the registration of a surface model with an image. In the case of satellite images, the surface model may be a map or digital terrain model in the form of surface elevations on a grid of points. We develop here an affine transformation between coordinates of Multi-Spectral Scanner (MSS) images produced by the LANDSAT satellites, and coordinates of a system lying in a plane tangent to the earth’s surface near the sub-satellite (Nadir) point.

AIM-467 “Destriping Satellite Images”

Author[s]: B.K.P. Horn and R.J. Woodham

Date: March 1978

Abstract: Before satellite images obtained with multiple image sensors can be used in image analysis, corrections must be introduced for the differences in the transfer functions of these sensors. Methods are here presented for obtaining the required information directly from the statistics of the sensor outputs. The assumption is made that the probability distribution of the scene radiance seen by each image sensor is the same. Successful destriping of LANDSAT images is demonstrated.

Published as: Horn, B.K.P. & R.J. Woodham, “Destriping Landsat M.S.S. Images using Histogram Modification,” Computer Graphics and Image Processing, Vol. 10, No. 1, May 1979, pp. 69–83.
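The equal-distribution assumption in the AIM-467 abstract can be illustrated with a small histogram-matching sketch. This is an illustration under stated assumptions rather than the memo's procedure: rows are assumed to be assigned to sensors cyclically, the pooled histogram of all pixels serves as the reference distribution, and the names `cumulative` and `destripe` are hypothetical.

```python
from bisect import bisect_left


def cumulative(values, levels):
    """Return cdf[g] = fraction of values that are <= g, for g in 0..levels-1."""
    counts = [0] * levels
    for g in values:
        counts[g] += 1
    cdf, running = [], 0
    for n in counts:
        running += n
        cdf.append(running / len(values))
    return cdf


def destripe(image, n_sensors, levels=256):
    """Remap each sensor's grey levels so its histogram matches the pooled one.

    Assumption: row r was recorded by sensor r % n_sensors, and every sensor
    sees the same distribution of scene radiance.
    """
    pooled = [g for row in image for g in row]
    ref_cdf = cumulative(pooled, levels)
    out = [list(row) for row in image]
    for k in range(n_sensors):
        rows = range(k, len(image), n_sensors)
        pixels = [g for r in rows for g in image[r]]
        if not pixels:
            continue
        cdf = cumulative(pixels, levels)
        # Level g maps to the lowest reference level with at least the same
        # cumulative frequency.
        table = [min(bisect_left(ref_cdf, cdf[g]), levels - 1)
                 for g in range(levels)]
        for r in rows:
            out[r] = [table[g] for g in image[r]]
    return out


# Two sensors viewing the same pattern, the second reading 4 levels higher.
striped = [[10, 20, 10, 20],
           [14, 24, 14, 24],
           [10, 20, 10, 20],
           [14, 24, 14, 24]]
clean = destripe(striped, n_sensors=2)
```

After remapping, rows from both sensors carry identical values for identical scene content, so the stripe pattern disappears. Only the relative response of the sensors is corrected; the absolute radiometric scale is set by the pooled reference.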

AIM-478 “Dynamics of a Three Degree of Freedom Kinematic Chain”

Author[s]: Berthold K.P. Horn, Ken-Ichi Hirokawa and Vijay Vazirani

Date: October 1977

Abstract: In order to be able to design a control system for high-speed control of mechanical manipulators, it is necessary to understand properly their dynamics. Here we present an analysis of a detailed model of a three-link device which may be viewed as either a “leg” in a locomotory system, or the first three degrees of freedom of an “arm” providing for its gross motions. The equations of motion are shown to be non-trivial, yet manageable.

AIM-490 “Determining Shape and Reflectance using Multiple Images”

Author[s]: Berthold K.P. Horn, Robert J. Woodham and William M. Silver

Date: August 1978

Abstract: Distributions of surface orientation and reflectance factor on the surface of an object can be determined from scene radiances observed by a fixed sensor under varying lighting conditions. Such techniques have potential application to the automatic inspection of industrial parts, the determination of the attitude of a rigid body in space and the analysis of images returned from planetary explorers. A comparison is made of this method with techniques based on images obtained from different viewpoints with fixed lighting.
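As an illustration of recovering orientation and reflectance factor from a fixed viewpoint under varying lighting, the sketch below treats the simplest case. It assumes a Lambertian surface and three distant point sources with known, non-coplanar unit directions, which the abstract itself does not state; under that assumption each brightness is the reflectance factor times the dot product of source direction and surface normal, giving a 3x3 linear system. The function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve a x = b for a 3x3 matrix a using Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("source directions must not be coplanar")
    x = []
    for col in range(3):
        # Replace one column of a by b and take the ratio of determinants.
        m = [[b[r] if c == col else a[r][c] for c in range(3)]
             for r in range(3)]
        x.append(det3(m) / d)
    return x


def normal_and_reflectance(sources, brightness):
    """Return (unit normal, reflectance factor) for one image point.

    Assumed model: brightness[k] = rho * dot(sources[k], normal).
    """
    g = solve3(sources, brightness)          # g = rho * normal
    rho = math.sqrt(sum(v * v for v in g))
    if rho == 0.0:
        raise ValueError("no signal at this point")
    return [v / rho for v in g], rho


# A patch facing the viewer (normal along z) with reflectance factor 0.5.
sources = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
brightness = [0.5, 0.4, 0.4]
normal, rho = normal_and_reflectance(sources, brightness)
```

The same computation would be repeated independently at every image point, which is why the approach suits table lookup or parallel implementation.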

AIM-498 “Calculating the Reflectance Map”

Author[s]: Berthold K.P. Horn and Robert W. Sjoberg

Date: October 1978

Abstract: It appears that the development of machine vision may benefit from a detailed understanding of the imaging process. The reflectance map, showing scene radiance as a function of surface gradient, has proved to be helpful in this endeavor. The reflectance map depends both on the nature of the surface layers of the objects being imaged and the distribution of light sources. Recently, a unified approach to the specification of surface reflectance in terms of both incident and reflected beam geometry has been proposed: the bidirectional reflectance-distribution function (BRDF). Here we derive the reflectance map in terms of the BRDF and the distribution of source radiance. A number of special cases of practical importance are developed in detail. The significance of this approach to the understanding of image formation is briefly indicated.

Published as: Horn, B.K.P. & R.W. Sjoberg, “Calculating the Reflectance Map,” Applied Optics, Vol. 18, No. 11, June 1979, pp. 1770–1779.

AIM-536 “SEQUINS and QUILLS: Representations for Surface Topography”

Author[s]: Berthold K.P. Horn

Date: May 1979

Abstract: The shape of a continuous surface can be represented by a collection of surface normals. These normals are like a porcupine’s quills. Equivalently, one can use the surface patches on which these normals rest. These in turn are like sequins sewn on a costume. These and other representations for information which can be obtained from images and used in the recognition and description of objects in a scene will be briefly described.

AIM-539 “An Application of the Photometric Stereo Method”

Author[s]: Katsushi Ikeuchi and Berthold K.P. Horn

Date: August 1979

Abstract: The orientation of patches on the surface of an object can be determined from multiple images taken with different illuminations, but from the same viewing position. This method, referred to as photometric stereo, can be implemented using table lookup based on numerical inversion of experimentally determined reflectance maps. Here we concentrate on objects with specularly reflecting surfaces, since these are of importance in industrial applications. Previous methods, intended for diffusely reflecting surfaces, employed point source illumination, which is quite unsuitable in this case. Instead, we use a distributed light source obtained by uneven illumination of a diffusely reflecting planar surface. Experimental results are shown to verify analytic expressions obtained for a method employing three light source distributions.

AIM-572 “Determining Optical Flow”

Author[s]: Berthold K.P. Horn and Brian G. Schunck

Date: April 1980

Abstract: Optical flow cannot be computed locally, since only one independent measurement is available from the image sequence at a point, while the flow velocity has two components. A second constraint is needed. A method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image. An iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences. The algorithm is robust in that it can handle image sequences that are quantized rather coarsely in space and time. It is also insensitive to quantization of brightness levels and additive noise. Examples are included where the assumption of smoothness is violated at singular points or along lines in the image.

Published as: Horn, B.K.P. & B.G. Schunck, “Determining Optical Flow,” Artificial Intelligence, Vol. 16, No. 1–3, August 1981, pp. 185–203.
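The iterative scheme mentioned in the AIM-572 abstract can be sketched as follows. The sketch takes the brightness derivatives `Ix`, `Iy`, `It` as given grids and combines the brightness constraint Ix*u + Iy*v + It = 0 with a smoothness term weighted by `alpha`. As a simplification, the local average uses the plain mean of the four neighbours with border replication, which is an assumption of this example; the function names are hypothetical.

```python
def local_mean(f, r, c):
    """Mean of the four neighbours of f[r][c], replicating border values."""
    rows, cols = len(f), len(f[0])
    return (f[max(r - 1, 0)][c] + f[min(r + 1, rows - 1)][c]
            + f[r][max(c - 1, 0)] + f[r][min(c + 1, cols - 1)]) / 4.0


def smooth_flow(Ix, Iy, It, alpha=1.0, iterations=60):
    """Estimate flow (u, v) from brightness derivatives by repeated updates."""
    rows, cols = len(Ix), len(Ix[0])
    u = [[0.0] * cols for _ in range(rows)]
    v = [[0.0] * cols for _ in range(rows)]
    for _ in range(iterations):
        new_u = [[0.0] * cols for _ in range(rows)]
        new_v = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                u_bar = local_mean(u, r, c)
                v_bar = local_mean(v, r, c)
                ex, ey, et = Ix[r][c], Iy[r][c], It[r][c]
                # How far the averaged flow is from satisfying the brightness
                # constraint, scaled by the smoothness weight.
                resid = (ex * u_bar + ey * v_bar + et) / (
                    alpha ** 2 + ex ** 2 + ey ** 2)
                # Move the averaged flow along the brightness gradient.
                new_u[r][c] = u_bar - ex * resid
                new_v[r][c] = v_bar - ey * resid
        u, v = new_u, new_v
    return u, v


# A brightness ramp I(x, t) = x - t, i.e. a pattern moving one pixel per
# frame in +x: Ix = 1, Iy = 0, It = -1 everywhere.
n = 4
Ix = [[1.0] * n for _ in range(n)]
Iy = [[0.0] * n for _ in range(n)]
It = [[-1.0] * n for _ in range(n)]
u, v = smooth_flow(Ix, Iy, It)
```

For this ramp the estimate approaches u = 1, v = 0 at every grid point. A larger `alpha` weights smoothness more heavily and slows the approach to the value implied by the brightness constraint.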

AIM-612 “The Curve of Least Energy”

Author[s]: B.K.P. Horn

Date: January 1981

Abstract: Here we search for the curve which has the smallest integral of the square of curvature, while passing through two given points with given orientation. This is the true shape of a spline used in lofting. In computer-aided design, curves have been sought which maximize “smoothness”. The curve discussed here is the one arising in this way from a commonly used measure of smoothness. The human visual system may use such a curve when it constructs a subjective contour.

Published as: Horn, B.K.P., “The Least Energy Curve,” A.C.M. Transactions on Mathematical Software, Vol. 9, No. 4, December 1983, pp. 441–460.

AIM-654 “Rotationally Symmetric Operators for Surface Interpolation”

Author[s]: Michael Brady and Berthold K.P. Horn

Date: November 1981

Abstract: The use of rotationally symmetric operators in vision is reviewed and conditions for rotational symmetry are derived for linear and quadratic forms in the first and second partial directional derivatives of a function f(x,y). Surface interpolation is considered to be the process of computing the most conservative solution consistent with boundary conditions. The “most conservative” solution is modeled using the calculus of variations to find the minimum function that satisfies a given performance index. To guarantee the existence of a minimum function, Grimson has recently suggested that the performance index should be a semi-norm. It is shown that all quadratic forms in the second partial derivatives of the surface satisfy this criterion. The seminorms that are, in addition, rotationally symmetric form a vector space whose basis is the square Laplacian and the quadratic variation. Whereas both seminorms give rise to the same Euler condition in the interior, the quadratic variation offers the tighter constraint at the boundary and is to be preferred for surface interpolation.

Published as: Brady J.M. & B.K.P. Horn, “Rotationally Symmetric Operators for Surface Interpolation,” Computer Vision, Graphics, and Image Processing, Vol. 22, No. 1, April 1983, pp. 70–94.

AIM-662 “Passive Navigation”

Author[s]: Anna R. Bruss and Berthold K.P. Horn

Date: November 1981

Abstract: A method is proposed for determining the motion of a body relative to a fixed environment using the changing image seen by a camera attached to the body. The optical flow in the image plane is the input, while the instantaneous rotation and translation of the body are the output. If optical flow could be determined precisely, it would only have to be known at a few places to compute the parameters of the motion. In practice, however, the measured optical flow will be somewhat inaccurate. It is therefore advantageous to consider methods which use as much of the available information as possible. We employ a least-squares approach which minimizes some measure of the discrepancy between the measured flow and that predicted from the computed motion parameters. Several different error norms are investigated. In general, our algorithm leads to a system of nonlinear equations from which the motion parameters may be computed numerically. However, in the special cases where the motion of the camera is purely translational or purely rotational, use of the appropriate norm leads to a system of equations from which these parameters can be determined in closed form.

Published as: Bruss, A.R. & B.K.P. Horn, “Passive Navigation,” Computer Vision, Graphics, and Image Processing, Vol. 21, No. 1, January 1983, pp. 3–20.

AIM-726 “Picking up an Object from a Pile of Objects”

Author[s]: Katsushi Ikeuchi, Berthold K.P. Horn, Shigemi Nagata, Tom Callahan and Oded Feingold

Date: May 1983

Abstract: This paper describes a hand-eye system we developed to perform the bin-picking task. Two basic tools are employed: the photometric stereo method and the extended Gaussian image. The photometric stereo method generates the surface normal distribution of a scene. The extended Gaussian image allows us to determine the attitude of the object based on the normal distribution. Visual analysis of an image consists of two stages. The first stage segments the image into regions and determines the target region. The photometric stereo system provides the surface normal distribution of the scene. The system segments the scene into isolated regions using the surface normal distribution rather than the brightness distribution. The second stage determines object attitude and position by comparing the surface normal distribution with the extended Gaussian image. Fingers, with LED sensor, mounted on the PUMA arm can successfully pick an object from a pile based on the information from the vision part.

AIM-740 “Extended Gaussian Images”

Author[s]: Berthold K.P. Horn

Date: July 1983

Abstract: This is a primer on extended Gaussian images. Extended Gaussian images are useful for representing the shapes of surfaces. They can be computed easily from: 1. Needle maps obtained using photometric stereo, or 2. Depth maps generated by ranging devices or stereo. Importantly, they can also be determined simply from geometric models of the objects. Extended Gaussian images can be of use in at least two of the tasks facing a machine vision system: 1. Recognition, and 2. Determining the attitude in space of an object. Here, the extended Gaussian image is defined and some of its properties discussed. An elaboration for non-convex objects is presented and several examples are shown.

Published as: Horn, B.K.P., “Extended Gaussian Images,” Proceedings of the IEEE, Vol. 72, No. 12, December 1984, pp. 1671–1686.

AIM-746 “Picking Parts out of a Bin”

Author[s]: Berthold K.P. Horn and Katsushi Ikeuchi

Date: October 1983

Abstract: One of the remaining obstacles to the widespread application of industrial robots is their inability to deal with parts that are not precisely positioned. In the case of manual assembly, components are often presented in bins. Current automated systems, on the other hand, require separate feeders which present the parts with carefully controlled position and attitude. Here we show how results in machine vision provide techniques for automatically directing a mechanical manipulator to pick one object at a time out of a pile. The attitude of the object to be picked up is determined using a histogram of the orientations of visible surface patches. Surface orientation, in turn, is determined using photometric stereo applied to multiple images. These images are taken with the same camera but differing lighting. The resulting needle map, giving the orientations of surface patches, is used to create an orientation histogram which is a discrete approximation to the extended Gaussian image. This can be matched against a synthetic orientation histogram obtained from prototypical models of the objects to be manipulated. Such models may be obtained from computer aided design (CAD) databases. The method thus requires that the shape of the objects be described, but it is not restricted to particular types of objects.

See also: Horn, B.K.P. & K. Ikeuchi, “The Mechanical Manipulation of Randomly Oriented Parts,” Scientific American, Vol. 251, No. 2, August 1984, pp. 100–111.

AIM-772 “Determining Grasp Points Using Photometric Stereo and the PRISM Binocular Stereo System”

Author[s]: Katsushi Ikeuchi, H. Keith Nishihara, Berthold K.P. Horn, Patrick Sobalvarro and Shigemi Nagata

Date: August 1984

Abstract: This paper describes a system which locates and grasps doughnut-shaped parts from a pile. The system uses photometric stereo and binocular stereo as vision input tools. Photometric stereo is used to make surface orientation measurements. With this information the camera field is segmented into isolated regions of continuous smooth surface. One of these regions is then selected as the target region. The attitude of the physical object associated with the target region is determined by histogramming surface orientations over that region and comparing with stored histograms obtained from prototypical objects. Range information, not available from photometric stereo, is obtained by the PRISM binocular stereo system. A collision-free grasp configuration and approach trajectory is computed and executed using the attitude and range data.

Published as: Ikeuchi, K., H.K. Nishihara, B.K.P. Horn, P. Sobalvarro & S. Nagata, “Determining Grasp Points using Photometric Stereo and the PRISM binocular stereo system,” Journal of Robotics Research, Vol. 5, No. 1, Spring 1986, pp. 46–65.

AIM-813 “The Variational Approach to Shape From Shading”

Author[s]: Berthold K.P. Horn

Date: March 1985

Abstract: We develop a systematic approach to the discovery of parallel iterative schemes for solving the shape-from-shading problem on a grid. A standard procedure for finding such schemes is outlined, and subsequently used to derive several new ones. The shape-from-shading problem is known to be mathematically equivalent to a non-linear first-order partial differential equation in surface elevation. To avoid the problems inherent in methods used to solve such equations, we follow previous work in reformulating the problem as one of finding a surface orientation field that minimizes the integral of the brightness error. The calculus of variations is then employed to derive the appropriate Euler equations on which iterative schemes can be based.

The problem of minimizing the integral of the brightness error term is ill posed, since it has an infinite number of solutions in terms of surface orientation fields. A previous method used a regularization technique to overcome this difficulty. An extra term was added to the integral to obtain an approximation to a solution that was as smooth as possible.

Published as: Horn, B.K.P. & M.J. Brooks, “The Variational Approach to Shape from Shading,” Computer Vision, Graphics, and Image Processing, Vol. 33, No. 2, February 1986, pp. 174–208.

AIM-820 “Shape and Source From Shading”

Author[s]: Michael J. Brooks and Berthold K.P. Horn

Date: January 1985

Abstract: Well-known methods for solving the shape-from-shading problem require knowledge of the reflectance map. Here we show how the shape-from-shading problem can be solved when the reflectance map is not available, but is known to have a given form with some unknown parameters. This happens, for example, when the surface is known to be Lambertian, but the direction to the light source is not known. We give an iterative algorithm that alternately estimates the surface shape and the light source direction. Use of the unit normal in parameterizing the reflectance map, rather than the gradient or stereographic coordinates, simplifies the analysis. Our approach also leads to an iterative scheme for computing shape from shading that adjusts the current estimates of the local normals toward or away from the direction of the light source. The amount of adjustment is proportional to the current difference between the predicted and the observed brightness. We also develop generalizations to less constrained forms of reflectance maps.

AIM-821 “Direct Passive Navigation”

Author[s]: Shahriar Negahdaripour and Berthold K.P. Horn

Date: February 1985

Abstract: In this paper, we show how to recover the motion of an observer relative to a planar surface directly from image brightness derivatives. We do not compute the optical flow as an intermediate step. We derive a set of nine non-linear equations using a least-squares formulation. A simple iterative scheme allows us to find either of two possible solutions of these equations. An initial pass over the relevant image region is used to accumulate a number of moments of the image brightness derivatives. All of the quantities used in the iteration can be efficiently computed from these totals, without the need to refer back to the image. A new, compact notation allows us to show easily that there are at most two planar solutions. Key words: Passive Navigation, Optical Flow, Structure and Motion, Least Squares, Planar Surface, Non-linear Equations, Dual Solution, Planar Motion Field Equation.

Published as: Horn, B.K.P. & S. Negahdaripour, “Direct Passive Navigation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-9, No. 1, January 1987, pp. 168–176.

AIM-939 “A Direct Method for Locating the Focus of Expansion”

Author[s]: Shahriar Negahdaripour and Berthold K.P. Horn

Date: January 1987

Abstract: We address the problem of recovering the motion of a monocular observer relative to a rigid scene. We do not make any assumptions about the shapes of the surfaces in the scene, nor do we use estimates of the optical flow or point correspondences. Instead, we exploit the spatial gradient and the time rate of change of brightness over the whole image and explicitly impose the constraint that the surface of an object in the scene must be in front of the camera for it to be imaged.

Published as: Negahdaripour, S. & B.K.P. Horn, “A Direct Method for Locating the Focus of Expansion,” Computer Vision, Graphics and Image Processing, Vol. 46, No. 3, June 1989, pp. 303–326.

AIM-994 “Relative Orientation”

Author[s]: Berthold K.P. Horn

Date: September 1987/March 1989

Abstract: Before corresponding points in images taken with two cameras can be used to recover distances to objects in a scene, one has to determine the position and orientation of one camera relative to the other. This is the classic photogrammetric problem of relative orientation, central to the interpretation of binocular stereo information. Described here is a particularly simple iterative scheme for recovering relative orientation that, unlike existing methods, does not require a good initial guess for the baseline and the rotation.

Published as: Horn, B.K.P. “Relative Orientation,” International Journal of Computer Vision Vol. 4, No. 1, pp. 59–78, January 1990.

See also: Horn, B.K.P. “Relative Orientation Revisited,” Journal of the Optical Society of America, A, Vol. 8, pp. 1630–1638, October 1991.

AIM-1071 “Parallel Networks for Machine Vision”

Author[s]: Berthold K.P. Horn

Date: December 1988

Abstract: The amount of computation required to solve many early vision problems is prodigious, and so it has long been thought that systems that operate in a reasonable amount of time will only become feasible when parallel systems become available. Such systems now exist in digital form, but most are large and expensive. These machines constitute an invaluable test-bed for the development of new algorithms, but they can probably not be scaled down rapidly in both physical size and cost, despite continued advances in semiconductor technology and machine architecture. Simple analog networks can perform interesting computations, as has been known for a long time.

We have reached the point where it is feasible to experiment with implementation of these ideas in VLSI form, particularly if we focus on networks composed of locally interconnected passive elements, linear amplifiers, and simple nonlinear components. While there have been excursions into the development of ideas in this area since the very beginnings of work on machine vision, much work remains to be done. Progress will depend on careful attention to matching of the capabilities of simple networks to the needs of early vision. Note that this is not at all intended to be anything like a review of the field, but merely a collection of some ideas that seem to be interesting.

Published as: Horn, B.K.P., “Parallel Analog Networks for Machine Vision,” in Artificial Intelligence at MIT: Expanding Frontiers, edited by Patrick H. Winston and Sarah A. Shellard, MIT Press, Vol. 2, pp. 437–471, 1990.

AIM-1105 “Height and Gradient from Shading”

Author[s]: Berthold K.P. Horn

Date: May 1989

Abstract: The method described here for recovering the shape of a surface from a shaded image can deal with complex, wrinkled surfaces. Integrability can be enforced easily because both surface height and gradient are represented. The robustness of the method stems in part from linearization of the reflectance map about the current estimate of the surface orientation at each picture cell.

The new scheme can find an exact solution of a given shape-from-shading problem even though a regularizing term is included. This is a reflection of the fact that shape-from-shading problems are not ill-posed when boundary conditions are available or when the image contains singular points.

Published as: Horn, B.K.P. “Height and Gradient from Shading,” International Journal of Computer Vision, Vol. 5, No. 1, pp. 37–75, August 1990.

AIM-1526 “Direct Object Recognition Using No Higher Than Second or Third Order Statistics of the Image”

Author[s]: Kenji Nagao and Berthold Horn

Date: December 1995

Abstract: Novel algorithms for object recognition are described that directly recover the transformations relating the image to its model. Unlike methods fitting the typical conventional framework, these new methods do not require exhaustive search for each feature correspondence in order to solve for the transformation. Yet they allow simultaneous object identification and recovery of the transformation. Given hypothesized potentially corresponding regions in the model and data (2D views) — which are from planar surfaces of the 3D objects — these methods allow direct computation of the parameters of the transformation by which the data may be generated from the model.

Published as: Nagao, K. & B.K.P. Horn (1995) “Direct Object Recognition Using Lower Order Statistics,” Proc. IROS95, International Conf. on Intelligent Robots and Systems.

AIM-1584 “Edge and Mean Based Image Compression”

Author[s]: Ujjaval Y. Desai, Marcelo M. Mizuki, Ichiro Masaki and Berthold K.P. Horn

Date: November 1996

Abstract: In this paper, we present a static image compression algorithm for very low bit rate applications. The algorithm reduces spatial redundancy present in images by extracting and encoding edge and mean information. Since the human visual system is highly sensitive to edges, an edge-based compression scheme can produce intelligible images at high compression ratios. We present good quality results for facial as well as textured, 256 x 256 color images at 0.1 to 0.3 bpp.

(*) NOTE:

The above are (mostly) derived from scanned A.I. Memo files at ftp://publications.ai.mit.edu/ai-publications/pdf
The quality of recognized text varies with the quality of the originals and the quality of the scanning method. Math is, of course, not properly recognized.

For more details on the “Published as” references, see the List of Selected Journal Publications.

Berthold K.P. Horn, bkph@ai.mit.edu