Engineering AI

Clarity of purpose

There have been several recent posts discussing high-level methodology in AI, including the degree to which we should build in human knowledge or emulate what is known about human computational mechanisms and whether our horizon should be short or long term. These are interesting conversations, but it seems to me that they would be much more constructive if people would start by saying what their objectives as AI researchers are.

Some of us are interested in research that might lead to practical results in a 2 or 10 or 50-year time frame. Some of us are interested in understanding the intelligence of humans and other natural systems, at the level of neural computation or at the level of behavior and cognition. Some of us are interested in developing mathematical and computational theories of complex "intelligent" behavior that can explain present and future natural and artificial intelligent agents. All of these enterprises are exciting and worthy of pursuit. They are different ultimate objectives that may have significant overlap in shorter-term objectives and methods, including algorithms, theory, and computational hardware. However, fundamentally different objectives are almost certain to demand fundamentally different solutions.

I think it's critical for any of us, before making assertions about the best way to proceed on any problem, ranging from the highest-level methodological choices to the most mundane hardware or algorithmic details, to state our objectives. Only in light of stated objectives can any of these conversations be constructive. So, for example, it's entirely reasonable for AI researchers interested in human cognitive processing to be unmoved by the achievements of Deep Blue or AlphaGo on the same day that those interested in engineering approaches are delighted by their success. It's fine for those trying to field an autonomous vehicle two years from now to encode their own human understanding of the driving problem into their systems at the same time that others are inventing methods for evolving systems that discover convolutional structure.

Engineering intelligence

I'll put my personal cards on the table: I would like to develop a general theory and practical methodology for designing the software aspects of intelligent robots. It's okay with me if they don't align with what we know about natural systems, but also fine if they do! I'm not concerned with developing something in the immediate term, although I have to confess to a certain impatience for some observable results in my lifetime. Many of these same considerations apply to intelligent systems that are not physically embodied or that we would not generally call robots, but my focus is on physical robots.

I think about the problem in terms of a decision-theoretic robot software factory. The factory is given a "spec" for a robot, which defines the robot's hardware capabilities for sensing and acting, some characterization of a distribution over the possible domains that the robot might be placed in, and an objective measure of the robot's behavior in that domain. For example, an assembly-line robot might be required to operate in a very narrowly specified range of conditions with performance measured in terms of efficiency and accuracy in achieving some particular assembly. A household-helper robot, on the other hand, might be required to operate in a wide variety of homes and interact with a wide variety of human families, each of which has detailed preferences and demands, and its performance might be measured entirely in terms of the satisfaction of its human owners. One extreme instance of this problem would correspond to the human intelligence setting: the hardware is similar to the human body, the distribution over domains is the broad range of situations that a human might encounter in our natural world, and the objective might be a long, happy life.

These scenarios differ dramatically in the type of solution they require, but not in the way we frame the problem of specifying and evaluating the robot's behavior. The problem of a robot intelligence engineer is, given a specification (robot hardware, domain distribution, robot behavior evaluation metric), to implement software to put in the robot so that it will perform well in expectation.
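
To make this framing concrete, here is one way the spec and the engineer's objective might be sketched in code. Everything here (RobotSpec, sample_domain, score, expected_performance) is an illustrative name invented for this post, not part of any existing system.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Illustrative placeholders: a "domain" is one possible deployment environment,
# and a "program" is a candidate piece of robot software.
Domain = Any
RobotProgram = Any

@dataclass
class RobotSpec:
    """The decision-theoretic 'spec' handed to the robot software factory."""
    hardware: Any                                    # sensing and actuation capabilities
    sample_domain: Callable[[], Domain]              # draws from the distribution over domains
    score: Callable[[RobotProgram, Domain], float]   # objective measure of behavior in a domain

def expected_performance(spec: RobotSpec, program: RobotProgram, n_samples: int = 100) -> float:
    """Monte Carlo estimate of how well a program does, in expectation over domains."""
    return sum(spec.score(program, spec.sample_domain()) for _ in range(n_samples)) / n_samples

# The engineer's problem: given `spec`, produce a `program` that makes
# expected_performance(spec, program) as large as possible.
```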

What's important about the decision-theoretic view of robot software engineering is that we no longer need to debate the best form of the robot software, at least from an input/output perspective. The program should be fast if the domain requires it; it should learn if the domain requires it; it should be able to explain its reasoning if the domain requires it.

It's also important to note that just because the engineers view a system in decision-theoretic terms, the system itself doesn't have to reason about states, actions, rewards, or probabilities. We engineers may adopt a formal framework for analysis, even if it's not obviously embodied in the artifact we are analyzing.

In a way, we have completely specified a software engineering problem: we have a spec and need to produce a piece of software. The difficulty of this engineering problem depends on the "distance" between the form of the specification and the form of the software that's necessary. Here are some examples that are useful to think about:

  1. The objective is a trajectory. The solution is a list of position commands and a robot controller that can follow them.
  2. The objective is to play a winning game of chess. The solution is a policy mapping board positions to moves.
  3. The objective is to be a good robot companion to a family. The solution is a program that gathers information about the house and family and improves its ability to please the household over time, based on its experience.
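
In code, the "distance" between these three specs and their solution forms might look roughly like this; all of the types below are hypothetical stand-ins, not a proposed design.

```python
from typing import Any, Callable, List, Protocol

Board = Any   # placeholder: a chess position
Move = Any    # placeholder: a chess move

# 1. The trajectory objective: the solution is essentially data, a list of
#    position commands handed to a tracking controller.
PositionCommand = List[float]
Trajectory = List[PositionCommand]

# 2. The chess objective: the solution is a fixed policy from positions to moves.
ChessPolicy = Callable[[Board], Move]

# 3. The household-companion objective: the solution must keep gathering
#    information and improving from experience after it is deployed.
class CompanionProgram(Protocol):
    def act(self, observation: Any) -> Any: ...
    def update(self, observation: Any, action: Any, satisfaction: float) -> None: ...
```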

Engineering methodology

By taking this view, we have framed a very hard problem for the engineers: how in the world are we to find the best program for a robot, given a spec?

There are many possible reasonable strategies here; I think they are all somewhat plausible and that the choice is not technically clear, making this actually a sensible matter to debate, and to agree to disagree on, at least until we collectively understand the problem better.

Recapitulate evolution.
If we could make an enormous set of simulations (or real-world "padded rooms" for real robots) that replicate the set of possible households a robot might end up in, we could try to make search or learning algorithms that determine, based on a huge number of experiments, the best software to build into the robot (a rough code sketch of this idea appears after these three strategies). This is an interesting scientific enterprise and, in principle, a solution to our problem. A concern is that this strategy is infeasible, in terms of the sheer amount of exploration and computation required.
Reverse-engineer humans.
If we could figure out the set of built-in structures, reflexes, algorithms, and organizing principles in a human brain, we could try to put that same stuff into the head of our robot. This is an important scientific enterprise, but it also might take a long time! In addition, it would tell us about humans, but perhaps not about the principles of a more general notion of intelligence, and would not necessarily show us how to customize the built-in structures for robots of different types in different domain "niches".
Be clever engineers.
We could just try to examine the world we live in and the set of domains we expect our robots to be able to perform in, and work hard to directly build the robot's software. This is the approach that AI and robotics have taken for years. There has definitely been considerable success, but also dramatic failure (does anyone remember expert systems?). One major difficulty is that, although humans may be very good at some tasks (for example, visual scene analysis), we are completely unable to articulate or introspect into the nature of the computational processing that underlies most of our competence. I fear we engineers just don't know enough, or have big enough brains, to do this job directly.
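
Here is the promised sketch of the "recapitulate evolution" strategy, reusing the hypothetical RobotSpec and expected_performance from the earlier sketch. Simple generate-and-test stands in for whatever evolutionary or learning algorithm we would really use; the feasibility worry is that n_candidates and n_simulations would have to be astronomically large.

```python
def recapitulate_evolution(spec, generate_candidate, n_candidates=10_000, n_simulations=100):
    """Search over robot programs, scoring each one across many simulated domains.

    `generate_candidate` is a hypothetical source of candidate programs (random,
    mutated, evolved, ...); `spec` and expected_performance come from the earlier sketch.
    """
    best_program, best_score = None, float("-inf")
    for _ in range(n_candidates):
        program = generate_candidate()
        score = expected_performance(spec, program, n_samples=n_simulations)
        if score > best_score:
            best_program, best_score = program, score
    return best_program
```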

A middle way

So, how should we proceed? It would be more dramatic to take a radical position here, but I think the preceding radical positions are unlikely to be effective. I'm putting my money on the following strategy, although there is plenty of room for opinion and disagreement here.

We need to develop a general-purpose engineering methodology for crafting special-purpose intelligent robot programs. We will need to wield the tools of our trade that can take a specification and find a solution--evolution, gradient descent, combinatorial search, probabilistic inference--to "solve" for robot programs given specifications.

But, of course, the same trade-off plays out for the engineers as it does for the robots: the bigger the space of possible solutions, the more data and computation (often exponentially more) are needed to find a good one. Ultimately, to solve problems in our lifetimes, we will have no choice but to build in some of our "best guess" priors, based on insights from understanding natural intelligence, from the design of algorithms and data structures, and from fundamental physical principles.

I will finish out this post by discussing the dual role of machine learning, from this perspective.

Learning in the factory and in the wild

It is clear that machine learning will play a critical role in the development of intelligent systems, for almost every objective and almost every path we might take to pursue that objective. And, of course, we are in the midst of an explosion of machine-learning techniques, applications, funding, enthusiasm, and hype. In this storm, it's even harder than usual to step back and be clear about the objective, strengths, and weaknesses of one's work.

So, I'd like to be clear about different roles that machine learning will play in the robot-factory view. Although there are many more subtle but important distinctions, let's focus on two importantly different opportunities to apply machine learning techniques in the construction of intelligent robots: learning in the factory and learning in the wild.

Learning in the wild is the machine learning that an individual robot does, once it is delivered from the factory into the actual environment it will be operating in. It will have to adapt to the immediate conditions, potentially ranging from the frictions in its joints to the organization of the kitchen cupboards in its house. It may have to continue to adapt as these things change over the course of its deployment. This kind of adaptation requires some combination of techniques that we describe as "perception", "state estimation", and "learning." It's important to note that, generally, the objective function will be to perform as well as possible throughout the time that the robot is learning, making it critical to manage the trade-off between exploration and exploitation carefully, and to put a premium on strategies that are highly sample-efficient.
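
As a toy illustration of why exploration has to be managed carefully in the wild, consider a bandit-style sketch in which every action the robot takes while it is still learning counts toward its score; the scenario, the epsilon-greedy rule, and the numbers are all just stand-ins.

```python
import random

def learn_in_the_wild(arms, horizon=1000, epsilon=0.05):
    """Toy epsilon-greedy learner scored on everything it does *while* learning.

    `arms` is a list of zero-argument reward functions (say, different ways of
    loading the dishwasher); the owners are judging the robot the whole time,
    so wasted exploration directly hurts the objective.
    """
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    total_reward = 0.0
    for t in range(horizon):
        if t < len(arms):
            a = t                                              # try each option once
        elif random.random() < epsilon:
            a = random.randrange(len(arms))                    # occasional exploration
        else:
            a = max(range(len(arms)), key=lambda i: means[i])  # exploit the best estimate
        r = arms[a]()
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]                 # incremental mean update
        total_reward += r                                      # performance during learning is what counts
    return total_reward
```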

Learning in the factory is the machine learning that is done in the factory, to make up for the inadequacies in knowledge or cognitive capacity of the engineers. This learning may simply fill gaps in the engineers' knowledge: they might use learning methods to build a good face-detector that will be fielded directly in the wild. In other cases, it may be meta-learning, in the sense that the objective is to learn (or derive or search for) software strategies that will perform well by doing a good job of learning in the wild. The objective function for learning in the factory may de-emphasize sample complexity: experience in the factory is, in a sense, amortized over all the robots that factory will design. It may also be more tolerant of exploration, because we may design the factory environment to be all or partly in simulation or in physical circumstances in which real damage is less likely to occur than in the real world.
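
And a correspondingly rough sketch of learning in the factory as meta-learning: the factory searches over hypothetical "learner configurations" (priors, initializations, architectures), scoring each by how well a robot equipped with it subsequently learns in the wild, averaged over many sampled domains. All the names below are placeholders.

```python
def learn_in_the_factory(spec, candidate_configs, wild_learner, n_domains=1000):
    """Meta-learning sketch: choose the configuration that learns best in the wild.

    `wild_learner(config, domain)` is a placeholder for running in-the-wild learning
    under a given configuration and returning the resulting robot program. Factory
    experience is amortized over every robot we ship, so we can afford far more
    samples, and far riskier exploration, than any single deployed robot could.
    """
    def meta_score(config):
        total = 0.0
        for _ in range(n_domains):
            domain = spec.sample_domain()            # simulated or "padded-room" domain
            program = wild_learner(config, domain)   # let the robot learn in that domain
            total += spec.score(program, domain)     # how good did its behavior end up?
        return total / n_domains

    return max(candidate_configs, key=meta_score)
```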

Although learning in the factory and learning in the wild share theoretical underpinnings and may share some technical solutions, it is critical for us, as researchers, to be clear about the setting in which we are studying learning: a great solution for one setting might be disastrous in another, and there's no point arguing about whether a method is better without putting it clearly into context.

We can use this lens to study, for example, the utility of "end-to-end" learning; that is, using learning methods to monolithically adapt the behavior of an entire robot program with a single objective function rather than to adapt sub-modules with intermediate objectives. It is a sensible strategy to use in the factory: the engineers could construct an infrastructure that performs end-to-end adaptation of a robot's policy to optimize the stated specification until it is ready to go out into the wild. We might have technical debate about the degree to which adding intermediate objective functions will speed up learning and/or degrade its results, but both are sensible engineering strategies. What about end-to-end learning in the wild? Its utility is not a topic for debate: given the spec, it is an objective fact whether end-to-end learning is the best structure for the software that will run on the robot in the wild.
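
Schematically, the distinction is only about where the objective attaches; the functions below are hypothetical stand-ins passed in as arguments, not a proposal for how such a system should actually be decomposed.

```python
from typing import Any, Callable, List

Params = List[float]   # placeholder for whatever parameters we are adapting

def end_to_end_objective(run_pipeline: Callable[[Params, Any], Any],
                         task_score: Callable[[Any], float],
                         params: Params, episode: Any) -> float:
    """One objective for the whole pipeline: everything adapts to the final task score."""
    return task_score(run_pipeline(params, episode))

def modular_objective(perceive: Callable[[Params, Any], Any],
                      plan: Callable[[Params, Any], Any],
                      task_score: Callable[[Any], float],
                      scene_score: Callable[[Any], float],
                      perception_params: Params, planner_params: Params,
                      episode: Any) -> float:
    """Intermediate objective on a perception module, plus the final task objective."""
    scene = perceive(perception_params, episode)
    return scene_score(scene) + task_score(plan(planner_params, scene))
```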

General-purpose structural biases for learning

Everyone engaged in the use of machine-learning methods, in the factory or the wild, must grapple with the trade-off between bias and sample complexity. If data and computation were free, we could perhaps avoid "tainting" our robots with any possibly incorrect human preconceptions, and more or less fall back on unguided search through possible robot programs in order of complexity (which is, of course, also a human construct). Assuming that data and computation are not free, our strategy for finding a practical solution rests on the judicious incorporation of bias.

The kind of bias we speak of here is the good kind: relatively general structures, algorithms, and reflexes that can be built in, in advance, that will afford much more sample-efficient learning. Note that these biases are critically important to robots in the wild, but I would argue that they are at least as important (though different) for machine-learning processes in the factory.

What makes a bias helpful rather than harmful? It should express some constraint that is present in all of (or with high probability in each of) the problems that this learning system might need to address. A favorite example is convolutional neural networks. Implicit in the structure and parameter-tying of a CNN are constraints that arise from the imaging process and from the underlying distribution of natural images: there tends to be spatial locality and translation invariance in the patterns that are important for many perceptual tasks.
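
A quick back-of-the-envelope calculation (with made-up layer sizes) shows how much the convolutional bias shrinks what has to be learned, by tying weights and restricting connections to local neighborhoods:

```python
# Illustrative numbers only: one 64x64 RGB image mapped to 16 feature channels.
H, W, C_IN, C_OUT = 64, 64, 3, 16
K = 3  # convolution kernel size

fully_connected = (H * W * C_IN) * (H * W * C_OUT)  # every input pixel may affect every output unit
convolutional = (K * K * C_IN) * C_OUT              # locality + translation invariance via weight tying

print(f"fully connected weights: {fully_connected:,}")  # 805,306,368
print(f"convolutional weights:   {convolutional:,}")    # 432 (ignoring bias terms)
```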

We must focus on deriving more such useful structural biases. They may take the form of neural network architectures (such as graph neural networks or memory organizations), of algorithms (such as trajectory optimization or forward-search planning), or of some kinds of simple reflexes. They may be derived from an understanding of natural intelligence (e.g., Spelke's "core knowledge" of human infants) or the physical world (e.g., the relationship between force, mass, acceleration, friction, velocity, and position) or algorithmic insight (forward search or message passing or Monte Carlo inference).

In my dreams, I look forward to a situation in which engineers in the robot factory could take a spec, use software tools to assemble an architecture based on an appropriate set of structural biases, construct a collection of simulated and real environments, and use machine-learning methods to arrive at a great initial program for a robot, which would be delivered to my house and then seamlessly and quickly learn and adapt to become a useful and interesting member of the household.

Acknowledgments

The idea of general-purpose mechanisms for constructing special-purpose systems, and many others lurking in here, are due to Tomas Lozano-Perez. The idea of taking a meta-programming view of building AI systems is due to Stuart Russell. The idea of taking a utility-theoretic stance toward robot design came to me via Stan Rosenschein and Dan Dennett. My views have been tempered by (sometimes heated!) discussions with many colleagues and students, including Rich Sutton and Rod Brooks. Thanks to Tom Silver, Rohan Chitnis, and Tomas Lozano-Perez for helpful comments on this post.