Why I am Optimistic

Patrick H. Winston

I believe that we can discover the computational basis of natural intelligence during the next ten years or so.

Several sources of optimism contribute to this view, and they support the collective view expressed by the authors of the materials to be found in The Human Intelligence Enterprise.

The paradigm shift

From the engineering perspective, Artificial Intelligence is a grand success. Programs with roots in Artificial Intelligence research perform feats of mathematical wizardry, act as genetic counselors, schedule gates at airports, and extract useful regularities from otherwise impenetrable piles of data.

From the scientific perspective, however, not so much has been accomplished, and the goal of understanding intelligence, from a computational point of view, remains elusive. Reasoning programs still exhibit little or no common sense. Today's language programs translate simple sentences into database queries, but those language programs are derailed by idioms, metaphors, convoluted syntax, or ungrammatical expressions. Today's vision programs recognize engineered objects, but those vision programs are easily derailed by faces, trees, and mountains.

Why so little progress? Since the field of Artificial Intelligence was born in the 1960s, most of its practitioners have believed—or at least acted as if they have believed—that vision, language, and motor faculties are merely the I/O channels of human intelligence. They believe that if we are to account for intelligence, we have to understand the reasoning engine that stands behind those faculties. Some suggest that people interested in vision, language, and motor issues should attend their own conferences, lest the value of Artificial Intelligence conferences be diminished by irrelevant distractions. Some write textbooks that devote no space whatsoever to vision, language, and motor topics.

Of course, one could argue that 30 years is not much time for a science to develop. It might be that vision, language, and motor faculties indeed are just I/O channels, and another 30 years, or another 300 years, will be required to develop the theory needed to understand what lies behind them. It is hard to argue against such a more-time-needed view, because it is easy to fall into what Seymour Papert once called the unthinkability fallacy: you can't do it that way because I cannot think how anyone could do it that way.

To me, there is an attractive alternative: I believe that our intelligence is in our I/O channels, not behind them, and if we are to understand intelligence, we must understand the contributions of vision, language, and motor faculties.

Evidence from watching the brain at work

Using Functional Magnetic Resonance Imaging and Positron Emission Tomography, researchers can determine which brain areas show increased energy consumption as you think various sorts of thoughts. If you throw a ball, your cerebellum lights up. If you watch the ball fly, your occipital lobe lights up. And if you hear the ball land, your temporal lobe lights up.

Nothing in those observations surprises. All are in accordance with classical theories of brain function. You hit perceptual apparatus with stimuli, and peripheral parts of your brain process information.

Remarkably, however, those same peripheral parts of your brain can light up without the benefit of perceptual stimulation. For example, if you close your eyes, and someone asks you to think about an alphabetic character, your occipital lobe lights up in a way reminiscent of what happens when you look at an alphabetic character. And if someone asks you to think of a verb that goes with the noun hammer, and you think of pound, several parts of your brain associated with language understanding light up, as well as the right side of your cerebellum. What? Your cerebellum? That part of your brain that sits on top of your spinal cord is supposed to be for fine motor control. What is it doing lighting up during what might seem to be a task for the language faculty alone?

Such experiments force the conclusion that vision, language, and motor areas—once viewed as exclusively for processing external stimuli—are involved in just plain thinking.

Evidence from neuroanatomy

Of course, it is not really surprising that much of the peripheral brain lights up when you think. Neuroanatomists have found much evidence to the effect that for every bundle of nerve fibers projecting one way, there is a complementary set that projects the other way.

Of course, projections toward peripheral parts of the brain might be there only to control or tune the processing done by those peripheral parts. Such an explanation, however, does not easily account for the number of fibers projecting toward the peripheral parts of the brain. There are so many such fibers that you cannot avoid thinking that the peripheral brain is not only heavily used in perception, but also heavily reused in just plain thinking.

Inspiration from armchair psychology

Visual problem solving

Ask a small child to add 2 and 2; she will convert the problem into a visual finger-counting exercise. Ask an adult to name the 10th letter in the alphabet; he will hold up his hand and start counting, just as if he were a child adding. Ask a physics student to solve a problem; he will draw a diagram.

There is no doubt about it: vision makes it possible to solve problems that would otherwise be difficult or impossible. To be sure, visual problem solving is not the only kind, but it is hard to imagine how we can understand how the brain thinks unless we understand how it sees.

Linguistic problem solving

Danny Hillis once asked me if I ever had the experience of explaining an idea to someone, only to have the idea misunderstood into an idea that was actually better. Sure, I replied, it happens every time I try to explain something to Marvin Minsky.

Danny's point, of course, was that the inner conversation many (all?) people have when they solve problems may play the same role as a conversation with someone else. Processing thoughts expressed as word sequences must excite important thinking mechanisms buried in our language-processing hardware. Thus, the thinking lies in the language-processing hardware, not behind it.

Inspiration from landmark papers

One way to explain a point of view is to cite a set of landmark papers that seminally point the way. Unfortunately, the position I take is relatively new and, at least to me, relatively rare. Accordingly, only a handful of papers seem seminal, and they really do not point the way in the absence of explanation. So far, those papers are the following:

  • Marvin Minsky: "K-lines: A Theory of Memory," Cognitive Science, vol. 4, no. 1, 1980. Many of the ideas also are to be found in Minsky's book, Society of Mind, Simon & Schuster, 1985.

Minsky argued that thinking is largely a matter of reusing states previously established by perception.
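Minsky's K-line idea is concrete enough to caricature in a few lines of code. The sketch below is a toy illustration of the summary above, not Minsky's model; the class name and the "agent" labels are invented for the example.

```python
class KLine:
    """Toy K-line: records which mental 'agents' were active during an
    experience, so that partial mental state can be re-imposed later."""

    def __init__(self, active_agents):
        # attach the K-line to the agents active at the time of the experience
        self.attached = frozenset(active_agents)

    def activate(self, mind):
        # reactivating the K-line turns those agents back on,
        # recreating part of the original perceptual state
        mind |= self.attached
        return mind

# seeing a red ball leaves these (invented) agents active; store a K-line
seeing_ball = KLine({"red", "round", "graspable"})

# later, merely thinking of the ball reactivates the recorded state
recalled = seeing_ball.activate(set())
```

On this caricature, "remembering" is not retrieving a description; it is re-entering a previously established perceptual state, which is why the perceptual machinery itself lights up during recall.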

  • Shimon Ullman: "Sequence Seeking and Counter Streams," Chapter 10 in Ullman's book, High Level Vision, MIT Press, 1996.

Ullman offers a theory of visual recognition in which alternative transformations produce two highly branching search trees that extend toward each other from new perceptions and from a previously stored model. When the bi-directional search produces a match, the perceived object is recognized. His paper draws some of its inspiration from the established bi-directional nature of neural connections.
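Ullman's sequence-seeking idea has the flavor of classical bidirectional search, and that flavor can be sketched in a toy program. The sketch below illustrates two search trees growing toward each other until their frontiers meet; it is generic bidirectional search, not Ullman's counter-streams model, and the integer "transformations" are invented for the example.

```python
from collections import deque

def neighbors(n):
    # toy, invertible "transformations": add/subtract 1, double/halve
    out = [n + 1, n - 1, n * 2]
    if n % 2 == 0:
        out.append(n // 2)
    return [m for m in out if 0 < m <= 40]

def bidirectional_search(start, goal):
    """Grow one search tree from the percept (start) and one from the
    stored model (goal); stop when the two frontiers share a state."""
    if start == goal:
        return [start]
    fwd_parent, bwd_parent = {start: None}, {goal: None}
    fwd_q, bwd_q = deque([start]), deque([goal])

    def expand(queue, parents, other_parents):
        state = queue.popleft()
        for nxt in neighbors(state):
            if nxt not in parents:
                parents[nxt] = state
                queue.append(nxt)
                if nxt in other_parents:   # the two trees have met
                    return nxt
        return None

    while fwd_q and bwd_q:
        meet = expand(fwd_q, fwd_parent, bwd_parent)
        if meet is None:
            meet = expand(bwd_q, bwd_parent, fwd_parent)
        if meet is not None:
            # splice the two half-paths together at the meeting state
            path, s = [], meet
            while s is not None:
                path.append(s)
                s = fwd_parent[s]
            path.reverse()
            s = bwd_parent[meet]
            while s is not None:
                path.append(s)
                s = bwd_parent[s]
            return path
    return None

path = bidirectional_search(1, 10)   # e.g. [1, 2, 4, 5, 10]
```

Because each tree branches, searching from both ends and stopping at a match explores far fewer states than a single tree grown all the way from one end to the other.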

  • Kenneth Yip and Gerald Sussman: "A Computational Model for the Acquisition and Use of Phonological Knowledge," widely circulated, but as yet unpublished manuscript.

Yip and Sussman exhibit a theory of how certain English phonological rules might be learned and put to use by mechanisms that make heavy use of a bi-directional constraint propagation mechanism. Their paper argues for the central importance of sparsely populated phonological spaces, and may shed light on why acoustic signals seem to become words, and more generally, on why signals at the neural level seem to correspond to symbols at the thought level.
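The kind of bidirectional constraint propagation Yip and Sussman rely on can be caricatured with a textbook phonology example: the English plural suffix agrees in voicing with the stem-final segment (cats ends in /s/, dogs in /z/). The sketch below is a generic constraint propagator, not Yip and Sussman's mechanism; the cell and value names are invented for the example.

```python
def propagate(cells, constraints):
    """Repeatedly apply constraints until no cell's value set shrinks.
    cells: name -> set of still-possible values."""
    changed = True
    while changed:
        changed = False
        for constraint in constraints:
            if constraint(cells):
                changed = True
    return cells

def agree(a, b):
    """Constraint: cells a and b must hold equal values. It prunes in
    both directions, so information flows whichever way it is known."""
    def constraint(cells):
        pruned = False
        allowed_b = {v for v in cells[b] if v in cells[a]}
        allowed_a = {v for v in cells[a] if v in cells[b]}
        if allowed_b < cells[b]:
            cells[b] = allowed_b
            pruned = True
        if allowed_a < cells[a]:
            cells[a] = allowed_a
            pruned = True
        return pruned
    return constraint

cells = {"stem_voicing": {"voiced"},                 # "dog" ends voiced
         "suffix_voicing": {"voiced", "voiceless"}}  # /z/ or /s/?
propagate(cells, [agree("stem_voicing", "suffix_voicing")])
# cells["suffix_voicing"] narrows to {"voiced"}: the plural is pronounced /z/
```

The same constraint runs the other way: hearing a voiceless suffix narrows the hypotheses about the stem-final segment, which is the bidirectionality the paper exploits.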

Sudden revolution or another brick wall

I do not believe that the I/O-channel view makes understanding intelligence a simple matter. The computations that produce intelligence are surely sophisticated, and they are unlikely to be understood via incremental progress.

Nevertheless, I believe progress in Artificial Intelligence, based on the view that our intelligence is in the I/O, will be either rapid or nonexistent; it cannot be slow. I believe that today's reasons for optimism either will fuel a revolution, or those reasons will play out and prove unworthy.

Thus, after five to ten years, if there is no substantial progress toward understanding intelligence, I will have to conclude, reluctantly, that the ideas on which I base my optimism were unequal to the task.

Critics could say that they have heard all this before, citing perhaps rule-based systems or neural nets as examples of highly hyped ideas offered by over-stimulated proponents as the answer.

In response, little can be offered, because until the strength of the new paradigm is demonstrated, the contest is really just between a new dream and a variety of already played out dreams.

Of course, with energy, the new paradigm could be compared and contrasted with the old. I prefer to expend that energy in the direction of research leading toward experiments and demonstrations, rather than scholarship leading toward arguments, because I believe that one success has far greater force than speculation about whether any success is likely or possible.