Package Delivery in the AI Lab


As part of our efforts to develop scalable planning techniques for very large stochastic environments, we are currently constructing a software simulator for an example domain, as well as the software infrastructure for managing and extending this domain. These components will provide us with a realistic, extensible, task-driven problem on which to focus research effort, and will allow us to model further real-world scenarios and develop other example problems.

We take a package delivery task as our initial target problem because it raises a number of interesting research issues (primary among which is a large state space) and is closely related to a number of problems of practical interest. In this problem, a robotic agent must pick up and deliver a variety of packages to locations in a complex spatial environment.

We use the Markov decision process (MDP) formalism to describe the environment, problem dynamics, and tasks. In this representation, the world is described by a set of state variables (such as location or battery power), and the vector formed by a specific assignment to each variable describes a particular world state in which the robotic agent can find itself. The set of all possible states, taken together, is the state space of the problem and represents all possible configurations of the world. The effects of the agent's actions on the state variables (changing its position or recharging its battery) are described by a transition function that models the uncertainty in the domain's dynamics. For example, the transition function can represent the probability that the robot drops a package it is carrying, or model the noise in the movements of the robot's actuators. Finally, the MDP models the agent's goals as a reward function that expresses the priority of achieving each goal or the importance of maintaining particular operating conditions such as battery level.
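The pieces of the formalism above (states as variable assignments, a stochastic transition function, and a reward function) can be sketched as a toy fragment in Java. The two state variables, the drop probability, and all class and method names here are illustrative assumptions for the sake of the example, not part of our library:

```java
import java.util.Random;

// A minimal, hypothetical sketch of the MDP pieces described in the text.
// A state is an assignment to each state variable; the transition function
// maps a state and action to a stochastic next state; the reward function
// scores states against the agent's goals.
public class MdpSketch {
    // Toy state with two variables: the robot's location and whether it is
    // holding a package (the real model has many more variables).
    record State(int location, boolean holdingPackage) {}

    static final double DROP_PROBABILITY = 0.05; // assumed chance of dropping per move

    // Transition function for a "move" action: the robot reaches the new
    // location, but may stochastically drop a carried package, modeling
    // the uncertainty in the domain's dynamics.
    static State move(State s, int newLocation, Random rng) {
        boolean stillHolding = s.holdingPackage() && rng.nextDouble() >= DROP_PROBABILITY;
        return new State(newLocation, stillHolding);
    }

    // Reward function: each step costs a little, and being at the goal
    // while holding the package earns a large delivery reward.
    static double reward(State s, int goalLocation) {
        double r = -1.0; // per-step cost
        if (s.location() == goalLocation && s.holdingPackage()) {
            r += 100.0; // delivery reward
        }
        return r;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        State s = new State(0, true);
        s = move(s, 1, rng);
        System.out.println("holding=" + s.holdingPackage() + " reward=" + reward(s, 1));
    }
}
```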

For our implementation of the package delivery problem, we have developed a model of the building housing the MIT AI and LCS labs, a ten-story office building with moderately complex hallway and office arrangements (offices are arranged slightly differently on each floor, for example). Our initial model includes only a fraction of the true complexity of package delivery in this environment, but our software infrastructure allows us to easily extend the domain description to include additional factors or to represent the environment at higher resolution. At the moment, our model includes:

The total state space is the cross-product of the domains of each of these variables (somewhat more than 500 bits, or 2^500 states). While this is still a small approximation of the true problem, it is quite large enough to pose an extreme challenge to current state-of-the-art stochastic planning methods.
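As a rough illustration of how the cross-product grows, the size of the joint state space in bits is the sum of log2 of each variable's domain size; the variable sizes below are made up for the example, not taken from our model:

```java
import java.util.List;

// Illustrative calculation: the joint state space is the cross-product of
// the variable domains, so its size in bits is the sum of log2(|domain|)
// over all state variables.
public class StateSpaceSize {
    static double bits(List<Integer> domainSizes) {
        double total = 0.0;
        for (int size : domainSizes) {
            total += Math.log(size) / Math.log(2.0); // log2 of this domain
        }
        return total;
    }

    public static void main(String[] args) {
        // Two binary flags, a 1024-value location, and a 64-level battery
        // (sizes chosen only for the example): 1 + 1 + 10 + 6 = 18 bits.
        double b = bits(List.of(2, 2, 1024, 64));
        System.out.printf("%.1f bits%n", b);
    }
}
```

With a few hundred binary package and goal variables alongside location and battery, the total quickly climbs past 500 bits, as in the model above.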

Flexible design has been an important criterion in the development of our MDP simulator library. We intend our code base to be both powerful enough to simulate large stochastic systems and general enough to cover a wide range of tasks. In particular, the implementation is entirely in Java for portability, though we are currently constructing a network communication channel so that the simulator can be easily coupled to external software systems in other languages (e.g., a visualization system or a controller for a robot). The data structures support representing the MDP in purely tabular form (appropriate for small domains), as dynamic Bayes networks, as parameterized stochastic decision trees, or with general Java expressions. We currently support binary, discrete range, and enumerated set state variables, though the class library can easily be extended to handle continuous or other variables as well. Our hope is that this library will support not only our own research but that of other research groups currently studying stochastic planning, and a number of other researchers have recently expressed interest in our system.
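The supported variable kinds (binary, discrete range, and enumerated set) might be organized along the following lines; this is a guess at a plausible shape for such a class hierarchy, not the library's actual API:

```java
import java.util.List;

// Hypothetical sketch of a state-variable hierarchy: binary, discrete-range,
// and enumerated-set variables share a common interface, so new variable
// kinds (e.g., continuous) could be added by implementing it.
public class StateVariables {
    interface StateVariable {
        String name();
        int domainSize(); // number of values this variable can take
    }

    // A two-valued variable, e.g. "holding-package".
    record BinaryVariable(String name) implements StateVariable {
        public int domainSize() { return 2; }
    }

    // An integer range variable, e.g. battery level 0..99.
    record RangeVariable(String name, int low, int high) implements StateVariable {
        public int domainSize() { return high - low + 1; }
    }

    // A variable over an explicit set of symbolic values.
    record EnumeratedVariable(String name, List<String> values) implements StateVariable {
        public int domainSize() { return values.size(); }
    }

    public static void main(String[] args) {
        List<StateVariable> vars = List.of(
            new BinaryVariable("holding-package"),
            new RangeVariable("battery", 0, 99),
            new EnumeratedVariable("floor", List.of("1", "2", "3")));
        long states = 1;
        for (StateVariable v : vars) {
            states *= v.domainSize();
        }
        System.out.println(states + " joint states"); // prints "600 joint states"
    }
}
```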