Interactive Robot Training for Complex Tasks

Ankit Shah

August 2021

PDF Video

Abstract

Domains such as high-mix manufacturing, domestic robotics, space exploration, etc., are key areas of interest for robotics. In these domains, it is difficult to anticipate the exact role of the robot apriori, therefore defining the robot specifications is challenging. This presents a crucial hurdle to widespread adoption of robots in these domains. Developing robots that can be re-programmed easily during deployment by domain experts, through the modification of the task specifications, without requiring extensive programming knowledge is a key research thrust of this dissertation.

I present a multi-modal framework for training a robot through demonstrations and acceptability assessments provided by the teacher as per their intended task specification. I adopt an online Bayesian approach, where the robot maintains a belief over the teacher’s intended task specification, and each input provided by the teacher iteratively updates the robot’s belief. Further, I enabled the robot to infer task specifications that require satisfaction of temporal properties by utilizing a well-defined fragment of linear temporal logic (LTL). Towards developing this framework, I address three key research questions.

I begin by presenting a novel approach to inferring formal temporal specifications from labeled task executions, called Bayesian specification inference. This approach can learn tasks expressed by an expressive but relevant fragment of LTL while modeling the ambiguity of demonstrations as a belief distribution over candidate LTL formulas. We demonstrate the utility of this approach in inferring task specifications for the representative multi-step manipulation task of setting a dinner table. We also utilize this model to learn an assessment model for multi-aircraft combat missions that shows a high degree of alignment with the assessments provided by a domain expert.

Next, I present planning with uncertain specifications (PUnS), a novel formulation that enables planning with a belief distribution over the true specification. I propose four evaluation criteria that capture the semantics of satisfying a belief over logical formulas and demonstrate the existence of an equivalent Markov decision process (MDP) for every instance of a PUnS problem. We show that the robot policies produced through the PUnS formulation demonstrate flexibility by generating distinct valid task executions and result in a low error rate by simultaneously satisfying a maximal subset of the specifications in the belief distribution.

Finally, I present an integrated specification inference framework that interleaves inference and planning through active learning. Our models for active learning allow the robot to identify whether a task demonstration or an assessment of its task execution provided by the teacher would be most beneficial in refining its belief. Further, we developed algorithms that enable the robot to identify and perform the task execution that would be most informative in refining its uncertainty. We explore the impact of different information utility functions and the degree of teacher’s pedagogical selectivity on the robot’s learning performance, and demonstrate that allowing the robot to select the ideal learning modality allows it to overcome the limitations of a non-pedagogical teacher, and still converge to the true task specification. We also demonstrate our framework through a study involving users teaching a robot to set a dinner table with only five task executions.

Type

Thesis

Publication

Massachusetts Institute of Technology