Context-based policy search: transfer of experience across problems
-------------------------------------------------------------------

Leonid Peshkin {pesha@ai.mit.edu}
Harvard Center for Artificial Intelligence
Dworkin Bld. 134, Cambridge, MA 02138, USA

Edwin D. de Jong {edwin@cs.brandeis.edu}
Computer Science Department
Brandeis University, Waltham, MA 02454-9110

Appears in the proceedings of the ICML-02 workshop on Development of Representations.
The current version and related work are available from http://www.eecs.harvard.edu/~pesha/papers.html

Abstract: An important question in reinforcement learning is how generalization may be performed. This problem is especially important if the learning agent receives only partial information about the state of the environment. Typically, the bias required for generalization is chosen by the experimenter. Here, we investigate a way for the learning method to extract bias from learning one problem and apply it in subsequent problems. We use a gradient-based policy search method and look for controllers that consist of a context component and an action component. Empirical results on a two-agent coordination problem are reported. We found that learning a bias made it possible to address problems that were not solved otherwise.