DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs

Talk at IJCAI 2020


DualSMC is a model-based reinforcement learning (RL) method for continuous partially observable Markov decision processes (POMDPs). It is motivated by the observation that filtering and planning under uncertainty can be viewed as two closely related sequential Monte Carlo (SMC) processes: one over the states, the other over future optimal trajectories.
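To make the two-process view concrete, here is a minimal sketch in plain Python of how one propose-weight-resample routine can serve both filtering and planning. It is not the DualSMC implementation: the 1D dynamics, the Gaussian observation model, the distance-based reward, and every name (`normalize`, `resample`, `filter_step`, `plan_step`, `run_demo`) are assumptions made for illustration, and nothing here is learned or adversarially trained.

```python
import math
import random


def normalize(log_weights):
    """Turn log-weights into probabilities, subtracting the max for stability."""
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def resample(items, probs, rng):
    """Multinomial resampling: draw len(items) items in proportion to probs."""
    return rng.choices(items, weights=probs, k=len(items))


def filter_step(particles, action, obs, rng, trans_std=0.05, obs_std=0.1):
    """SMC over states: propagate, weight by observation likelihood, resample."""
    # Assumed toy transition model: x' = x + action + Gaussian noise.
    moved = [x + action + rng.gauss(0.0, trans_std) for x in particles]
    # Assumed Gaussian observation model, kept in log space.
    log_w = [-0.5 * ((obs - x) / obs_std) ** 2 for x in moved]
    return resample(moved, normalize(log_w), rng)


def plan_step(belief, goal, rng, horizon=5, n_traj=200, temperature=0.1):
    """SMC over future trajectories: sample, weight by exponentiated return, resample."""
    trajectories, log_w = [], []
    for x in rng.choices(belief, k=n_traj):  # each rollout starts from a belief particle
        actions = [rng.uniform(-0.5, 0.5) for _ in range(horizon)]
        ret = 0.0
        for a in actions:
            x += a
            ret -= abs(x - goal)  # toy reward: negative distance to the goal
        trajectories.append(actions)
        log_w.append(ret / temperature)
    survivors = resample(trajectories, normalize(log_w), rng)
    # Simplification: act on the mean first action of the surviving trajectories.
    return sum(t[0] for t in survivors) / len(survivors)


def run_demo(seed=0, steps=30, goal=2.0):
    """Alternate planning and filtering on the toy problem; return final state and belief."""
    rng = random.Random(seed)
    true_x = 0.0
    belief = [rng.uniform(-3.0, 3.0) for _ in range(500)]  # broad initial belief
    for _ in range(steps):
        action = plan_step(belief, goal, rng)
        true_x += action
        obs = true_x + rng.gauss(0.0, 0.1)
        belief = filter_step(belief, action, obs, rng)
    return true_x, belief
```

Calling `run_demo()` alternates the two steps: the planner starts its rollouts from the current belief particles, and the filter updates those particles after each action. The belief is the only thing the two functions share, which is the link between filtering and planning in this toy setting.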

  • Component 1: An adversarial particle filtering approach that fits the multi-modal distribution of belief states given partial observations.
  • Component 2: An SMC planning algorithm that produces uncertainty-dependent policies and learns to reduce that uncertainty.

These components are linked via belief states. DualSMC can handle complex visual observations and remains highly interpretable.

    3D light-dark navigation (simulated by DeepMind Lab)

    The robot receives noisy observations with a limited field of view in the lower part of the maze, and clear observations in the upper part. Green crosses: estimated states. Grey lines: planned trajectories. Red line: the robot's actual trajectory. Orange square: goal.

    Reacher: A continuous control task with partial observations