Reinforcement Learning for POMDP: Rollout and Policy Iteration with Application to Sequential Repair

Citation:

Thomas Wheeler, Ezhil Bharathi, and Stephanie Gil. 5/20/2019. “Reinforcement Learning for POMDP: Rollout and Policy Iteration with Application to Sequential Repair.” IEEE International Conference on Robotics and Automation (ICRA).

Abstract:

We study rollout algorithms which combine limited lookahead and terminal cost function approximation in the context of POMDP. We demonstrate their effectiveness in the context of a sequential pipeline repair problem, which also arises in other contexts of search and rescue. We provide performance bounds and empirical validation of the methodology, in both cases of a single rollout iteration, and multiple iterations with intermediate policy space approximations.
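The combination of limited lookahead and terminal cost approximation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `rollout_action`, its parameters, and the toy line-world problem below are all hypothetical names chosen for illustration, and the toy problem is fully observed and deterministic, whereas in a POMDP the state would be a belief state and the transitions would be sampled. The sketch applies one-step lookahead over the available actions, simulates a base policy for a truncated horizon, and adds a terminal cost estimate if the simulation has not finished.

```python
import random


def rollout_action(state, actions, step, base_policy, terminal_cost,
                   is_terminal, horizon=10, num_sims=20, seed=0):
    """One-step lookahead rollout (illustrative sketch, minimizing cost).

    For each candidate action, apply it once, then follow base_policy for
    at most `horizon` further steps, and add terminal_cost(s) if the
    simulation stopped before reaching a terminal state.
    """
    rng = random.Random(seed)
    best_action, best_q = None, float("inf")
    for a in actions(state):
        total = 0.0
        for _ in range(num_sims):
            s, q = step(state, a, rng)           # first step uses the candidate action
            for _ in range(horizon):
                if is_terminal(s):
                    break
                s, c = step(s, base_policy(s), rng)  # remaining steps use the base policy
                q += c
            if not is_terminal(s):
                q += terminal_cost(s)            # approximate the cost beyond the horizon
            total += q
        avg = total / num_sims
        if avg < best_q:
            best_action, best_q = a, avg
    return best_action, best_q


# Hypothetical toy problem: positions 0..6 on a line, repair site at position 5.
GOAL = 5

def toy_actions(s):
    return [-1, +1]

def toy_step(s, a, rng):
    # Deterministic move with unit cost; rng is unused in this toy example.
    return max(0, min(6, s + a)), 1.0

def toy_base_policy(s):
    return +1                                    # always move right

def toy_terminal_cost(s):
    return float(abs(GOAL - s))                  # distance-to-goal estimate

def toy_is_terminal(s):
    return s == GOAL

action, q = rollout_action(3, toy_actions, toy_step, toy_base_policy,
                           toy_terminal_cost, toy_is_terminal)
print(action, q)
```

Repeating this step with the rollout policy as the new base policy corresponds, in spirit, to the multiple-iteration (policy iteration) setting the abstract refers to, although the paper's intermediate policy space approximations are not modeled here.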

Last updated on 07/06/2021

REACT Lab

Robotics, Embedded Autonomy & Communication Theory Lab
