Citation:
Thomas Wheeler, Ezhil Bharathi, and Stephanie Gil. 5/20/2019. “Reinforcement Learning for POMDP: Rollout and Policy Iteration with Application to Sequential Repair.” IEEE International Conference on Robotics and Automation (ICRA).
Abstract:
We study rollout algorithms which combine limited lookahead and terminal cost function approximation in the context of POMDP. We demonstrate their effectiveness in the context of a sequential pipeline repair problem, which also arises in other contexts of search and rescue. We provide performance bounds and empirical validation of the methodology, in both cases of a single rollout iteration, and multiple iterations with intermediate policy space approximations.
See also: Multi-Robot Sequential Decision Making