Distributed Lifelong Reinforcement Learning with Sub-Linear Regret
Date:
Dec 12, 2017
Authors:
In this paper, we propose a distributed second-order method for lifelong reinforcement learning (LRL). Upon observing a new task, our algorithm scales state-of-the-art LRL by approximating the Newton direction up to an arbitrary precision ε > 0, while guaranteeing accurate solutions. We analyze the theoretical properties of this new method and derive, for the first time to the best of our knowledge, sublinear regret under this setting.