
Multi-task Reinforcement Learning in Reproducing Kernel Hilbert Spaces via Cross-learning

  • 20210507


Document pages: 16 pages

Abstract: Reinforcement learning (RL) is a framework to optimize a control policy using rewards that are revealed by the system as a response to a control action. In its standard form, RL involves a single agent that uses its policy to accomplish a specific task. These methods require large amounts of reward samples to achieve good performance, and may not generalize well when the task is modified, even if the new task is related. In this paper we are interested in a collaborative scheme in which multiple agents with different tasks optimize their policies jointly. To this end, we introduce cross-learning, in which agents tackling related tasks have their policies constrained to be close to one another. Two properties make our new approach attractive: (i) it produces a multi-task central policy that can be used as a starting point to adapt quickly to one of the tasks trained for, in a situation when the agent does not know which task it is currently facing, and (ii) as in meta-learning, it adapts to environments related to but different from those seen during training. We focus on continuous policies belonging to reproducing kernel Hilbert spaces, for which we bound the distance between the task-specific policies and the cross-learned policy. To solve the resulting optimization problem, we resort to a projected policy gradient algorithm and prove that it converges to a near-optimal solution with high probability. We evaluate our methodology with a navigation example in which agents can move through environments with obstacles of multiple shapes and avoid obstacles not trained for.
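
The closeness constraint and the projection step mentioned in the abstract can be illustrated with a small toy sketch. This is not the paper's algorithm: it assumes finite-dimensional parameter vectors in place of RKHS policies, a hypothetical quadratic reward per task in place of an RL return, a central policy taken as the plain average of the task policies, and a simple per-task projection onto a ball of radius eps around that average. The names project_to_ball, cross_learn, targets, and eps are illustrative only.

```python
import numpy as np


def project_to_ball(theta, center, eps):
    """Euclidean projection of theta onto the ball of radius eps around center."""
    diff = theta - center
    norm = np.linalg.norm(diff)
    if norm <= eps:
        return theta
    return center + eps * diff / norm


def cross_learn(targets, eps, step=0.1, iters=500):
    """Toy projected gradient ascent with a closeness constraint.

    ASSUMPTION: task i has the hypothetical reward
    r_i(theta) = -0.5 * ||theta - targets[i]||^2, a stand-in for an RL return.
    """
    targets = np.asarray(targets, dtype=float)
    thetas = np.zeros_like(targets)          # one parameter vector per task
    center = thetas.mean(axis=0)
    for _ in range(iters):
        grads = targets - thetas             # gradient of each toy reward
        thetas = thetas + step * grads       # ascent step, one per task
        center = thetas.mean(axis=0)         # assumed central policy: the average
        # Pull every task policy back to within eps of the central policy.
        thetas = np.stack([project_to_ball(t, center, eps) for t in thetas])
    return thetas, center


if __name__ == "__main__":
    thetas, center = cross_learn([[0.0, 0.0], [4.0, 0.0]], eps=1.0)
    print("task policies:\n", thetas)
    print("central policy:", center)
```

With a small eps the task-specific vectors stay close to the shared center, and with a very large eps each one is free to reach its own task optimum, which mirrors the trade-off between a single multi-task policy and fully independent policies.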
