
Quantum Multiple Q-Learning

  • carsar
  • 20210407


Document pages: 22 pages

Abstract: In this paper, a collection of value-based quantum reinforcement learning algorithms is introduced which use Grover’s algorithm to update the policy, which is stored as a superposition of qubits associated with each possible action, and their parameters are explored. These algorithms may be grouped into two classes: one class which uses value functions V(s), and a new class which uses action value functions Q(s,a). The new Q(s,a)-based quantum algorithms are found to converge faster than the V(s)-based algorithms, and in general the quantum algorithms are found to converge in fewer iterations than their classical counterparts, netting larger returns during training. The faster convergence is due to the fact that the Q(s,a) algorithms are more precise than those based on V(s), meaning that updates are incorporated into the value function more efficiently. This effect is also enhanced by the observation that the Q(s,a)-based algorithms may be trained with higher learning rates. These algorithms are then extended by adding multiple value functions, which are observed to allow larger learning rates and to improve convergence in environments with stochastic rewards; the latter is further improved by the probabilistic nature of the quantum algorithms. Finally, the quantum algorithms were found to use less CPU time than their classical counterparts overall, meaning that their benefits may be realized even without a full quantum computer.
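
The Grover-based policy update mentioned in the abstract can be illustrated with a small classical simulation. This is only a sketch under stated assumptions: real-valued amplitudes over four actions, a single marked action, and a fixed number of Grover iterations. The function names (`grover_iteration`, `amplify_action`) are hypothetical and are not taken from the paper, and the rule the paper uses to choose the number of iterations from rewards and value estimates is not reproduced here.

```python
import numpy as np

def grover_iteration(amplitudes, target):
    """Apply one Grover iteration that amplifies the marked action `target`."""
    amps = amplitudes.copy()
    amps[target] = -amps[target]       # oracle: flip the sign of the marked action
    return 2.0 * amps.mean() - amps    # diffusion: inversion about the mean

def amplify_action(amplitudes, target, n_iterations):
    """Repeat the Grover iteration n_iterations times (assumed fixed here)."""
    for _ in range(n_iterations):
        amplitudes = grover_iteration(amplitudes, target)
    return amplitudes

# Policy for one state: an equal superposition over 4 hypothetical actions.
n_actions = 4
policy = np.full(n_actions, 1.0 / np.sqrt(n_actions))

# Reinforce action 2 with a single Grover iteration.
updated = amplify_action(policy, target=2, n_iterations=1)
probabilities = updated ** 2           # measurement probabilities per action
print(probabilities)
```

With four actions, a single iteration concentrates the measurement probability on the marked action. With more actions, or fewer iterations relative to the number of actions, the probability is only partially shifted, which is what lets the policy stay stochastic during training.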
