Variance Optimization for Continuous-Time Markov Decision Processes

Document pages: 15 pages

Abstract: This paper considers the variance optimization problem of the average reward in a continuous-time Markov decision process (MDP). It is assumed that the state space is countable and the action space is a Borel measurable space. The main purpose of this paper is to find the policy with the minimal variance in the space of deterministic stationary policies. Unlike in the traditional Markov decision process, the cost function under the variance criterion is affected by future actions. To address this, we convert the variance minimization problem into a standard MDP by introducing a concept called pseudo-variance. Further, by giving a policy iteration algorithm for the pseudo-variance optimization problem, the optimal policy of the original variance optimization problem is derived, and a sufficient condition for the variance-optimal policy is given. Finally, we use an example to illustrate the conclusions of this paper.
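
As a rough illustration of the pseudo-variance idea mentioned in the abstract, the sketch below enumerates the deterministic stationary policies of a tiny two-state continuous-time chain. Everything in it is an assumption made for illustration and is not taken from the paper: the model data in ACTIONS, the function names, and the working definitions, namely that the variance of a policy is the steady-state average of (r - eta)^2, where eta is the policy's average reward, and that the pseudo-variance replaces eta with a fixed reference value lam. With lam fixed, the cost (r - lam)^2 no longer depends on the policy's own average reward, which is what makes the problem look like a standard MDP. The sketch uses brute-force enumeration, not the paper's policy iteration algorithm.

```python
from itertools import product

# Hypothetical two-state model (not from the paper):
# ACTIONS[state][action] = (exit rate to the other state, reward rate)
ACTIONS = {
    0: {"a": (1.0, 2.0), "b": (3.0, 1.0)},
    1: {"a": (2.0, 0.0), "b": (1.0, 1.5)},
}


def stationary(policy):
    """Stationary distribution of the two-state chain induced by a policy."""
    q0 = ACTIONS[0][policy[0]][0]  # rate 0 -> 1
    q1 = ACTIONS[1][policy[1]][0]  # rate 1 -> 0
    return (q1 / (q0 + q1), q0 / (q0 + q1))


def rewards(policy):
    """Reward rate in each state under the policy."""
    return tuple(ACTIONS[s][policy[s]][1] for s in (0, 1))


def average_reward(policy):
    """Long-run average reward eta of the policy."""
    return sum(p * r for p, r in zip(stationary(policy), rewards(policy)))


def pseudo_variance(policy, lam):
    """Steady-state average of (r - lam)^2 for a fixed reference value lam."""
    return sum(p * (r - lam) ** 2 for p, r in zip(stationary(policy), rewards(policy)))


def variance(policy):
    """Steady-state variance: the pseudo-variance evaluated at lam = eta."""
    return pseudo_variance(policy, average_reward(policy))


# All deterministic stationary policies: one action per state.
POLICIES = list(product("ab", repeat=2))

# Brute-force variance minimization over the four policies.
best = min(POLICIES, key=variance)

# Minimizing the pseudo-variance with lam set to the best policy's average
# reward recovers the same policy in this toy model.
lam_star = average_reward(best)
best_pseudo = min(POLICIES, key=lambda d: pseudo_variance(d, lam_star))
```

In this toy model the pseudo-variance of any policy equals its variance plus (eta - lam)^2, so it never falls below the variance and coincides with it when lam equals the policy's own average reward. That is why the two minimizations above agree here.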
