Institutional Repository
Technical University of Crete

Reinforcement learning in multidimensional continuous action spaces

Pazis, J., Lagoudakis Michael

URI	http://purl.tuc.gr/dl/dias/A1A490B2-7764-4BA6-B970-0337EEE33DC8	-
Identifier	https://doi.org/10.1109/ADPRL.2011.5967381	-
Language	en	-
Extent	8 pages	en
Title	Reinforcement learning in multidimensional continuous action spaces	en
Creator	Pazis, J.	en
Creator	Lagoudakis Michael	en
Creator	Λαγουδακης Μιχαηλ	el
Content Summary	The majority of learning algorithms available today focus on approximating the state (V) or state-action (Q) value function and efficient action selection comes as an afterthought. On the other hand, real-world problems tend to have large action spaces, where evaluating every possible action becomes impractical. This mismatch presents a major obstacle in successfully applying reinforcement learning to real-world problems. In this paper we present an effective approach to learning and acting in domains with multidimensional and/or continuous control variables where efficient action selection is embedded in the learning process. Instead of learning and representing the state or state-action value function of the MDP, we learn a value function over an implied augmented MDP, where states represent collections of actions in the original MDP and transitions represent choices eliminating parts of the action space at each step. Action selection in the original MDP is reduced to a binary search by the agent in the transformed MDP, with computational complexity logarithmic in the number of actions, or equivalently linear in the number of action dimensions. Our method can be combined with any discrete-action reinforcement learning algorithm for learning multidimensional continuous-action policies using a state value approximator in the transformed MDP. Our preliminary results with two well-known reinforcement learning algorithms (Least-Squares Policy Iteration and Fitted Q-Iteration) on two continuous action domains (1-dimensional inverted pendulum regulator, 2-dimensional bicycle balancing) demonstrate the viability and the potential of the proposed approach.	en
Type of Item	Conference Full Paper	en
License	http://creativecommons.org/licenses/by/4.0/	en
Date of Item	2015-11-13	-
Date of Publication	2011	-
Subject	Reinforcement learning	en
Bibliographic Citation	J. Pazis and M. G. Lagoudakis, “Reinforcement learning in multidimensional continuous action spaces,” in 2011 IEEE Intern. Symp. Adapt. Dynam. Program. Reinf. Learn. (ADPRL), pp. 97-104. doi:10.1109/ADPRL.2011.5967381	en
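
The Content Summary describes action selection as a binary search whose cost is logarithmic in the number of actions and linear in the number of action dimensions. The sketch below illustrates that idea only; it is not the authors' implementation. Several things are assumptions made for illustration: the names `select_action` and `toy_value`, the representation of a "collection of actions" as an axis-aligned box, the round-robin order in which dimensions are halved, returning the centre of the final box, and above all `toy_value`, a hand-written stand-in for the value function that the paper learns with an algorithm such as Least-Squares Policy Iteration or Fitted Q-Iteration.

```python
from typing import Callable, List, Sequence, Tuple

Interval = Tuple[float, float]
Box = List[Interval]  # one (low, high) interval per action dimension


def select_action(
    state: object,
    bounds: Sequence[Interval],
    value: Callable[[object, Box], float],
    depth: int,
) -> List[float]:
    """Binary search over a box of actions (illustrative sketch).

    Each decision splits the current box in half along one dimension and
    keeps the half that `value` scores higher. With `depth` halvings per
    dimension this takes depth * len(bounds) decisions, while the number
    of distinguishable actions is 2 ** (depth * len(bounds)).
    """
    box: Box = list(bounds)
    for _ in range(depth):
        for d in range(len(box)):  # assumed order: round-robin over dimensions
            low, high = box[d]
            mid = 0.5 * (low + high)
            lower = box[:d] + [(low, mid)] + box[d + 1:]
            upper = box[:d] + [(mid, high)] + box[d + 1:]
            box = upper if value(state, upper) > value(state, lower) else lower
    # Assumption: act with the centre of the remaining box.
    return [0.5 * (low + high) for low, high in box]


def toy_value(state: object, box: Box) -> float:
    """Hypothetical stand-in for a learned value function.

    Here `state` is simply the action we pretend is optimal; a box scores
    higher the closer its centre lies to that action.
    """
    target = state
    centre = [0.5 * (low + high) for low, high in box]
    return -sum((c - t) ** 2 for c, t in zip(centre, target))


if __name__ == "__main__":
    best = (0.3, -0.7)
    action = select_action(best, [(-1.0, 1.0), (-1.0, 1.0)], toy_value, depth=10)
    print(action)
```

With two dimensions and ten halvings per dimension, the search makes 20 binary decisions (40 value evaluations) instead of scoring each of the 2^20 grid actions individually, which is the scaling the summary refers to.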
