Abstract:
Many robot control problems of practical importance, including task or operational space control, can be
reformulated as immediate reward reinforcement learning problems.
However, few of the known optimization or reinforcement
learning algorithms can be used in online learning control
for robots, as they are either prohibitively slow, do not scale
to interesting domains of complex robots, or require trying
out policies generated by random search, which is infeasible
for a physical system. Using a generalization of the EM-based
reinforcement learning framework suggested by Dayan & Hinton,
we reduce the problem of learning with immediate rewards to a
reward-weighted regression problem with an adaptive, integrated
reward transformation for faster convergence. The resulting
algorithm is efficient, learns smoothly without dangerous jumps
in solution space, and works well in applications to complex, high
degree-of-freedom robots.
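
As an illustration of the reduction to reward-weighted regression, the following is a minimal sketch and not the authors' implementation. It assumes a policy whose mean is linear in the state features, and it uses a fixed exponential reward transformation as a stand-in for the adaptive transformation mentioned in the abstract. The function name and the parameters `tau` and `ridge` are hypothetical.

```python
import numpy as np


def reward_weighted_regression(features, actions, rewards, tau=1.0, ridge=1e-8):
    """One weighted least-squares update of a linear policy mean.

    Fits theta so that actions ~= features @ theta, with each sample
    weighted by a transformed reward (higher reward, larger weight).
    """
    features = np.asarray(features, dtype=float)
    actions = np.asarray(actions, dtype=float)
    rewards = np.asarray(rewards, dtype=float)

    # Assumed transformation: exponential in the reward with a fixed tau.
    # Subtracting the maximum only rescales all weights by a constant,
    # which keeps the exponent from overflowing.
    weights = np.exp(tau * (rewards - rewards.max()))

    # Weighted normal equations: (X^T W X + ridge * I) theta = X^T W a
    weighted = features * weights[:, None]
    gram = features.T @ weighted + ridge * np.eye(features.shape[1])
    return np.linalg.solve(gram, weighted.T @ actions)
```

Because the update is a regression toward the better-rewarded actions that were actually executed, the new parameters stay within the range of behaviour already tried, which is one way to read the abstract's claim of learning without dangerous jumps. In this sketch, a larger `tau` concentrates the weight on the best samples, while `tau = 0` gives ordinary least squares.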