# Item

Released

Conference Paper

#### Learning Operational Space Control

##### MPS-Authors

There are no MPG-Authors in the publication available

##### External Resource

http://www.roboticsproceedings.org/rss02/p33.pdf

(Publisher version)

##### Citation

Peters, J., & Schaal, S. (2007). Learning Operational Space Control. In G. Sukhatme,
S. Schaal, W. Burgard, & D. Fox (Eds.), *Robotics: Science
and Systems II* (pp. 255-262). Cambridge, MA, USA: MIT Press.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-CE23-3

##### Abstract

While operational space control is of essential importance for robotics and well understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting learning problem is ill-defined, as it requires learning an inverse mapping of a usually redundant system, which is well known to suffer from non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. A first important insight for this paper is that, nevertheless, a physically correct solution to the inverse problem does exist when learning of the inverse map is performed in a suitable piecewise linear way. The second crucial component of our work is based on a recent insight that many operational space controllers can be understood in terms of a constrained optimal control problem. The cost function associated with this optimal control problem allows us to formulate a learning algorithm that automatically synthesizes a globally consistent desired resolution of redundancy while learning the operational space controller. From the viewpoint of machine learning, the learning problem corresponds to a reinforcement learning problem that maximizes an immediate reward and that employs an expectation-maximization policy search algorithm. Evaluations on a three-degree-of-freedom robot arm illustrate the feasibility of the suggested approach.
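
The non-convexity mentioned in the abstract, and why a locally linear treatment avoids it, can be illustrated with a small toy example. The sketch below is not the paper's algorithm or its three-degree-of-freedom arm; it assumes a hypothetical planar two-link arm whose only task variable is the end-effector x-coordinate (so the arm is redundant), works at the velocity level, and uses made-up link lengths and configurations. It shows that averaging two valid joint velocities at the same configuration still achieves the task velocity, whereas averaging valid solutions taken from two different configurations does not.

```python
import numpy as np

LINK_1, LINK_2 = 1.0, 1.0  # hypothetical link lengths (not from the paper)


def jacobian_x(q):
    """1x2 Jacobian of the end-effector x-coordinate of a planar two-link arm."""
    q1, q2 = q
    return np.array([[-LINK_1 * np.sin(q1) - LINK_2 * np.sin(q1 + q2),
                      -LINK_2 * np.sin(q1 + q2)]])


def min_norm_velocity(q, xdot):
    """Minimum-norm joint velocity that produces task velocity xdot at q."""
    return np.linalg.pinv(jacobian_x(q)) @ np.array([xdot])


def task_error(q, qdot, xdot):
    """Absolute mismatch between the achieved and desired task velocity at q."""
    return abs((jacobian_x(q) @ qdot)[0] - xdot)


xdot = 0.5                      # desired task velocity (assumed value)
q_a = np.array([0.3, 0.8])      # first configuration (assumed value)
q_b = np.array([-1.0, -0.5])    # second, distant configuration (assumed value)

# Two different valid solutions at the SAME configuration q_a:
# the second adds a null-space motion that does not change the task velocity.
row = jacobian_x(q_a)[0]
null_dir = np.array([row[1], -row[0]])   # orthogonal to the Jacobian row
sol_a1 = min_norm_velocity(q_a, xdot)
sol_a2 = sol_a1 + 2.0 * null_dir

# A valid solution at a DIFFERENT configuration q_b.
sol_b = min_norm_velocity(q_b, xdot)

# Local averaging (same configuration) stays valid.
err_local = task_error(q_a, 0.5 * (sol_a1 + sol_a2), xdot)

# Global averaging (mixing configurations) is no longer valid at q_a.
err_global = task_error(q_a, 0.5 * (sol_a1 + sol_b), xdot)

print("local average error: ", err_local)
print("global average error:", err_global)
```

A regression method that fits one global model effectively averages over such conflicting samples, which is why restricting each linear model to a small neighborhood of the configuration space matters in this toy setting.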
