
Released

Conference Paper

Active learning using mean shift optimization for robot grasping

MPG Authors
/persons/resource/persons84027

Kroemer, O
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

/persons/resource/persons84135

Peters, J
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Full texts (restricted access)
No full texts are currently released for your IP range.
Full texts (freely accessible)
No freely accessible full texts are available in PuRe
Supplementary material (freely accessible)
No freely accessible supplementary material is available
Citation

Kroemer, O., Detry, R., Piater, J., & Peters, J. (2009). Active learning using mean shift optimization for robot grasping. In 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2009) (pp. 2610-2615). Piscataway, NJ, USA: IEEE Service Center.


Citation link: https://hdl.handle.net/11858/00-001M-0000-0013-C278-B
Abstract
When children learn to grasp a new object, they often know several possible grasping points from observing a parent's demonstration and subsequently learn better grasps by trial and error. From a machine learning point of view, this process is an active learning approach. In this paper, we present a new robot learning framework for reproducing this ability in robot grasping. For doing so, we chose a straightforward approach: first, the robot observes a few good grasps by demonstration and learns a value function for these grasps using Gaussian process regression. Subsequently, it chooses grasps which are optimal with respect to this value function using a mean-shift optimization approach, and tries them out on the real system. Upon every completed trial, the value function is updated, and in the following trials it is more likely to choose even better grasping points. This method exhibits fast learning due to the data-efficiency of the Gaussian process regression framework and the fact that the mean-shift method provides maxima of this cost function. Experiments were repeatedly carried out successfully on a real robot system. After less than sixty trials, our system has adapted its grasping policy to consistently exhibit successful grasps.
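The loop sketched in the abstract (seed the value function with demonstrated grasps, fit a Gaussian process, search its surface with mean shift, execute the best candidate, update) can be illustrated with a small Python sketch. This is not the authors' implementation: the grasp dimensionality, kernel bandwidth, the synthetic execute_grasp reward, and the simplified mean-shift step over previously tried grasps are all assumptions made for illustration.

```python
# Illustrative sketch of the active-learning loop described in the abstract.
# Assumed placeholders: DIM, BANDWIDTH, execute_grasp (synthetic reward).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
DIM = 3            # assumed dimensionality of a grasp parameter vector
BANDWIDTH = 0.3    # assumed mean-shift kernel bandwidth


def execute_grasp(grasp):
    """Stand-in for a real robot trial: returns a reward in [0, 1].
    Synthetic function with its maximum at 0.5 in every dimension."""
    return float(np.exp(-np.sum((grasp - 0.5) ** 2) / 0.05))


def mean_shift_maximum(start, samples, values, bandwidth=BANDWIDTH, iters=50):
    """Shift a point toward the value-weighted mean of the tried grasps:
    a simple mean-shift style search for a local maximum of the value surface."""
    x = start.copy()
    for _ in range(iters):
        w = values * np.exp(-np.sum((samples - x) ** 2, axis=1) / (2 * bandwidth ** 2))
        if w.sum() < 1e-12:
            break
        x_new = (w[:, None] * samples).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < 1e-6:
            break
        x = x_new
    return x


# 1) A few demonstrated grasps seed the value function (imitation phase).
grasps = rng.uniform(0.2, 0.8, size=(5, DIM))
rewards = np.array([execute_grasp(g) for g in grasps])

for trial in range(30):
    # 2) Fit the GP value function on all grasps tried so far.
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-3)
    gp.fit(grasps, rewards)

    # 3) Run mean shift from several random restarts and keep the candidate
    #    with the highest predicted value under the GP.
    starts = rng.uniform(0.0, 1.0, size=(10, DIM))
    candidates = np.array([mean_shift_maximum(s, grasps, rewards) for s in starts])
    best = candidates[np.argmax(gp.predict(candidates))]

    # 4) Execute the grasp on the (simulated) system and update the data set.
    reward = execute_grasp(best)
    grasps = np.vstack([grasps, best])
    rewards = np.append(rewards, reward)

print(f"best observed reward after {len(rewards)} trials: {rewards.max():.3f}")
```

The sketch keeps the structure of the abstract's method: data efficiency comes from the GP fit on few trials, and the mean-shift search supplies maxima of the learned value surface to try next.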