Data-efficient reinforcement learning for variable impedance control

Abu-Dakka, Fares J.

dc.rights.license	Attribution-NonCommercial-NoDerivatives 4.0 International	*
dc.contributor.author	Abu-Dakka, Fares J.
dc.contributor.other	Anand, Akhil S.
dc.contributor.other	Kaushik, Rituraj
dc.contributor.other	Gravdahl, Jan Tommy
dc.date.accessioned	2024-04-18T12:46:35Z
dc.date.available	2024-04-18T12:46:35Z
dc.date.issued	2024
dc.identifier.issn	2169-3536	en
dc.identifier.other	https://katalogoa.mondragon.edu/janium-bin/janium_login_opac.pl?find&ficha_no=174317	en
dc.identifier.uri	https://hdl.handle.net/20.500.11984/6359
dc.description.abstract	One of the most crucial steps toward achieving human-like manipulation skills in robots is to incorporate compliance into the robot controller. Compliance not only makes the robot’s behaviour safe but also makes it more energy efficient. In this direction, the variable impedance control (VIC) approach provides a framework for a robot to adapt its compliance during execution by employing an adaptive impedance law. Nevertheless, autonomously adapting the compliance profile as demanded by the task remains a challenging problem in practice. In this work, we introduce a reinforcement learning (RL)-based approach called DEVILC (Data-Efficient Variable Impedance Learning Controller) to learn the variable impedance controller through real-world interaction of the robot. More concretely, we use a model-based RL approach in which, after every interaction, the robot iteratively learns a probabilistic model of its dynamics using Gaussian process regression. The model is then used to optimize a neural-network policy that modulates the robot’s impedance such that the long-term reward for the task is maximized. Thanks to the model-based RL framework, DEVILC allows a robot to learn the VIC policy with only a few interactions, making it practical for real-world applications. In simulations and experiments, we evaluate DEVILC on a Franka Emika Panda robotic manipulator for different manipulation tasks in the Cartesian space. The results show that DEVILC is a promising direction toward autonomously learning compliant manipulation skills directly in the real world through interactions. A video of the experiments is available at https://youtu.be/_uyr0Vye5no	en
dc.language.iso	eng	en
dc.publisher	IEEE	en
dc.rights	© 2024 The Authors	en
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/4.0/	*
dc.subject	Model-based reinforcement learning	en
dc.subject	variable impedance learning control	en
dc.subject	Gaussian processes	en
dc.subject	covariance matrix adaptation	en
dc.title	Data-efficient reinforcement learning for variable impedance control	en
dcterms.accessRights	http://purl.org/coar/access_right/c_abf2	en
dcterms.source	IEEE Access	en
local.contributor.group	Robótica y automatización	es
local.description.peerreviewed	true	en
local.identifier.doi	https://doi.org/10.1109/ACCESS.2024.3355311	en
local.contributor.otherinstitution	https://ror.org/020hwjq30	en
local.contributor.otherinstitution	https://ror.org/05xg72x27	en
local.source.details	Vol 12
oaire.format.mimetype	application/pdf	en
oaire.file	$DSPACE\assetstore	en
oaire.resourceType	http://purl.org/coar/resource_type/c_6501	en
oaire.version	http://purl.org/coar/version/c_970fb48d4fbd8a85	en
oaire.funderName	The Research Council of Norway	en
oaire.funderName	Gobierno Vasco	en
oaire.funderName	Gobierno Vasco	en
oaire.funderIdentifier	https://ror.org/00epmv149
oaire.funderIdentifier	https://ror.org/00pz2fp31 http://data.crossref.org/fundingdata/funder/10.13039/501100003086
oaire.funderIdentifier	https://ror.org/00pz2fp31 http://data.crossref.org/fundingdata/funder/10.13039/501100003086
oaire.fundingStream	IKTPLUS-ICT and digital innovation	en
oaire.fundingStream	Elkartek 2022	en
oaire.fundingStream	Elkartek 2023	en
oaire.awardNumber	270941	en
oaire.awardNumber	KK-2022-00024	en
oaire.awardNumber	KK-2023-00055	en
oaire.awardTitle	Dynamic Robot Interaction and Motion Compensation	en
oaire.awardTitle	Producción Fluída y Resiliente para la Industria inteligente (PROFLOW)	en
oaire.awardTitle	Tecnologías de Inteligencia Artificial para la percepción visual y háptica y la planificación y control de tareas de manipulación (HELDU)	en
oaire.awardURI	No information	en
oaire.awardURI	No information	en
oaire.awardURI	No information	en