Goal-Conditioned Reinforcement Learning within a Human-Robot Disassembly Environment

Arana-Arexolaleiba, Nestor

dc.rights.license	Attribution 4.0 International	*
dc.contributor.author	Arana-Arexolaleiba, Nestor
dc.contributor.other	Elguea, Íñigo
dc.contributor.other	Serrano Muñoz, Antonio
dc.contributor.other	Chrysostomou, Dimitrios
dc.contributor.other	Inziarte Hidalgo, Ibai
dc.contributor.other	Bogh, Simon
dc.date.accessioned	2022-11-29T10:51:11Z
dc.date.available	2022-11-29T10:51:11Z
dc.date.issued	2022
dc.identifier.issn	2076-3417	en
dc.identifier.other	https://katalogoa.mondragon.edu/janium-bin/janium_login_opac.pl?find&ficha_no=170351	en
dc.identifier.uri	https://hdl.handle.net/20.500.11984/5895
dc.description.abstract	The introduction of collaborative robots in industrial environments reinforces the need to provide these robots with better cognition to accomplish their tasks while fostering worker safety without entering into safety shutdowns that reduce workflow and production times. This paper presents a novel strategy that combines the execution of contact-rich tasks, namely disassembly, with real-time collision avoidance through machine learning for safe human-robot interaction. Specifically, a goal-conditioned reinforcement learning approach is proposed, in which the removal direction of a peg, of varying friction, tolerance, and orientation, is subject to the location of a human collaborator with respect to a 7-degree-of-freedom manipulator at each time step. For this purpose, the suitability of three state-of-the-art actor-critic algorithms is evaluated, and results from simulation and real-world experiments are presented. In reality, the policy’s deployment is achieved through a new scalable multi-control framework that allows a direct transfer of the control policy to the robot and reduces response times. The results show the effectiveness, generalization, and transferability of the proposed approach with two collaborative robots against static and dynamic obstacles, leveraging the set of available solutions in non-monotonic tasks to avoid a potential collision with the human worker.	en
dc.description.sponsorship	Comisión Europea	es
dc.description.sponsorship	Gobierno Vasco-Eusko Jaurlaritza	es
dc.language.iso	eng	en
dc.publisher	MDPI	en
dc.rights	© 2022 The Authors	en
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/	*
dc.subject	collaborative robots	en
dc.subject	machine learning	en
dc.subject	reinforcement learning	en
dc.subject	contact-rich tasks	en
dc.subject	disassembly	en
dc.subject	collision avoidance	en
dc.title	Goal-Conditioned Reinforcement Learning within a Human-Robot Disassembly Environment	en
dcterms.accessRights	http://purl.org/coar/access_right/c_abf2	en
dcterms.source	Applied Sciences	en
local.contributor.group	Robótica y automatización	es
local.description.peerreviewed	true	en
local.identifier.doi	https://doi.org/10.3390/app122211610	en
local.relation.projectID	info:eu-repo/grantAgreement/EC/H2020-ECSEL/no 876852/EU/Verification and Validation of Automated Systems’ Safety and Security/VALU3S	en
local.relation.projectID	info:eu-repo/grantAgreement/EC/H2020/857061/EU/Networking for research and development of human interactive and sensitive robotics taking advantage of additive manufacturing/R2P2	en
local.relation.projectID	Bikaintek 2020	en
local.rights.publicationfee	APC	en
local.rights.publicationfeeamount	2090.93€	en
local.contributor.otherinstitution	https://ror.org/00wvqgd19
local.contributor.otherinstitution	https://ror.org/04m5j1k67	en
local.contributor.otherinstitution	Electrotecnica Alavesa S.L.	es
local.source.details	Vol. 12. No. 22. Article 11610. November, 2022	en
oaire.format.mimetype	application/pdf
oaire.file	$DSPACE\assetstore
oaire.resourceType	http://purl.org/coar/resource_type/c_6501	en
oaire.version	http://purl.org/coar/version/c_970fb48d4fbd8a85	en