dc.rights.license | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.contributor.author | Elguea Aguinaco, Iñigo | |
dc.contributor.author | Aguirre, Aitor | |
dc.contributor.author | Izagirre, Unai | |
dc.contributor.author | Inziarte Hidalgo, Ibai | |
dc.contributor.author | Bogh, Simon | |
dc.contributor.author | Arana-Arexolaleiba, Nestor | |
dc.date.accessioned | 2025-02-13T09:36:03Z | |
dc.date.available | 2025-02-13T09:36:03Z | |
dc.date.issued | 2025 | |
dc.identifier.issn | 1872-7409 | en |
dc.identifier.other | https://katalogoa.mondragon.edu/janium-bin/janium_login_opac.pl?find&ficha_no=179940 | en |
dc.identifier.uri | https://hdl.handle.net/20.500.11984/6898 | |
dc.description.abstract | Learning to perform procedural motion or manipulation tasks in unstructured or uncertain environments poses significant challenges for intelligent agents. Although reinforcement learning algorithms have demonstrated positive results on simple tasks, the hard-to-engineer reward functions and the impractical amount of trial-and-error iterations these agents require in long-experience streams still present challenges for deployment in industrially relevant environments. In this regard, interactive reinforcement learning has emerged as a promising approach to mitigate these limitations, whereby a human supervisor provides evaluative or corrective feedback to the learning agent during training. However, the requirement of a human-in-the-loop approach throughout the learning process can be impractical for tasks that span several hours. This study aims to overcome this limitation by automating the learning process and substituting human feedback with an artificial supervisor grounded in constraint-based modeling techniques. In contrast to the logical constraints commonly used for conventional reinforcement learning, constraint-based modeling techniques offer enhanced adaptability in terms of conceptualizing and modeling the human knowledge of a task. This modeling capability allows an automated supervisor to acquire a closer approximation to human reasoning by dividing complex tasks into more manageable components and identifying the associated subtask and contextual cues in which the agent is involved. The supervisor then adjusts the evaluative and corrective feedback to suit the specific subtask under consideration. The framework was assessed using three actor-critic agents in a human–robot interaction environment, demonstrating a sample efficiency improvement of 50% and success rates of | en |
dc.language.iso | eng | en |
dc.publisher | Elsevier | en |
dc.rights | © 2024 The Authors | en |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Automated supervisor | en |
dc.subject | Contact-rich manipulation | en |
dc.subject | Industrial manipulators | en |
dc.subject | Interactive reinforcement learning | en |
dc.subject | Sample efficiency | en |
dc.subject | Procedural tasks | en |
dc.title | Novel automated interactive reinforcement learning framework with a constraint-based supervisor for procedural tasks | en |
dcterms.accessRights | http://purl.org/coar/access_right/c_abf2 | en |
dcterms.source | Knowledge-Based Systems | en |
local.contributor.group | Análisis de datos y ciberseguridad | es |
local.contributor.group | Robótica y automatización | es |
local.description.peerreviewed | true | en |
local.identifier.doi | https://doi.org/10.1016/j.knosys.2024.112870 | en |
local.rights.publicationfee | APC | en |
local.rights.publicationfeeamount | 3000 | en |
local.contributor.otherinstitution | Electrotecnica Alavesa S.L. | es |
local.contributor.otherinstitution | Montajes Mantenimiento y Automatismos Eléctricos Navarra | es |
local.contributor.otherinstitution | https://ror.org/04m5j1k67 | es |
local.source.details | Vol. 309. N. art. 112870. 30 January, 2025 | en |
oaire.format.mimetype | application/pdf | en |
oaire.file | $DSPACE\assetstore | en |
oaire.resourceType | http://purl.org/coar/resource_type/c_6501 | en |
oaire.version | http://purl.org/coar/version/c_970fb48d4fbd8a85 | en |
dc.unesco.tesauro | http://vocabularies.unesco.org/thesaurus/concept3401 | en |
oaire.funderName | Gobierno Vasco | en |
oaire.funderIdentifier | https://ror.org/00pz2fp31 / http://data.crossref.org/fundingdata/funder/10.13039/501100003086 | en |
oaire.fundingStream | Ikertalde Convocatoria 2022-2025 | en |
oaire.awardNumber | IT1676-22 | en |
oaire.awardTitle | Grupo de sistemas inteligentes para sistemas industriales (IKERTALDE 2022-2025) | en |
oaire.awardURI | No information | en |
dc.unesco.clasificacion | http://skos.um.es/unesco6/331101 | en |