Simple record

dc.rights.license: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.contributor.author: Elguea Aguinaco, Iñigo
dc.contributor.author: Aguirre, Aitor
dc.contributor.author: Izagirre, Unai
dc.contributor.author: Inziarte Hidalgo, Ibai
dc.contributor.author: Bogh, Simon
dc.contributor.author: Arana-Arexolaleiba, Nestor
dc.date.accessioned: 2025-02-13T09:36:03Z
dc.date.available: 2025-02-13T09:36:03Z
dc.date.issued: 2025
dc.identifier.issn: 1872-7409 [en]
dc.identifier.other: https://katalogoa.mondragon.edu/janium-bin/janium_login_opac.pl?find&ficha_no=179940 [en]
dc.identifier.uri: https://hdl.handle.net/20.500.11984/6898
dc.description.abstract: Learning to perform procedural motion or manipulation tasks in unstructured or uncertain environments poses significant challenges for intelligent agents. Although reinforcement learning algorithms have demonstrated positive results on simple tasks, the hard-to-engineer reward functions and the impractical amount of trial-and-error iterations these agents require in long-experience streams still present challenges for deployment in industrially relevant environments. In this regard, interactive reinforcement learning has emerged as a promising approach to mitigate these limitations, whereby a human supervisor provides evaluative or corrective feedback to the learning agent during training. However, the requirement of a human-in-the-loop approach throughout the learning process can be impractical for tasks that span several hours. This study aims to overcome this limitation by automating the learning process and substituting human feedback with an artificial supervisor grounded in constraint-based modeling techniques. In contrast to the logical constraints commonly used for conventional reinforcement learning, constraint-based modeling techniques offer enhanced adaptability in terms of conceptualizing and modeling the human knowledge of a task. This modeling capability allows an automated supervisor to acquire a closer approximation to human reasoning by dividing complex tasks into more manageable components and identifying the associated subtask and contextual cues in which the agent is involved. The supervisor then adjusts the evaluative and corrective feedback to suit the specific subtask under consideration. The framework was assessed using three actor-critic agents in a human–robot interaction environment, demonstrating a sample efficiency improvement of 50% and success rates of [es]
dc.language.iso: eng [en]
dc.publisher: Elsevier [en]
dc.rights: © 2024 The Authors [en]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Automated supervisor [en]
dc.subject: Contact-rich manipulation [en]
dc.subject: Industrial manipulators [en]
dc.subject: Interactive reinforcement learning [en]
dc.subject: Sample efficiency [en]
dc.subject: Procedural tasks [en]
dc.title: Novel automated interactive reinforcement learning framework with a constraint-based supervisor for procedural tasks [en]
dcterms.accessRights: http://purl.org/coar/access_right/c_abf2 [en]
dcterms.source: Knowledge-Based Systems [en]
local.contributor.group: Análisis de datos y ciberseguridad [es]
local.contributor.group: Robótica y automatización [es]
local.description.peerreviewed: true [en]
local.identifier.doi: https://doi.org/10.1016/j.knosys.2024.112870 [en]
local.rights.publicationfee: APC [en]
local.rights.publicationfeeamount: 3000 [en]
local.contributor.otherinstitution: Electrotecnica Alavesa S.L. [es]
local.contributor.otherinstitution: Montajes Mantenimiento y Automatismos Eléctricos Navarra [es]
local.contributor.otherinstitution: https://ror.org/04m5j1k67 [es]
local.source.details: Vol. 309. N. art. 112870. 30 January, 2025 [en]
oaire.format.mimetype: application/pdf [en]
oaire.file: $DSPACE\assetstore [en]
oaire.resourceType: http://purl.org/coar/resource_type/c_6501 [en]
oaire.version: http://purl.org/coar/version/c_970fb48d4fbd8a85 [en]
dc.unesco.tesauro: http://vocabularies.unesco.org/thesaurus/concept3401 [en]
oaire.funderName: Gobierno Vasco [en]
oaire.funderIdentifier: https://ror.org/00pz2fp31 / http://data.crossref.org/fundingdata/funder/10.13039/501100003086 [en]
oaire.fundingStream: Ikertalde Convocatoria 2022-2025 [en]
oaire.awardNumber: IT1676-22 [en]
oaire.awardTitle: Grupo de sistemas inteligentes para sistemas industriales (IKERTALDE 2022-2025) [en]
oaire.awardURI: Sin información [en]
dc.unesco.clasificacion: http://skos.um.es/unesco6/331101 [en]


Files in this item


This item appears in the following collection(s)


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International