Simple record

dc.rights.license: Attribution 4.0 International
dc.contributor.author: Isasa Reinoso, Imanol
dc.contributor.author: Alberdi Aramendi, Ane
dc.contributor.other: Hernandez, Mikel
dc.contributor.other: Epelde, Gorka
dc.contributor.other: Londoño, Francisco
dc.contributor.other: Beristain, Andoni
dc.contributor.other: Larrea Lizartza, Xabat
dc.contributor.other: Bamidis, Panagiotis
dc.contributor.other: Konstantinidis, Evdokimos
dc.date.accessioned: 2024-02-02T08:53:03Z
dc.date.available: 2024-02-02T08:53:03Z
dc.date.issued: 2024
dc.identifier.issn: 1472-6947
dc.identifier.other: https://katalogoa.mondragon.edu/janium-bin/janium_login_opac.pl?find&ficha_no=175904
dc.identifier.uri: https://hdl.handle.net/20.500.11984/6198
dc.description.abstract: Background: Synthetic data is an emerging approach for addressing legal and regulatory concerns in biomedical research that deals with personal and clinical data, whether as a single tool or through its combination with other privacy enhancing technologies. Generating uncompromised synthetic data could significantly benefit external researchers performing secondary analyses by providing unlimited access to information while fulfilling pertinent regulations. However, the original data to be synthesized (e.g., data acquired in Living Labs) may consist of subjects’ metadata (static) and a longitudinal component (set of time-dependent measurements), making it challenging to produce coherent synthetic counterparts. Methods: Three synthetic time series generation approaches were defined and compared in this work: only generating the metadata and coupling it with the real time series from the original data (A1), generating both metadata and time series separately to join them afterwards (A2), and jointly generating both metadata and time series (A3). The comparative assessment of the three approaches was carried out using two different synthetic data generation models: the Wasserstein GAN with Gradient Penalty (WGAN-GP) and the DöppelGANger (DGAN). The experiments were performed with three different healthcare-related longitudinal datasets: Treadmill Maximal Effort Test (TMET) measurements from the University of Malaga (1), a hypotension subset derived from the MIMIC-III v1.4 database (2), and a lifelogging dataset named PMData (3). Results: Three pivotal dimensions were assessed on the generated synthetic data: resemblance to the original data (1), utility (2), and privacy level (3). The optimal approach fluctuates based on the assessed dimension and metric. Conclusion: The initial characteristics of the datasets to be synthesized play a crucial role in determining the best approach. Coupling synthetic metadata with real time series (A1), as well as jointly generating synthetic time series and metadata (A3), are both competitive methods, while separately generating time series and metadata (A2) appears to perform more poorly overall. [en]
dc.language.iso: eng
dc.publisher: Springer Nature
dc.rights: © 2024 The Authors
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/
dc.subject: Time series [en]
dc.subject: Synthetic data [en]
dc.subject: Privacy-preserving data sharing [en]
dc.subject: Health data [en]
dc.title: Comparative assessment of synthetic time series generation approaches in healthcare: leveraging patient metadata for accurate data synthesis
dcterms.accessRights: http://purl.org/coar/access_right/c_abf2
dcterms.source: BMC Medical Informatics and Decision Making
local.contributor.group: Análisis de datos y ciberseguridad
local.description.peerreviewed: true
local.identifier.doi: https://doi.org/10.1186/s12911-024-02427-0
local.contributor.otherinstitution: https://ror.org/0023sah13
local.contributor.otherinstitution: https://ror.org/000xsnr85
local.contributor.otherinstitution: https://ror.org/01a2wsa50
local.contributor.otherinstitution: https://ror.org/02j61yw88
local.contributor.otherinstitution: https://ror.org/05vpgt980
local.source.details: Vol. 24. N. 1. N. art. 27, 2024
oaire.format.mimetype: application/pdf
oaire.file: $DSPACE\assetstore
oaire.resourceType: http://purl.org/coar/resource_type/c_6501
oaire.version: http://purl.org/coar/version/c_970fb48d4fbd8a85
oaire.funderName: Comisión Europea
oaire.funderIdentifier: https://ror.org/00pz2fp31 http://data.crossref.org/fundingdata/funder/10.13039/501100003086
oaire.fundingStream: H2020
oaire.awardNumber: 101007990
oaire.awardTitle: VIrtual healTh And weLlbeing Living Lab InfraStructurE (VITALISE)
oaire.awardURI: No information




Except where otherwise noted, this item's license is described as Attribution 4.0 International