
Title
On the imputation of missing data for road traffic forecasting: new insights and novel techniques
Author
Department
Business Data Analytics
Other institutions
https://ror.org/02fv8hj62
https://ror.org/000xsnr85
https://ror.org/03b21sh32
Version
Preprint
Document type
Article
Language
English
Rights
© 2018 The authors, published by Elsevier Ltd.
Access
Open access
Publisher's version
https://doi.org/10.1016/j.trc.2018.02.021
Published in
Transportation Research. Part C, Issue 90 (2018)
First page
18
Last page
33
Publisher
Elsevier
Keywords
Traffic forecasting
Missing data
Cluster analysis
Data imputation
Subject (UNESCO Thesaurus)
Urban traffic
Abstract
Vehicle flow forecasting is of crucial importance for the management of road traffic in complex
urban networks, as well as a useful input for route planning algorithms. In general, traffic predictive
models rely on data gathered by different types of sensors placed on roads, which occasionally
produce faulty readings due to several causes, such as malfunctioning hardware or
transmission errors. Filling in those gaps is relevant for constructing accurate forecasting models,
a task which is engaged by diverse strategies, from a simple null value imputation to complex
spatio-temporal context imputation models. This work elaborates on two machine learning approaches
to update missing data with no gap length restrictions: a spatial context sensing model
based on the information provided by surrounding sensors, and an automated clustering analysis
tool that seeks optimal pattern clusters in order to impute values. Their performance is assessed
and compared to other common techniques and different missing data generation models over
real data captured from the city of Madrid (Spain). The newly presented methods are found to be
fairly superior when portions of missing data are large or very abundant, as occurs in most
practical cases.