Title
Practical approaches towards IoT dataset generation for security experiments
Author
Other institutions
Ikerlan
Version
Preprint
Rights
© 2025 Elsevier Inc
Access
Embargoed access
Publisher's version
https://doi.org/10.1016/B978-0-44-329032-9.00017-8
Published in
Advanced Machine Learning for Cyber-Attack Detection in IoT Networks, Chapter 12
Publisher
Elsevier
Keywords
Botnet
Emulation
Internet of Things
Machine learning
Network security
testbed
SDG 4 Quality education
SDG 9 Industry, innovation and infrastructure
Subject (UNESCO Thesaurus)
Data protection
Abstract
The cybersecurity field has been steadily adopting rapid advances in artificial intelligence (AI) and machine learning (ML) techniques for various purposes, such as threat detection and response, with promising results. Obtaining high-quality data for model training is fundamental to creating robust solutions; however, the scarcity of IoT security datasets remains a limiting factor in developing ML-based security systems for IoT scenarios. Broadly, there are two methods for generating datasets: using physical IoT hardware on operational networks and employing virtualization-based systems. The former provides accurate and representative data but can be costly, time-consuming, difficult to adapt, and potentially risky. On the other hand, the latter offers a safer, more flexible, and cost-effective approach for various research purposes, despite not replicating exact hardware conditions. This chapter will delve into the practical process of dataset generation from the point of view of these two approaches. First, regarding the virtualized approach, we will leverage the recently published Gotham testbed, a reproducible, flexible, and extendable security testbed based on emulated nodes that mixes containerization and virtual machine technologies. This testbed can be used to generate various datasets of network traces, including activities from real malware emulated in the platform or real attack activities from the internet interacting with the testbed. Then, based on the VARIoT project, we will explore the platform and methodology to create datasets of IoT traffic under realistic conditions, including both legitimate and malicious traces, using a laboratory set of physical IoT hardware devices.