Datasets
VeReMiAP: A VeReMi-Based Dataset for Predicting the Effect of Attacks in VANETs
This synthetic network dataset extends the VeReMi dataset and was built using the Framework for Misbehavior Detection (F2MD). It incorporates three key elements: Cooperative Awareness Messages (CAMs), a newly added class of attacks known as the "Fake Reporting Attack," and an evaluation of the impact of this attack, which in this case manifests as a road hazard. (2023-08-21)
Abdelmaguid, Mohammed A.; Hassanein, Hossam S.; Zulkernine, Mohammad, 2023, "VeReMiAP: A VeReMi-based Dataset for Predicting the Effect of Attacks in VANETs", https://doi.org/10.5683/SP3/R09EWA, Borealis, V1
Outdoor Temperature Data Collected by Taxis in Rome
The original CRAWDAD Roma/taxi dataset records each taxicab's GPS position. This dataset adds the outdoor temperature of the areas the taxicabs visit while in service. We generate a temperature value for every active cab by sampling from a Gaussian distribution, which requires a mean mu and a standard deviation sigma for every run. For the mean, we assign a ground truth temperature mu to every period in every grid cell for each day, using data from The Weather Network (http://www.theweathernetwork.com/) to match the ground truth to the correct period and grid cell. For the standard deviation, we give every taxicab a fixed error range sigma that remains the same across all of its contributions. To do so, we randomly assign the participant taxicabs to three classes. The first class, called "honest," consists of cabs that usually sense accurate temperatures, within a 10% error range of the ground truth. The honest class contains 145 taxicabs (50% of all participant taxicabs). The second class, called "dishonest," consists of cabs that usually sense inaccurate temperatures, within a 30% error range of the ground truth. The dishonest class contains 72 taxicabs (25%). The third class, called "misleading," consists of the remaining 72 participant taxicabs (25%), which sense either accurate or inaccurate temperatures. For each taxicab in the misleading class, the data generator function decides at random whether to generate an accurate or an inaccurate temperature. This class matters when the data is applied to a system such as a participant reputation system, because the accuracy of its contributions is uneven. As a result, each taxicab has a sensed temperature contribution based on its fixed error range and the ground truth for the day, period, and grid cell of its location.
Mohannad A. Alswailim, Hossam S. Hassanein, Mohammad Zulkernine, 2022, "CRAWDAD queensu/crowd_temperature," IEEE Dataport, doi: https://dx.doi.org/10.15783/C7CG65
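The generation procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' generator: the function names are hypothetical, the 10% and 30% error ranges are interpreted here as a standard deviation of sigma = level * |mu| (an assumption), and the misleading class is modeled as a coin flip between the two levels on each contribution (also an assumption, since the description does not say when the random decision is made).

```python
import random

# Error levels and class sizes come from the dataset description.
# Treating a level as sigma = level * |mu| is an assumption of this sketch.
ACCURATE_LEVEL = 0.10
INACCURATE_LEVEL = 0.30
CLASS_SIZES = {"honest": 145, "dishonest": 72, "misleading": 72}


def assign_classes(taxi_ids, rng):
    """Randomly split taxi IDs into the three classes with the stated sizes."""
    ids = list(taxi_ids)
    rng.shuffle(ids)
    classes = {}
    start = 0
    for name, size in CLASS_SIZES.items():
        for taxi_id in ids[start:start + size]:
            classes[taxi_id] = name
        start += size
    return classes


def sense_temperature(ground_truth, taxi_class, rng):
    """Draw one sensed temperature around the ground truth mu for one taxi."""
    if taxi_class == "honest":
        level = ACCURATE_LEVEL
    elif taxi_class == "dishonest":
        level = INACCURATE_LEVEL
    else:
        # Misleading: pick accurate or inaccurate at random (assumed per contribution).
        level = rng.choice([ACCURATE_LEVEL, INACCURATE_LEVEL])
    sigma = level * abs(ground_truth)
    return rng.gauss(ground_truth, sigma)


rng = random.Random(42)                    # fixed seed for reproducibility
classes = assign_classes(range(289), rng)  # 145 + 72 + 72 = 289 taxicabs
reading = sense_temperature(20.0, classes[0], rng)  # 20.0 is a made-up ground truth
```

A reputation system built on such data would see consistently tight readings from honest cabs, consistently loose readings from dishonest cabs, and a mix of both from misleading cabs.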
Resource Usage of Applications Running on Raspberry Pi Devices
The collection and construction of this dataset were organized by the Queen's Telecommunications Research Lab (TRL) and led by Ruslan Kain, a Ph.D. student at TRL. The dataset includes dynamic resource usage information associated with running edge-native applications on a set of four heterogeneous Raspberry Pi 4 devices. The devices have RAM sizes of 2, 4, and 8 GB and CPU frequencies of 1200, 1500, and 1800 MHz. This variation makes both the devices and the collected data heterogeneous and enables data-based applications in edge computing research. The resource usage measurements have a five-second granularity. We collected more than 550 thousand unique data points representing 768 hours of running applications on the Raspberry Pi devices. The dataset is publicly available on the Borealis platform to help other researchers in the field conduct edge computing resource usage analysis. It is around 444 MB in size and consists of 74 comma-separated values (CSV) files. See the README file for full details on the structure and content of the dataset.
Ruslan Kain; Sara A. Elsayed; Yuanzhu Chen; Hossam S. Hassanein, 2022, "Resource Usage of Applications Running on Raspberry Pi Devices", https://doi.org/10.5683/SP3/GOZAJE, Borealis, V1, UNF:6:FjVtgSYUu2Iy08LQ2ra6fQ== [fileUNF]
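As a quick consistency check on the figures in the description, the stated five-second granularity and 768 hours of recording can be converted into an expected number of samples. The variable names below are illustrative, and the calculation assumes the 768 hours are the total recorded time across all devices with no gaps in sampling.

```python
HOURS_RECORDED = 768     # total running time stated in the description
SAMPLE_INTERVAL_S = 5    # five-second measurement granularity

# One sample every 5 seconds over the whole recording period.
expected_points = HOURS_RECORDED * 3600 // SAMPLE_INTERVAL_S
print(expected_points)   # 552960
```

The result, 552,960, agrees with the "more than 550 thousand" data points reported for the dataset.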
4G LTE User Equipment Measurements Along Kingston Transit 502 Bus Route
The collection and construction of this dataset were part of an exhaustive data collection campaign organized by the Queen's Telecommunications Research Lab (TRL) and led by Habiba Elsherbiny, a former M.Sc. student at TRL. The dataset includes several 4G LTE user equipment (UE) wireless network parameters logged using Android phones on board the bus. The data was collected along the Kingston Transit Express Bus 502 public bus route in Kingston, Ontario, Canada. To the best of our knowledge, this is the first extensive analysis of 4G LTE networks along public transportation in a midsize city like Kingston that reflects the various dynamics of the route. We collected more than 190 thousand unique data points representing 30 trips covering a total of 700 km in over 30 hours. We made the dataset publicly available on the Dataverse platform to help other researchers in the field conduct cellular network analysis.
Habiba Elsherbiny; Ahmad M. Nagib; Hossam S. Hassanein, 2020, "4G LTE User Equipment Measurements Along Kingston Transit 502 Bus Route", https://doi.org/10.5683/SP2/EQWKO1, Borealis, V1, UNF:6:Z3SWRmJ+DStHqfs9AATGPg== [fileUNF]