Antonis Skevis, "Reversible Data Summaries and their Effect on Mining Sensor Data Streams", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2024
https://doi.org/10.26233/heallink.tuc.98646
This thesis investigates the effect of reversible data summary methods in the context of mining sensor data streams. The study explores four well-established methods, namely Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT), Discrete Wavelet Transform (DWT), and Piecewise Aggregate Approximation (PAA), alongside a novel Reversible Random Hyperplane Projection method developed within the scope of this thesis.
As part of this research, we apply the aforementioned compression techniques to two datasets that contain sensor measurements. We employ tumbling windows of varying sizes and apply a range of data mining techniques. These include clustering methods such as K-Means and DBSCAN (Density-Based Spatial Clustering of Applications with Noise), regression algorithms such as Linear and Logistic Regression, and classification approaches such as K-NN (K-Nearest Neighbors), SVM (Support Vector Machines), and a Neural Network. We apply these techniques to both the raw and the compressed datasets. The study aims to assess how well each compression method preserves the accuracy of the results obtained on the original raw data. By directly comparing these methods, we gain insights into how effectively the compressed representations retain the information that is crucial for data mining.
Additionally, the study explores the real-world network effects of the compression techniques by simulating a sensor network using TOSSIM. This simulation evaluates how data compression affects the number of bits needed to transmit a dataset and the resulting power savings. Based on these results, the research contributes to understanding how compression techniques could increase the lifetime of sensor networks, an important factor for the sustainable and efficient deployment of resources, not only in sensor networks but also in broader Internet of Things (IoT) settings.
This detailed analysis has two goals: first, to compare how different data compression methods perform, and second, to provide practical insights for using them in real-world sensor networks.
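To make the idea of a reversible summary over tumbling windows concrete, the following is a minimal sketch using PAA, one of the methods named above. It is an illustration under stated assumptions, not the thesis's implementation: the function names, the sample readings, the window size, and the number of segments are all hypothetical, and "reversible" is taken here to mean that an approximation of the window can be rebuilt from its summary.

```python
from statistics import fmean


def tumbling_windows(stream, size):
    """Split a sequence into consecutive, non-overlapping windows of `size` items.

    A trailing partial window is dropped (an assumption made for this sketch).
    """
    return [stream[i:i + size] for i in range(0, len(stream) - size + 1, size)]


def paa_compress(window, n_segments):
    """Summarise a window by the mean of each of `n_segments` equal-length segments."""
    if n_segments <= 0 or len(window) % n_segments != 0:
        raise ValueError("window length must be a positive multiple of n_segments")
    seg_len = len(window) // n_segments
    return [fmean(window[k * seg_len:(k + 1) * seg_len]) for k in range(n_segments)]


def paa_reconstruct(summary, window_size):
    """Rebuild an approximation of the window by repeating each segment mean."""
    if not summary or window_size % len(summary) != 0:
        raise ValueError("window_size must be a positive multiple of the summary length")
    seg_len = window_size // len(summary)
    return [value for value in summary for _ in range(seg_len)]


def rmse(original, approximation):
    """Root-mean-square error between a window and its reconstruction."""
    squared = [(x - y) ** 2 for x, y in zip(original, approximation)]
    return (sum(squared) / len(squared)) ** 0.5


if __name__ == "__main__":
    # Hypothetical temperature readings, not taken from the thesis datasets.
    readings = [20.0, 20.5, 21.0, 21.5, 22.0, 22.0, 21.0, 20.0]
    window_size, n_segments = 4, 2  # illustrative parameter choices
    for window in tumbling_windows(readings, window_size):
        summary = paa_compress(window, n_segments)
        rebuilt = paa_reconstruct(summary, window_size)
        ratio = len(summary) / len(window)  # fraction of values that would be transmitted
        print(summary, rebuilt, round(rmse(window, rebuilt), 4), ratio)
```

In this sketch each window of four readings is reduced to two values, so half as many values would need to be transmitted, at the cost of the reconstruction error reported by `rmse`. The same compress, reconstruct, and compare pattern would apply to the other summaries, with the transform swapped in place of the segment means.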