Institutional Repository
Technical University of Crete

Dynamic decision trees in a distributed environment

Moumoulidou Zafeiria


URI: http://purl.tuc.gr/dl/dias/2BB9DC34-CF39-4B54-8EE4-1B3F0C172142
Year 2018
Type of Item Diploma Work
Bibliographic Citation Zafeiria Moumoulidou, "Dynamic decision trees in a distributed environment", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2018 https://doi.org/10.26233/heallink.tuc.78611

Summary

Decision trees are among the most popular methods in data mining, since the intuition behind the models they produce is close to the human way of thinking. In particular, we focus on the stream processing model, which is among the most realistic settings, since the volume and production rate of data often make traditional processing methods ineffective. In this thesis we study the state-of-the-art Hoeffding Tree algorithm, designed for building decision tree models over high-speed data streams. One of the most significant challenges in streaming decision trees is that each data instance is processed only once and is not stored in memory. Thus, any decision regarding the growth of the tree must be made based only on a subset of the original data. In parallel, we study the geometric approach for monitoring threshold functions over distributed streams. In this distributed setting, the data needed to compute the value of a function is split among diverse processing sites. The authors of that approach design a monitoring scheme in which the sites do not need to send their data to a central node in order to detect whether the value of a function has crossed a threshold; as a result, they reduce the communication load. Finally, we propose a novel distributed algorithm for mining high-speed data streams, based on the state-of-the-art Hoeffding Tree algorithm and the ideas introduced in the geometric method.
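
To illustrate how a streaming tree can decide to grow a node from only a subset of the data, the following is a minimal Python sketch of the standard Hoeffding bound test used by Hoeffding Tree learners. It is not taken from the thesis: the function names, parameter names, and the example values (gains, delta, instance counts) are assumptions chosen for illustration only.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range (R): range of the split metric (hypothetical name),
                     e.g. 1.0 for information gain with two classes.
    delta:           allowed probability that the bound does not hold.
    n:               number of instances observed at the leaf so far.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Split when the observed gap between the two best attributes
    exceeds the Hoeffding bound for the instances seen so far."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Illustrative values only: the same gain gap becomes decisive
    # once enough instances have been seen at the leaf.
    for n in (100, 1000):
        eps = hoeffding_bound(1.0, 1e-7, n)
        print(n, round(eps, 4), should_split(0.30, 0.15, 1.0, 1e-7, n))
```

With more instances at a leaf the bound shrinks, so a fixed gap between the two best attributes eventually becomes sufficient to split without revisiting earlier data.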
