The work titled "HC-CART: a parallel system implementation of data mining classification and regression tree (CART) algorithm on a multi-FPGA system" by Chrysos Grigorios, Dagritzikos Panagiotis, Papaefstathiou Ioannis, Dollas Apostolos is licensed under Creative Commons Attribution 4.0 International
Bibliographic Citation
G. Chrysos, P. Dagritzikos, I. Papaefstathiou and A. Dollas, "HC-CART: a parallel system implementation of data mining classification and regression tree (CART) algorithm on a multi-FPGA system", ACM Trans. Architect. Code Optim., vol. 9, no. 4, Jan. 2013. doi:10.1145/2400682.2400706
https://doi.org/10.1145/2400682.2400706
Data mining is a new field of computer science with a wide range of applications. Its goal is to extract knowledge from massive datasets in a human-understandable structure, such as a decision tree. In this article we present an innovative, high-performance, system-level architecture for the Classification And Regression Tree (CART) algorithm, one of the most important and widely used algorithms in the data mining area. Our proposed architecture exploits parallelism at the decision variable level, and was fully implemented and evaluated on a modern high-performance reconfigurable platform, the Convey HC-1 server, which features four FPGAs and a multicore processor. Our FPGA-based implementation was integrated with the widely used “rpart” software library of the R project in order to provide the first fully functional reconfigurable system that can handle real-world large databases. The proposed system, named the HC-CART system, achieves a performance speedup of up to two orders of magnitude compared to well-known single-threaded data mining software platforms, such as WEKA and the R platform. It also outperforms, by an order of magnitude, similar hardware systems that implement only parts of the complete application. Finally, we show that the HC-CART system offers higher performance speedup than some other proposed parallel software implementations of decision tree construction algorithms.
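To illustrate what parallelism at the decision variable level can mean, the following is a minimal software sketch, not the authors' hardware design. It assumes a classification task scored with Gini impurity (a common CART criterion) and uses hypothetical function names; each variable's best split is computed independently of the others, which is the property a parallel implementation can exploit. The thread pool here only demonstrates that independence and is not meant to reproduce the reported speedups.

```python
from concurrent.futures import ThreadPoolExecutor


def gini(counts, total):
    """Gini impurity of a node, given a dict of per-class counts."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_split_for_variable(column, labels):
    """Scan one variable in sorted order; return (weighted_gini, threshold)."""
    order = sorted(range(len(column)), key=lambda i: column[i])
    n = len(labels)
    left, right = {}, {}
    for y in labels:
        right[y] = right.get(y, 0) + 1  # all samples start on the right side
    best = (float("inf"), None)
    for pos in range(n - 1):
        i = order[pos]
        y = labels[i]
        left[y] = left.get(y, 0) + 1    # move sample i to the left side
        right[y] -= 1
        nxt = column[order[pos + 1]]
        if column[i] == nxt:
            continue                    # no threshold fits between equal values
        n_left, n_right = pos + 1, n - pos - 1
        score = (n_left * gini(left, n_left) + n_right * gini(right, n_right)) / n
        if score < best[0]:
            best = (score, (column[i] + nxt) / 2)
    return best


def best_split(columns, labels):
    """Evaluate every variable independently, then keep the lowest impurity."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda col: best_split_for_variable(col, labels), columns))
    var = min(range(len(results)), key=lambda j: results[j][0])
    return var, results[var][1], results[var][0]


# Toy data (made up for illustration): two variables, four samples.
columns = [[1, 2, 3, 4], [5, 1, 7, 2]]
labels = [0, 0, 1, 1]
print(best_split(columns, labels))
```

In this toy example the first variable separates the two classes perfectly at a threshold of 2.5, so it is selected over the second variable. The final reduction over per-variable results is the only step that needs all variables' outputs together.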