The work with title Distributed real-time network intrusion detection system on apache spark by Kalosynakis Minas-Diomfeas is licensed under Creative Commons Attribution 4.0 International
Bibliographic Citation
Minas-Diomfeas Kalosynakis, "Distributed real-time network intrusion detection system on apache spark", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2021
https://doi.org/10.26233/heallink.tuc.90442
In recent years, the rapid growth of internet-based services has raised significant information security concerns. Large volumes of network traffic data are generated daily at high speed, while security threats grow increasingly complex. Fast and efficient detection of intrusive activity under these conditions is a challenging task. To address this issue, we propose a distributed intrusion detection system that uses machine learning classifiers to identify malicious network activity in real time. Specifically, we use the Chi-Squared algorithm to select important features, on which we build Decision Tree, Random Forest, and Extreme Gradient Boosting classification models on the Apache Spark Big Data platform. The proposed system supports scalability in all of its layers and provides a user-friendly graphical interface for visualizing network activity. Experimental results on the NSL-KDD dataset demonstrate that the system can perform binary classification with an area under the ROC curve of 97% using the Random Forest machine learning model.
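The feature-selection step named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation, which runs on Apache Spark. It is a plain-Python version of the Pearson chi-squared statistic between one categorical feature and the class label, followed by a top-k ranking. The function names `chi_squared_score` and `select_top_k`, the column names, and the four toy records are all hypothetical and serve only to show the idea.

```python
from collections import Counter


def chi_squared_score(feature_values, labels):
    """Pearson chi-squared statistic between a categorical feature and a label.

    A higher score means the feature's distribution differs more across
    classes, so the feature is more informative for classification.
    """
    n = len(labels)
    if n == 0 or n != len(feature_values):
        raise ValueError("inputs must be non-empty and of equal length")

    observed = Counter(zip(feature_values, labels))  # contingency table cells
    feature_totals = Counter(feature_values)         # row totals
    label_totals = Counter(labels)                   # column totals

    score = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in label_totals.items():
            # Count expected in this cell if feature and label were independent.
            expected = value_count * label_count / n
            diff = observed.get((value, label), 0) - expected
            score += diff * diff / expected
    return score


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest chi-squared score."""
    scores = {name: chi_squared_score(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy records (not taken from NSL-KDD): "flag" separates the two
# classes perfectly, while "protocol" is independent of the label.
toy_labels = ["attack", "attack", "normal", "normal"]
toy_columns = {
    "protocol": ["tcp", "udp", "tcp", "udp"],
    "flag": ["S0", "S0", "SF", "SF"],
}

selected = select_top_k(toy_columns, toy_labels, k=1)
```

On this toy input, `flag` is ranked above `protocol` and is the only feature kept when `k=1`. In a distributed setting the same scoring would be applied per feature across partitions of the traffic data, and the selected columns would then feed the classifiers.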