Christina Manara, "Stock forecasting using Apache Flink", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2021
https://doi.org/10.26233/heallink.tuc.90088
The modern era is often described as the age of Big Data, owing to the unprecedented volume of data produced daily and the need to analyze it and extract useful results across many different fields. Monitoring thousands of data streams in order to make decisions has become imperative. In the stock market, investors want to identify potential opportunities, and the correct and efficient processing of stock market data is crucial for a country’s economic prosperity. Stock data streams are continuous and long. This dissertation processes thousands of stocks simultaneously and in a distributed manner, searching for high correlations between pairs of stocks. The processing is done in real time and aims to find the k stocks most similar to each stock given as input, since these are the ones used to predict it. It is essential that the approach return the desired answers within a reasonable time frame even as the volume of input data grows. This requirement is met by (a) implementing and managing a synopsis in the Synopsis Data Engine (SDE) system, (b) applying the Discrete Fourier Transform (DFT) to reduce the number of candidate similar stocks, and (c) applying a Multiple Linear Regression (MLR) model for stock forecasting. In the experimental evaluation, the algorithm is tested both locally and on a computing cluster, achieving satisfactory results.
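To make steps (b) and (c) concrete, the following is a minimal single-machine sketch in Python with NumPy. It is not the dissertation's Flink/SDE implementation. All function names (`normalize`, `dft_sketch`, `corr_upper_bound`, `top_k_similar`, `fit_mlr`, `predict_next`) are hypothetical, and both the choice of keeping the first `m` DFT coefficients and the one-step-lagged regression are assumptions made for illustration. The idea shown is that, for zero-mean unit-norm series, a few DFT coefficients give an upper bound on the Pearson correlation, so weakly correlated candidates can be discarded cheaply before a regression is fitted on the k survivors.

```python
import numpy as np

def normalize(x):
    """Zero-mean, unit-norm copy of x, so that a dot product equals Pearson correlation."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    norm = np.linalg.norm(x)
    return x / norm if norm > 0 else x

def dft_sketch(x, m):
    """First m non-DC coefficients of the orthonormal DFT of the normalized series."""
    x = normalize(x)
    if not 0 < m < len(x) / 2:
        raise ValueError("m must satisfy 0 < m < len(x) / 2")
    return np.fft.fft(x, norm="ortho")[1:m + 1]

def corr_upper_bound(sa, sb):
    """Upper bound on the correlation of two series, computed from their sketches.

    For normalized series, corr = 1 - d^2 / 2 with d the Euclidean distance.
    By Parseval, d^2 is the sum of squared coefficient differences; each kept
    coefficient has a conjugate twin, so d^2 >= 2 * sum |A_k - B_k|^2.
    """
    return 1.0 - float(np.sum(np.abs(sa - sb) ** 2))

def top_k_similar(target, candidates, k, m):
    """Indices of the k candidates with the highest correlation bound, plus all scores."""
    st = dft_sketch(target, m)
    scores = np.array([corr_upper_bound(st, dft_sketch(c, m)) for c in candidates])
    return np.argsort(-scores)[:k], scores

def fit_mlr(target, predictors):
    """Least-squares fit of target[t + 1] on predictors[:, t] with an intercept.

    predictors has shape (k, n); the one-step lag is an illustrative assumption.
    """
    target = np.asarray(target, dtype=float)
    predictors = np.asarray(predictors, dtype=float)
    X = np.column_stack([np.ones(predictors.shape[1] - 1), predictors[:, :-1].T])
    beta, *_ = np.linalg.lstsq(X, target[1:], rcond=None)
    return beta

def predict_next(beta, predictors):
    """Forecast the next target value from the latest predictor values."""
    predictors = np.asarray(predictors, dtype=float)
    return float(beta[0] + predictors[:, -1] @ beta[1:])
```

A typical use under these assumptions would be `idx, _ = top_k_similar(target, candidates, k=3, m=8)`, followed by `beta = fit_mlr(target, np.vstack([candidates[i] for i in idx]))` and `predict_next(beta, ...)` on the same stacked array. Because the sketch yields an upper bound rather than the exact correlation, any candidate whose bound falls below a chosen threshold can be dropped without risk of missing a truly correlated pair.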