Institutional Repository
Technical University of Crete

Scaling out streaming time series analytics on Storm

Pavlakis Nikolaos


URI: http://purl.tuc.gr/dl/dias/802769EF-66BD-411D-9E6E-BA00A61F2B6F
Year: 2017
Type of Item: Master Thesis
Bibliographic Citation: Nikolaos Pavlakis, "Scaling out streaming time series analytics on Storm", Master Thesis, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2017. https://doi.org/10.26233/heallink.tuc.67653

Summary

Data can provide meaningful insights if we are able to process it. We live in a time where the rate at which data is being generated grows exponentially, and extracting useful information from all this data becomes harder and harder, thus mandating efficient and scalable data analytics solutions. Oftentimes, the input data to analytics applications is in the form of massive, continuous data streams. Consider the example of the global stock markets: an interesting piece of information for traders, portfolio managers, and so on, is the set of correlation/dependence patterns between different market players (e.g., equities, indexes, etc.); yet, such patterns typically change very rapidly over time, and the information is only valuable if it becomes available in real time (e.g., for algorithmic trading). This implies that stock market data needs to be processed in a streaming fashion, typically focusing only on a sliding window of recent readings (e.g., “monitor all correlations during the last hour”). In addition, data stream processing solutions need to be scalable, as there are thousands of market players, implying millions of possible correlation/dependence pairs that need to be tracked in real time. This thesis introduces efficient algorithms and architectures for tackling the problem of monitoring the pairwise dependence among thousands of data streams, and introduces a generic stream processing framework, T-Storm, which can be used to easily and efficiently develop, scale out, and deploy large-scale stream analytics applications.
