Institutional Repository
Technical University of Crete

Pro-active Automatic scaling support for Apache Flink in Kubernetes in the Cloud

Zafeirakopoulos Alexandros-Nikolaos


URI: http://purl.tuc.gr/dl/dias/B015D2A5-35D3-44F0-ABBA-C6D07796F14C
Year: 2022
Type of Item: Diploma Work
Bibliographic Citation: Alexandros-Nikolaos Zafeirakopoulos, "Pro-active Automatic scaling support for Apache Flink in Kubernetes in the Cloud", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2022. https://doi.org/10.26233/heallink.tuc.95102
Summary

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. It executes arbitrary dataflow programs in a data-parallel and pipelined manner in event-driven applications such as fraud detection (i.e. detection of suspicious transactions), anomaly detection (i.e. detection of rare or suspicious events), rule-based alerting (i.e. identification of data that satisfy one or more rules) and many more. Despite its versatility, Apache Flink cannot automatically and optimally adjust its use of the underlying computing resources when streaming sources produce data at varying rates. To address this issue, we describe an autonomous agent that supports dynamic autoscaling for Apache Flink on Kubernetes. The agent monitors and models Flink's behaviour and adjusts its allocated resources to match the incoming workload at minimum cost. The decision-making process is based on operator idleness and on changes in the record lag of the input. We show that our model not only maintains the performance of the application while minimizing infrastructure costs, but can also provide a better performance-to-cost ratio than existing work on Flink autoscaling. The effectiveness of our model is supported by an extensive set of synthetic and real-life workloads designed to simulate a wide range of scenarios.
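
The summary names the two signals behind the scaling decision (operator idleness and changes in input record lag) but does not state the rule itself. The Python sketch below is only an illustration of how such signals might be combined. The thresholds, the one-step scaling policy and every name in it (Metrics, decide_parallelism, idle_high, lag_growth) are assumptions made for this example and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    """One observation of the job (hypothetical structure)."""
    idleness: float   # fraction of time operators are idle, in [0, 1]
    lag_now: int      # current input record lag
    lag_before: int   # record lag at the previous observation


def decide_parallelism(current: int, m: Metrics,
                       idle_high: float = 0.5,   # assumed threshold
                       lag_growth: int = 1000,   # assumed threshold
                       min_p: int = 1, max_p: int = 16) -> int:
    """Return the next parallelism given the current one and fresh metrics."""
    lag_delta = m.lag_now - m.lag_before
    if lag_delta > lag_growth:
        # Lag is growing quickly: the job is falling behind, so scale up.
        return min(current + 1, max_p)
    if m.idleness > idle_high and lag_delta <= 0:
        # Operators are mostly idle and lag is not growing: scale down.
        return max(current - 1, min_p)
    # Otherwise keep the current allocation.
    return current


print(decide_parallelism(4, Metrics(idleness=0.1, lag_now=5000, lag_before=1000)))
print(decide_parallelism(4, Metrics(idleness=0.8, lag_now=100, lag_before=200)))
```

In this sketch the first call scales up because the lag grew by more than the assumed threshold, and the second scales down because operators are mostly idle while the lag shrinks. A real agent would also need to weigh the cost of each rescaling step, which this example leaves out.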
