Georgios Vionis, "Dynamic service placement in Kubernetes using reinforcement learning", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2025
https://doi.org/10.26233/heallink.tuc.98339
This thesis investigates service placement in distributed multicluster cloud environments. It aims to reduce both the operational cost and the latency of the system by using a highly scalable reinforcement learning model. For this model, the system follows a hierarchical directed acyclic graph structure with multiple tiers. The first tier forms the central computational and resource infrastructure and offers the greatest potential for scaling. Intermediate tiers are physically closer to the outermost tier and therefore have substantially lower propagation delay; however, they offer less compute and scaling capacity than the central tier. The outermost tier is the final tier: it consists of nodes with potentially low computational power and is where clients connect to the system, whether they are IoT devices, smartphones, or any other consumer. All applications follow a microservice architecture, meaning they are composed of multiple (micro)services. This allows services to be moved across the tiers as needed to maximize performance and minimize cost and latency. As described above, the innermost tiers provide higher performance at the expense of increased latency. Therefore, moving services to the intermediate or even the outermost tier can be beneficial in terms of both cost and latency, provided that the gain from reduced latency and network I/O outweighs the cost of deploying the same service on multiple tiers. The proposed model, DynaQSP, aims to handle the placement and deletion of deployments on each tier using a dynamic Q reinforcement learning model, while also managing the traffic flow appropriately after each change. A service mesh (Linkerd) is used to collect metrics and to manage the functions and networking of the applications.
DynaQSP collects this information and uses it to train the reinforcement learning model, which in turn makes decisions regarding the management of all services. To evaluate the effectiveness of the proposed model, three realistic microservices-based applications were deployed in the heterogeneous multicluster environment described above on the Google Cloud Platform: Google’s “Online Boutique” and “Bank of Anthos”, as well as “TeaStore”. After many implementation, testing, and evaluation cycles, the model was fine-tuned, and system latency and cost were measured for different load scenarios and compared to the default behavior as well as to related work. The experimental results demonstrate significant improvement over both the default behavior and related work in terms of reduced cost and latency, while ensuring uptime consistency. More specifically, latency was reduced by up to 80% and operational costs were cut by as much as 44%. The use of realistic applications in a heterogeneous multicluster environment lends practicality and relevance to the evaluation, confirming the efficacy of the DynaQSP model in real-world cases.
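The learning loop summarized in the abstract can be illustrated with a minimal tabular Q-learning sketch. This is not the thesis's implementation: the action names, the state encoding (any hashable value, such as a string describing the current placement), the reward weights, and the hyperparameters below are all assumptions chosen for illustration. It only shows the general idea of rewarding lower latency and cost and updating a value estimate after each placement decision.

```python
import random
from collections import defaultdict

# Hypothetical action set: add or remove a deployment on a tier, or do nothing.
ACTIONS = ["noop", "deploy_edge", "remove_edge", "deploy_mid", "remove_mid"]

ALPHA = 0.1    # learning rate (assumed value)
GAMMA = 0.9    # discount factor (assumed value)
EPSILON = 0.1  # exploration probability (assumed value)

# Q-table: maps (state, action) to an estimated long-term value, default 0.0.
q_table = defaultdict(float)


def reward(latency_ms, cost, w_latency=1.0, w_cost=1.0):
    """Negative weighted sum, so lower latency and lower cost score higher."""
    return -(w_latency * latency_ms + w_cost * cost)


def choose_action(state, rng=random):
    """Epsilon-greedy selection over the hypothetical action set."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def q_update(state, action, r, next_state):
    """Standard one-step Q-learning update; returns the new Q-value."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    td_target = r + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (td_target - q_table[(state, action)])
    return q_table[(state, action)]
```

In such a sketch, the latency and cost passed to `reward` would come from the metrics observed after each placement change, and the weights would control the trade-off between the two objectives.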