Institutional Repository
Technical University of Crete

Persistency and interoperability in Synopses Data Engine: integration into Knowledge Lakes

Petrou Dimitrios


URI: http://purl.tuc.gr/dl/dias/E2FF251B-87FF-4EA0-898F-A6C338A053CF
Year: 2025
Type of Item: Diploma Work
Bibliographic Citation: Dimitrios Petrou, "Persistency and interoperability in Synopses Data Engine: integration into Knowledge Lakes", Diploma Work, School of Electrical and Computer Engineering, Technical University of Crete, Chania, Greece, 2025. https://doi.org/10.26233/heallink.tuc.102295

Summary

In the era of big data, where real-time information is generated at an unprecedented scale, the ability to process, analyze, and extract actionable insights efficiently is a requirement in many use cases. Stream summarization has emerged as a novel technique for addressing this need: it creates compact yet informative representations of continuous data streams, termed synopses, and so eliminates the need to store vast amounts of raw data for future processing. A prominent effort in this area is the Synopses Data Engine (SDE), an advanced framework that integrates state-of-the-art stream summarization techniques with the high-performance capabilities of Apache Flink, forming an interactive summarization service at scale. While SDE has proven its merit in other big data ecosystems, its application within knowledge-driven environments introduces new requirements. In their native form, synopses are volatile: their lifespan depends on the runtime of the engine. However, in the context of Knowledge Lakes, where long-term insights and temporal analytics are essential, the inability to retain and revisit previous states is a significant limitation. This thesis aims to bridge this gap by extending SDE with persistency and a versatile snapshot mechanism, allowing for the long-term storage and retrieval of synopses while respecting SDE's native key features. Furthermore, the work expands the Streaming API of SDE to provide broader observability into the internal state of the engine, allowing metadata to be exported to external data analytics ecosystems. The STELAR KLMS (Knowledge Lake Management System) serves as the application domain for this thesis, where SDE is integrated to process real-time agri-food data for precision interventions.
