The work titled "A parallel algorithm for tracking dynamic communities based on Apache Flink" by Georgios Kechagias, Grigorios Tzortzis, Georgios Paliouras, and Dimitrios Vogiatzis is licensed under Creative Commons Attribution 4.0 International
Bibliographic Citation
G. Kechagias, G. Tzortzis, G. Paliouras and D. Vogiatzis, "A parallel algorithm for tracking dynamic communities based on Apache Flink," in 10th Hellenic Conference on Artificial Intelligence, 2018. doi: 10.1145/3200947.3201039
https://doi.org/10.1145/3200947.3201039
Real-world social networks are highly dynamic environments consisting of numerous users and communities, which makes tracking their evolution a challenging problem. In this work, we propose a parallel algorithm for tracking dynamic communities between consecutive timeframes of a social network, where communities are represented as undirected graphs. Our method compares communities using the widely adopted Jaccard similarity measure and is implemented on top of Apache Flink, a novel framework for parallel and distributed data processing. We evaluate the execution-time benefits that parallel processing brings to community tracking on datasets with different quantitative characteristics, derived from two popular social media platforms: Twitter and the Mathematics Stack Exchange Q&A site. Experiments show that our parallel method can compute the similarity of communities within seconds, even for large social networks with more than 600 communities per timeframe.
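To make the comparison step concrete, the following is a minimal sequential sketch in plain Python, not the authors' Flink implementation. It assumes that each community is reduced to its set of member node IDs and that two communities in consecutive timeframes are considered matched when their Jaccard similarity reaches a threshold; the function names, the toy data, and the threshold value of 0.3 are all hypothetical and chosen only for illustration.

```python
from itertools import product


def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two sets; 0.0 if both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def match_communities(prev, curr, threshold=0.3):
    """Compare every community of one timeframe with every community of the next.

    prev, curr: dicts mapping a community ID to its set of member node IDs
                (an assumed representation, not taken from the paper).
    threshold:  hypothetical cut-off; pairs below it are discarded.
    Returns a list of (prev_id, curr_id, similarity) tuples.
    """
    matches = []
    for (pid, pnodes), (cid, cnodes) in product(prev.items(), curr.items()):
        sim = jaccard(pnodes, cnodes)
        if sim >= threshold:
            matches.append((pid, cid, sim))
    return matches


# Toy data: two consecutive timeframes of a made-up network.
prev_frame = {"A": {1, 2, 3, 4}, "B": {5, 6, 7}}
curr_frame = {"X": {2, 3, 4, 8}, "Y": {5, 6, 7}, "Z": {9, 10}}

matches = match_communities(prev_frame, curr_frame)
for prev_id, curr_id, sim in matches:
    print(prev_id, curr_id, round(sim, 2))
```

Each pair of communities is scored independently of every other pair, which is why this kind of all-pairs comparison lends itself to distribution across parallel workers.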