Institutional Repository
Technical University of Crete

Tree-pattern similarity estimation for scalable content-based routing

Chand Raphael, Felber Pascal, Garofalakis Minos

Simple Record


URI: http://purl.tuc.gr/dl/dias/F2852DF7-8D3C-42E0-B3E0-82B3A7D8DD83
Identifier: https://doi.org/10.1109/ICDE.2007.368960
Identifier: http://ieeexplore.ieee.org/document/4221750/
Language: en
Title: Tree-pattern similarity estimation for scalable content-based routing
Creator: Chand Raphael
Creator: Felber Pascal
Creator: Garofalakis Minos
Creator: Γαροφαλακης Μινως
Publisher: Institute of Electrical and Electronics Engineers
Abstract: With the advent of XML as the de facto language for data publishing and exchange, scalable distribution of XML data to large, dynamic populations of consumers remains an important challenge. Content-based publish/subscribe systems offer a convenient design paradigm, as most of the complexity related to addressing and routing is encapsulated within the network infrastructure. To indicate the type of content that they are interested in, data consumers typically specify their subscriptions using a tree-pattern specification language (an important subset of XPath), while producers publish XML content without prior knowledge of any potential recipients. Discovering semantic communities of consumers with similar interests is an important requirement for scalable content-based systems: such "semantic clusters" of consumers play a critical role in the design of effective content-routing protocols and architectures. The fundamental problem underlying the discovery of such semantic communities lies in effectively evaluating the similarity of different tree-pattern subscriptions based on the observed document stream. In this paper, we propose a general framework and algorithmic tools for estimating different tree-pattern similarity metrics over continuous streams of XML documents. In a nutshell, our approach relies on continuously maintaining a novel, concise synopsis structure over the observed document stream that allows us to accurately estimate the fraction of documents satisfying various Boolean combinations of different tree-pattern subscriptions. To effectively capture different branching and correlation patterns within a limited amount of space, our techniques use ideas from hash-based sampling in a novel manner that exploits the hierarchical structure of our document synopsis. Experimental results with various XML data streams verify the effectiveness of our approach.
Type: Conference Full Paper
License: http://creativecommons.org/licenses/by/4.0/
Date: 2015-11-30
Publication Date: 2007
Subject: Data engineering
Bibliographic Citation: R. Chand, P. Felber and M. Garofalakis, "Tree-pattern similarity estimation for scalable content-based routing", in IEEE 23rd International Conference on Data Engineering, 2007, pp. 1016-1025. doi: 10.1109/ICDE.2007.368960
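
Illustrative note: the abstract describes estimating the fraction of documents that satisfy Boolean combinations of tree-pattern subscriptions, and deriving similarity metrics from those fractions. The minimal Python sketch below shows one such metric, the ratio of documents matching both patterns to documents matching either, computed exactly over a tiny in-memory stream. It is not the paper's synopsis structure or its hash-based sampling. The document encoding (lists of root-to-leaf tag paths), the `has_path` matcher, and all function names are assumptions made for illustration only.

```python
from typing import Callable, List, Sequence

# Assumption: a document is encoded as a list of root-to-leaf tag paths.
Doc = List[str]
# Assumption: a subscription is any Boolean predicate over a document.
Pattern = Callable[[Doc], bool]


def has_path(prefix: str) -> Pattern:
    """Hypothetical stand-in for a tree pattern: true if some path
    in the document equals `prefix` or continues below it."""
    def matches(doc: Doc) -> bool:
        return any(p == prefix or p.startswith(prefix + "/") for p in doc)
    return matches


def fraction(docs: Sequence[Doc], predicate: Pattern) -> float:
    """Fraction of observed documents that satisfy `predicate`."""
    if not docs:
        return 0.0
    return sum(1 for d in docs if predicate(d)) / len(docs)


def similarity(docs: Sequence[Doc], p: Pattern, q: Pattern) -> float:
    """One possible similarity metric: P(p AND q) / P(p OR q)."""
    both = fraction(docs, lambda d: p(d) and q(d))
    either = fraction(docs, lambda d: p(d) or q(d))
    return both / either if either > 0 else 0.0


# Toy document stream (made-up data for illustration).
stream = [
    ["/book/title", "/book/author/name"],
    ["/book/title", "/book/price"],
    ["/article/title"],
    ["/book/author/name"],
]
p = has_path("/book/title")
q = has_path("/book/author")
print(similarity(stream, p, q))
```

In this toy stream one document matches both patterns and three match at least one, so the metric evaluates to one third. A streaming system would replace the exact counts with estimates drawn from a bounded-size summary of the stream.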
