Ιδρυματικό Αποθετήριο
Πολυτεχνείο Κρήτης
EN  |  EL

Αναζήτηση

Πλοήγηση

Ο Χώρος μου

XML stream processing using tree-edit distance embeddings

Garofalakis Minos, Kumar Amit

Απλή Εγγραφή


URIhttp://purl.tuc.gr/dl/dias/12DCEC49-3FB5-4CFB-9587-A761D5EFB408-
Αναγνωριστικόhttp://dl.acm.org/citation.cfm?id=1061326&dl=ACM&coll=DL&CFID=723856831&CFTOKEN=21128548-
Αναγνωριστικόhttps://doi.org/10.1145/1061318.1061326-
Γλώσσαen-
Μέγεθος54 pagesen
ΤίτλοςXML stream processing using tree-edit distance embeddingsen
ΔημιουργόςGarofalakis Minosen
ΔημιουργόςΓαροφαλακης Μινωςel
ΔημιουργόςKumar Amiten
ΕκδότηςAssociation for Computing Machineryen
ΠερίληψηWe propose the first known solution to the problem of correlating, in small space, continuous streams of XML data through approximate (structure and content) matching, as defined by a general tree-edit distance metric. The key element of our solution is a novel algorithm for obliviously embedding tree-edit distance metrics into an L1 vector space while guaranteeing a (worst-case) upper bound of O(log2n log*n) on the distance distortion between any data trees with at most n nodes. We demonstrate how our embedding algorithm can be applied in conjunction with known random sketching techniques to (1) build a compact synopsis of a massive, streaming XML data tree that can be used as a concise surrogate for the full tree in approximate tree-edit distance computations; and (2) approximate the result of tree-edit-distance similarity joins over continuous XML document streams. Experimental results from an empirical study with both synthetic and real-life XML data trees validate our approach, demonstrating that the average-case behavior of our embedding techniques is much better than what would be predicted from our theoretical worst-case distortion bounds. To the best of our knowledge, these are the first algorithmic results on low-distortion embeddings for tree-edit distance metrics, and on correlating (e.g., through similarity joins) XML data in the streaming model.en
ΤύποςPeer-Reviewed Journal Publicationen
ΤύποςΔημοσίευση σε Περιοδικό με Κριτέςel
Άδεια Χρήσηςhttp://creativecommons.org/licenses/by/4.0/en
Ημερομηνία2015-10-29-
Ημερομηνία Δημοσίευσης2005-
Θεματική ΚατηγορίαXMLen
Βιβλιογραφική ΑναφοράM. Garofalakis and A. Kumar, "XML stream processing using tree-edit distance embeddings", ACM Trans. Dat. Syst., vol. 30, no. 1, pp. 279-332, Mar. 2005. doi:10.1145/1061318.1061326en

Υπηρεσίες

Στατιστικά