URI | http://purl.tuc.gr/dl/dias/C8A10160-E770-48D3-8503-C457D55AADE8 | - |
Identifier | http://db.cs.berkeley.edu/papers/icde10-ie.pdf | - |
Language | en | - |
Size | 4 pages | en |
Title | Probabilistic declarative information extraction | en |
Creator | Wang Daisy Zhe | en |
Creator | Michelakis Eirinaios | en |
Creator | Franklin Michael J. | en |
Creator | Garofalakis Minos | en |
Creator | Γαροφαλακης Μινως | el |
Creator | Hellerstein, Joseph, 1952- | en |
Abstract | Unstructured text represents a large fraction of the
world’s data. It often contains snippets of structured information
(e.g., people’s names and zip codes). Information Extraction
(IE) techniques identify such structured information in text. In
recent years, database research has pursued IE on two fronts:
declarative languages and systems for managing IE tasks, and
probabilistic databases for querying the output of IE. In this
paper, we make the first step to merge these two directions,
without loss of statistical robustness, by implementing a state-of-the-art
statistical IE model – Conditional Random Fields (CRF)
– in the setting of a Probabilistic Database that treats statistical
models as first-class data objects. We show that the Viterbi
algorithm for CRF inference can be specified declaratively in
recursive SQL. We also show the performance benefits relative
to a standalone open-source Viterbi implementation. This work
opens up the optimization opportunities for queries involving
both inference and relational operators over IE models. | en |
Type | Conference Full Paper | en |
License | http://creativecommons.org/licenses/by/4.0/ | en |
Date | 2015-11-30 | - |
Publication Date | 2010 | - |
Subject | Information systems | en |
Subject | Databases | en |
Bibliographic Citation | D. Z. Wang, E. Michelakis, M. J. Franklin, M. Garofalakis and J. M. Hellerstein, "Probabilistic declarative information extraction", in 26th IEEE International Conference on Data Engineering, 2010. | en |