On a model of indexability and its bounds for range queriesOn a model of indexability and its bounds for range queries
Peer-Reviewed Journal Publication
Δημοσίευση σε Περιοδικό με Κριτές
2015-10-162002enWe develop a theoretical framework to characterize the hardness of indexing data sets on block-access memory devices like hard disks. We define an indexing workload by a data set and a set of potential queries. For a workload we can construct an indexing scheme, which is a collection of fixed-sized subsets of the data. We identify two measures of efficiency for an indexing scheme on a workload: storage redundancy, r (how many times each item in the data set is stored), and access overhead, A (how many times more blocks than necessary does a query retrieve). For many interesting families of workloads, there exists a trade-off between storage redundancy and access overhead. Given a desired access overhead A, there is a minimum redundancy that any indexing scheme must exhibit. We prove a lower-bound theorem for deriving the minimum redundancy. By applying this theorem we show interesting upper and lower bounds and trade-offs between A and r in the case of multi-dimensional range queries and set queries.http://creativecommons.org/licenses/by/4.0/Journal of the ACM49135–55
Samoladas Vasilis
Σαμολαδας Βασιλης
Koutsourias Elias
Miranker, Daniel P
Hellerstein, Joseph, 1952-
Papadimitriou, Christos H
Association for Computing Machinery
Information systems
Data management systems
Data structures
Data access methods