Institutional Repository
Technical University of Crete
EN  |  EL

Search

Browse

My Space

Optimal histograms for limiting worst-case error propagation in the size of join results.

Christodoulakis Stavros, Yiannis E. Ioannidis

Simple record


URIhttp://purl.tuc.gr/dl/dias/A0E77513-A9A3-47D9-9553-601A84B1B522-
Identifierhttps://doi.org/10.1145/169725.169708-
Languageen-
Extent39 pagesen
TitleOptimal histograms for limiting worst-case error propagation in the size of join results.en
CreatorChristodoulakis Stavrosen
CreatorΧριστοδουλακης Σταυροςel
CreatorYiannis E. Ioannidisen
Content SummaryMany current relational database systems use some form of histograms to approximate the frequency distribution of values in the attributes of relations and on this basis estimate query result sizes and access plan costs. The errors that exist in the histogram approximations directly or transitively affect many estimates derived by the database system. We identify the class of serial histograms and its subclass of end-btased histograms; the latter is of particular interest because such histograms are used in several database systems. We concentrate on equality join queries without function symbols where each relation is joined on the same attribute(s) for all joins in which it participates. Join queries of this restricted type are called t-cllque queries. We show that the optimal histogram for reducing the worst-case error in the result size of such a query is always serial. For queries with one join and no function symbols (all of which are vacuously t-clique queries), we present results on finding the optimal serial histogram and the optimal end-biased histogram based on the query characteristics and the frequency distributions of values in the join attributes of the query relations. Finally, we prove that for t-clique queries with a very large number of joins, h~gh-bzased h zstograms (which form a subclass of end-biased histograms) are always optimal. To construct a histogram for the join attribute(s) of a relation, the values in the attribute(s) must first be sorted based on their frequency and then assigned into buckets according to the optimality results above. en
Type of ItemPeer-Reviewed Journal Publicationen
Type of ItemΔημοσίευση σε Περιοδικό με Κριτέςel
Licensehttp://creativecommons.org/licenses/by/4.0/en
Date of Item2015-10-02-
Date of Publication1993-
SubjectAsymptotically optimal serial histogramsen
Bibliographic CitationY. Ioannidis , S. Christodoulakis , " Optimal histograms for limiting worst-case error propagation in the size of join results",ACM Trans. on Datab. Syst. , vol. 18 ,no. 4 ,pp. 709-748,1993 .doi : 10.1145/169725.169708 en

Available Files

Services

Statistics