Institutional Repository
Technical University of Crete

Synopses for massive data: samples, histograms, wavelets, sketches

Cormode, Graham, 1977-; Garofalakis, Minos; Haas, Peter J.; Jermaine, Chris


Year: 2012
Type of Item: Peer-Reviewed Journal Publication
Bibliographic Citation: G. Cormode, M. Garofalakis, P. Haas and C. Jermaine, "Synopses for massive data: samples, histograms, wavelets, sketches", Foundations and Trends in Databases, vol. 4, no. 1-3, pp. 1-294, 2012. doi: 10.1561/1900000004


Methods for Approximate Query Processing (AQP) are essential for dealing with massive data. They are often the only means of providing interactive response times when exploring massive datasets, and are also needed to handle high speed data streams. These methods proceed by computing a lossy, compact synopsis of the data, and then executing the query of interest against the synopsis rather than the entire dataset. We describe basic principles and recent developments in AQP. We focus on four key synopses: random samples, histograms, wavelets, and sketches. We consider issues such as accuracy, space and time efficiency, optimality, practicality, range of applicability, error bounds on query answers, and incremental maintenance. We also discuss the tradeoffs between the different synopsis types.
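
As a rough illustration of the idea in the abstract (build a compact synopsis, then run the query against it instead of the full data), the following minimal sketch uses a uniform random sample as the synopsis. It is not taken from the publication: the function names `reservoir_sample` and `estimate_count`, the sample size, and the example data are all assumptions chosen for illustration, and reservoir sampling is only one of several standard ways to draw such a sample from a stream.

```python
import random


def reservoir_sample(stream, k, rng):
    """Keep a uniform random sample of up to k items from a stream of unknown length.

    Hypothetical helper for illustration; returns the sample and the number of items seen.
    """
    sample = []
    n = 0
    for item in stream:
        n += 1
        if len(sample) < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Replace a random slot with probability k / n.
            j = rng.randrange(n)
            if j < k:
                sample[j] = item
    return sample, n


def estimate_count(sample, n, predicate):
    """Estimate how many of the n original items satisfy predicate by scaling the sample fraction."""
    if not sample:
        return 0.0
    hits = sum(1 for x in sample if predicate(x))
    return n * hits / len(sample)


# Illustrative data: 100,000 integers, of which exactly 10,000 are multiples of 10.
rng = random.Random(0)
sample, n = reservoir_sample(range(100_000), 1_000, rng)
approx = estimate_count(sample, n, lambda x: x % 10 == 0)
print(f"approximate count: {approx:.0f} (exact: 10000)")
```

The answer comes from 1,000 stored items rather than the full 100,000, at the cost of sampling error. Histograms, wavelets, and sketches make the same space-for-accuracy trade with different summary structures.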