Materialized View Selection for Multidimensional Datasets.

Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton: Materialized View Selection for Multidimensional Datasets. VLDB 1998: 488-499

@inproceedings{DBLP:conf/vldb/ShuklaDN98,
  author    = {Amit Shukla and
               Prasad Deshpande and
               Jeffrey F. Naughton},
  editor    = {Ashish Gupta and
               Oded Shmueli and
               Jennifer Widom},
  title     = {Materialized View Selection for Multidimensional Datasets},
  booktitle = {VLDB'98, Proceedings of 24rd International Conference on Very
               Large Data Bases, August 24-27, 1998, New York City, New York,
               USA},
  publisher = {Morgan Kaufmann},
  year      = {1998},
  isbn      = {1-55860-566-5},
  pages     = {488-499},
  ee        = {db/conf/vldb/ShuklaDN98.html},
  crossref  = {DBLP:conf/vldb/98},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

To fulfill the requirement of fast interactive multidimensional data analysis, database systems precompute aggregate views on some subsets of dimensions and their corresponding hierarchies. However, the problem of what to precompute is difficult and intriguing. The leading existing algorithm, BPUS, has a running time that is polynomial in the number of views and is guaranteed to be within (0.63 - f) of optimal, where f is the fraction of available space consumed by the largest aggregate. Unfortunately, BPUS can be impractically slow, and in some instances may miss good solutions due to the coarse granularity at which it makes its decisions of what to precompute. In view of this, we study the structure of the precomputation problem and show that under certain broad conditions on the multidimensional data, an even simpler and faster algorithm, PBS, achieves the same (0.63 - f) bound. Our empirical study of the behavior of PBS shows that even when this condition does not hold, PBS picks a surprisingly good set of aggregates for precomputation. Furthermore, BPUS and other previous work has assumed that all aggregates are either precomputed in their entirety or not at all. We show that if one relaxes this and allows aggregates to be partially precomputed, not only is it possible to find solutions that are better than those found by previous algorithms, in some cases it is even possible to find solutions that are better than the solution that is 'optimal' by the previous definition.

Copyright © 1998 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Online Paper

Download PDF file (www.vldb.org, Darmstadt, Germany)
Download PDF file (www.acm.org, New York, USA)

ACM SIGMOD DiSC

CDROM Version: Load the CDROM "DiSC, Volume 1 Number 1" and ...

Windows: Click the letter of your CD drive
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Mac: Click here
UNIX/LINUX: mount the CD and click on the path of your mount point:
/Anthology/Disc99_1 or /cdrom

ACM SIGMOD Anthology

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Windows: Click the letter of your CD drive
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Mac: Click here
UNIX/LINUX: mount the DVD and click on the path of your mount point:
/Anthology/aDVD1 or /dvd

Printed Edition

Ashish Gupta, Oded Shmueli, Jennifer Widom (Eds.): VLDB'98, Proceedings of 24rd International Conference on Very Large Data Bases, August 24-27, 1998, New York City, New York, USA. Morgan Kaufmann 1998, ISBN 1-55860-566-5
Contents

References

[AAD+96]: Sameet Agarwal, Rakesh Agrawal, Prasad Deshpande, Ashish Gupta, Jeffrey F. Naughton, Raghu Ramakrishnan, Sunita Sarawagi: On the Computation of Multidimensional Aggregates. VLDB 1996: 506-521
[BPT97]: Elena Baralis, Stefano Paraboschi, Ernest Teniente: Materialized Views Selection in a Multidimensional Database. VLDB 1997: 156-165
[DRSN98]: Prasad Deshpande, Karthikeyan Ramasamy, Amit Shukla, Jeffrey F. Naughton: Caching Multidimensional Queries Using Chunks. SIGMOD Conference 1998: 259-270
[GBLP96]: Jim Gray, Adam Bosworth, Andrew Layman, Hamid Pirahesh: Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Total. ICDE 1996: 152-159
[GHRU97]: Himanshu Gupta, Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Index Selection for OLAP. ICDE 1997: 208-219
[Gupt97]: Himanshu Gupta: Selection of Views to Materialize in a Data Warehouse. ICDT 1997: 98-112
[HRU96]: Venky Harinarayan, Anand Rajaraman, Jeffrey D. Ullman: Implementing Data Cubes Efficiently. SIGMOD Conference 1996: 205-216
[RS97]: Kenneth A. Ross, Divesh Srivastava: Fast Computation of Sparse Datacubes. VLDB 1997: 116-125
[SS94]: Sunita Sarawagi, Michael Stonebraker: Efficient Organization of Large Multidimensional Arrays. ICDE 1994: 328-336
[SDNR96]: Amit Shukla, Prasad Deshpande, Jeffrey F. Naughton, Karthikeyan Ramasamy: Storage Estimation for Multidimensional Aggregates in the Presence of Hierarchies. VLDB 1996: 522-531
[SDN98]: ...
[Ull96]: Jeffrey D. Ullman: Efficient Implementation of Data Cubes Via Materialized Views. KDD 1996: 386-388
[ZDN97]: Yihong Zhao, Prasad Deshpande, Jeffrey F. Naughton: An Array-Based Algorithm for Simultaneous Multidimensional Aggregates. SIGMOD Conference 1997: 159-170