Generalised Hash Teams for Join and Group-by.

Alfons Kemper, Donald Kossmann, Christian Wiesner: Generalised Hash Teams for Join and Group-by. VLDB 1999: 30-41

@inproceedings{DBLP:conf/vldb/KemperKW99,
  author    = {Alfons Kemper and
               Donald Kossmann and
               Christian Wiesner},
  editor    = {Malcolm P. Atkinson and
               Maria E. Orlowska and
               Patrick Valduriez and
               Stanley B. Zdonik and
               Michael L. Brodie},
  title     = {Generalised Hash Teams for Join and Group-by},
  booktitle = {VLDB'99, Proceedings of 25th International Conference on Very
               Large Data Bases, September 7-10, 1999, Edinburgh, Scotland,
               UK},
  publisher = {Morgan Kaufmann},
  year      = {1999},
  isbn      = {1-55860-615-7},
  pages     = {30-41},
  ee        = {db/conf/vldb/KemperKW99.html},
  crossref  = {DBLP:conf/vldb/99},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

We propose a new class of algorithms that can be used to speed up the execution of multi-way join queries or of queries that involve one or more joins and a group-by. These new evaluation techniques allow to perform several hash-based operations (join and grouping) in one pass without repartitioning intermediate results. These techniques work particularly well for joining hierarchical structures, e.g., for evaluating functional join chains along key/foreign-key relationships. The idea is to generalize the concept of hash teams as proposed by Graefe et.al [GBC98] by indirectly partitioning the input data. Indirect partitioning means to partition the input data on an attribute that is not directly needed for the next hash-based operation, and it involves the construction of bitmaps to approximate the partitioning for the attribute that is needed in the next hash-based operation. Our performance experiments show that such generalized hash teams perform significantly better than conventional strategies for many common classes of decision support queries.

Copyright © 1999 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Online Paper

Download PDF file (www.vldb.org, Darmstadt, Germany)
Download PDF file (www.acm.org, New York, USA)

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Windows: Click the letter of your CD drive
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Mac: Click here
UNIX/LINUX: mount the DVD and click on the path of your mount point:
/Anthology/aDVD1 or /dvd

Printed Edition

Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez, Stanley B. Zdonik, Michael L. Brodie (Eds.): VLDB'99, Proceedings of 25th International Conference on Very Large Data Bases, September 7-10, 1999, Edinburgh, Scotland, UK. Morgan Kaufmann 1999, ISBN 1-55860-615-7
Contents

References

[Bab79]: Edward Babb: Implementing a Relational Database by Means of Specialized Hardware. ACM Trans. Database Syst. 4(1): 1-29(1979)
[Blo70]: Burton H. Bloom: Space/Time Trade-offs in Hash Coding with Allowable Errors. Commun. ACM 13(7): 422-426(1970)
[Bra84]: Kjell Bratbergsengen: Hashing Methods and Relational Algebra Operations. VLDB 1984: 323-333
[Car75]: Alfonso F. Cardenas: Analysis and Performance of Inverted Data Base Structures. Commun. ACM 18(5): 253-263(1975)
[CI98]: Chee Yong Chan, Yannis E. Ioannidis: Bitmap Index Design and Evaluation. SIGMOD Conference 1998: 355-366
[CKK98]: ...
[CM95]: Sophie Cluet, Guido Moerkotte: Efficient Evaluation of Aggregates on Bulk Types. DBPL 1995: 8
[CS89]: Walter W. Chang, Hans-Jörg Schek: A Signature Access Method for the Starburst Database System. VLDB 1989: 145-153
[CS94]: Surajit Chaudhuri, Kyuseok Shim: Including Group-By in Query Optimization. VLDB 1994: 354-366
[GBC98]: Goetz Graefe, Ross Bunker, Shaun Cooper: Hash Joins and Hash Teams in Microsoft SQL Server. VLDB 1998: 86-97
[GO95]: Patrick E. O'Neil, Goetz Graefe: Multi-Table Joins Through Bitmapped Join Indices. SIGMOD Record 24(3): 8-11(1995)
[Gra93]: Goetz Graefe: Query Evaluation Techniques for Large Databases. ACM Comput. Surv. 25(2): 73-170(1993)
[Gra94]: Goetz Graefe: Sort-Merge-Join: An Idea Whose Time Has(h) Passed? ICDE 1994: 406-417
[GSE+94]: Jim Gray, Prakash Sundaresan, Susanne Englert, Kenneth Baclawski, Peter J. Weinberger: Quickly Generating Billion-Record Synthetic Databases. SIGMOD Conference 1994: 243-252
[HCLS97]: Laura M. Haas, Michael J. Carey, Miron Livny, Amit Shukla: Seeking the Truth About ad hoc Join Costs. VLDB J. 6(3): 241-256(1997)
[HM97]: Sven Helmer, Guido Moerkotte: Evaluation of Main Memory Join Algorithms for Joins with Set Comparison Join Predicates. VLDB 1997: 386-395
[HR96]: Evan P. Harris, Kotagiri Ramamohanarao: Join Algorithm Costs Revisited. VLDB J. 5(1): 64-84(1996)
[HWM98]: Sven Helmer, Till Westmann, Guido Moerkotte: Diag-Join: An Opportunistic Join Algorithm for 1:N Relationships. VLDB 1998: 98-109
[Lar97]: ...
[O'N87]: Patrick E. O'Neil: Model 204 Architecture and Performance. HPTS 1987: 40-59
[OQ97]: Patrick E. O'Neil, Dallan Quass: Improved Query Performance with Variant Indexes. SIGMOD Conference 1997: 38-49
[RRS91]: ...
[Sha86]: Leonard D. Shapiro: Join Processing in Database Systems with Large Main Memories. ACM Trans. Database Syst. 11(3): 239-264(1986)
[TPC95]: ...
[Ull89]: Jeffrey D. Ullman: Principles of Database and Knowledge-Base Systems, Volume II. Computer Science Press 1989, ISBN 0-7167-8162-X
Contents
[VG84]: Patrick Valduriez, Georges Gardarin: Join and Semijoin Algorithms for a Multiprocessor Database Machine. ACM Trans. Database Syst. 9(1): 133-161(1984)
[WB98]: Ming-Chuan Wu, Alejandro P. Buchmann: Encoded Bitmap Indexing for Data Warehouses. ICDE 1998: 220-230
[VL94]: Weipeng P. Yan, Per-Åke Larson: Performing Group-By before Join. ICDE 1994: 89-100