An Adaptive Data Placement Scheme for Parallel Database Computer Systems.

Kien A. Hua, Chiang Lee: An Adaptive Data Placement Scheme for Parallel Database Computer Systems. VLDB 1990: 493-506

@inproceedings{DBLP:conf/vldb/HuaL90,
  author    = {Kien A. Hua and
               Chiang Lee},
  editor    = {Dennis McLeod and
               Ron Sacks-Davis and
               Hans-J{\"o}rg Schek},
  title     = {An Adaptive Data Placement Scheme for Parallel Database Computer
               Systems},
  booktitle = {16th International Conference on Very Large Data Bases, August
               13-16, 1990, Brisbane, Queensland, Australia, Proceedings},
  publisher = {Morgan Kaufmann},
  year      = {1990},
  isbn      = {1-55860-149-X},
  pages     = {493-506},
  ee        = {db/conf/vldb/HuaL90.html},
  crossref  = {DBLP:conf/vldb/90},
  bibsource = {DBLP, http://dblp.uni-trier.de}
}

Abstract

The capacity and performance of database management system (DBMS) using a conventional (von Newmann-type) computer are limited by the total I/O channel bandwidth, the aggregate processing power, and the amount of main memory. With the advent of micro-processor, memory, and communication technology, it is econominally feasible to develop a parallel database computer system. The parallel processing techniques are employed to utilize the available resources in a coordinated fashion to solve the DBMS capacity and performance problems. Relations in such an environment are declustered into fragments and spreaded across computers. To achieve the optimal performance in data processing, it is essential for each computer to have a perfectly balanced load (i.e., identical amount of data). However, fragment sizes may vary due to insertions to and deletions from a relation. To retain good performance, the system needs to periodically rebalance data loads among the computers.

In this paper, we present an adaptive data placement scheme which balances computer work loads during query processing. The entire scheme is built on top of the popular grid file structure (but not limited to grid file). The adaptivity of the scheme and its relevant features are discussed. The cost of load rebalancing is estimated. The result shows that under our assumptions, it is always beneficial to perform load rebalancing before performing a join on skewed data.

Copyright © 1990 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Online Paper

Download PDF file (www.vldb.org, Darmstadt, Germany)
Download PDF file (www.acm.org, New York, USA)

ACM SIGMOD Anthology

CDROM Version: Load the CDROM "Volume 1 Issue 5, VLDB '89-'97" and ...

Windows: Click the letter of your CD drive
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Mac: Click here
UNIX/LINUX: mount the CD and click on the path of your mount point:
/Anthology/vldb8997 or /cdrom

DVD Version: Load ACM SIGMOD Anthology DVD 1" and ...

Windows: Click the letter of your CD drive
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Mac: Click here
UNIX/LINUX: mount the DVD and click on the path of your mount point:
/Anthology/aDVD1 or /dvd

Printed Edition

Dennis McLeod, Ron Sacks-Davis, Hans-Jörg Schek (Eds.): 16th International Conference on Very Large Data Bases, August 13-16, 1990, Brisbane, Queensland, Australia, Proceedings. Morgan Kaufmann 1990, ISBN 1-55860-149-X

References

[Ben79]: Jon Louis Bentley: Multidimensional Binary Search Trees in Database Applications. IEEE Trans. Software Eng. 5(4): 333-340(1979)
[Bhi88]: Anupam Bhide: An Analysis of Three Transaction Processing Architectures. VLDB 1988: 339-350
[Cop88]: George P. Copeland, William Alexander, Ellen E. Boughter, Tom W. Keller: Data Placement In Bubba. SIGMOD Conference 1988: 99-108
[Dew84]: David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael Stonebraker, David A. Wood: Implementation Techniques for Main Memory Database Systems. SIGMOD Conference 1984: 1-8
[Dew86]: David J. DeWitt, Robert H. Gerber, Goetz Graefe, Michael L. Heytens, Krishna B. Kumar, M. Muralikrishna: GAMMA - A High Performance Dataflow Database Machine. VLDB 1986: 228-237
[Kit84]: ...
[Kit89]: Masaru Kitsuregawa, Lilian Harada, Mikio Takagi: Join Strategies on KB-Tree Indexed Relations. ICDE 1989: 85-93
[Kri88]: Hans-Peter Kriegel, Bernhard Seeger: PLOP-Hashing: A Grid File without Directory. ICDE 1988: 369-376
[Lak88]: M. Seetha Lakshmi, Philip S. Yu: Effect of Skew on Join Performance in Parallel Architectures. DPDS 1988: 107-120
[Nie84]: Jürg Nievergelt, Hans Hinterberger, Kenneth C. Sevcik: The Grid File: An Adaptable, Symmetric Multikey File Structure. ACM Trans. Database Syst. 9(1): 38-71(1984)
[Ozk88]: Esen A. Ozkarahan, Cem H. Bozsahin: Join Strategies Using Data Space Partitioning. New Generation Comput. 6(1): 19-39(1988)
[Soc79]: Gary H. Sockut, Robert P. Goldberg: Database Reorganization - Principles and Practice. ACM Comput. Surv. 11(4): 371-395(1979)
[Su83]: ...
[Ter88]: ...