On the Relative Cost of Sampling for Join Selectivity Estimation.
Peter J. Haas, Jeffrey F. Naughton, Arun N. Swami:
On the Relative Cost of Sampling for Join Selectivity Estimation.
PODS 1994: 14-24@inproceedings{DBLP:conf/pods/HaasNS94,
author = {Peter J. Haas and
Jeffrey F. Naughton and
Arun N. Swami},
title = {On the Relative Cost of Sampling for Join Selectivity Estimation},
booktitle = {Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium
on Principles of Database Systems, May 24-26, 1994, Minneapolis,
Minnesota},
publisher = {ACM Press},
year = {1994},
isbn = {0-89791-642-5},
pages = {14-24},
ee = {http://doi.acm.org/10.1145/182591.182594, db/conf/pods/pods94-14.html},
crossref = {DBLP:conf/pods/94},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
We compare the cost of estimating the selectivity of a "star join" using
sampling procedure t_cross to the cost of simply computing the join and
obtaining the exact answer. Our bounds and approximations for the relative
cost of sampling show how this cost depends on the size of the input
relations, the number of input relations, and the precision criterion used
by the estimation procedure. We also demonstrate the deleterious effect of
dangling tuples and the mixed effect of data skew on the relative cost of
sampling. These results provide insight into when sampling should or should
not be used for join selectivity estimation.
Copyright © 1994 by the ACM,
Inc., used by permission. Permission to make
digital or hard copies is granted provided that
copies are not made or distributed for profit or
direct commercial advantage, and that copies show
this notice on the first page or initial screen of
a display along with the full citation.
Load The ACM SIGMOD Anthology, CDROM Edition, Volume 1-3, PODS '82-'98.
and ...
Load The ACM SIGMOD Anthology, Silver Edition, DVD 1, Proceedings.
and ...
Printed Edition
Proceedings of the Thirteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, May 24-26, 1994, Minneapolis, Minnesota.
ACM Press 1994, ISBN 0-89791-642-5
Contents
[Abstract and Index Terms]
[Full Text in PDF Format, 924 KB]
References
- [DNS92]
- David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri:
Practical Skew Handling in Parallel Joins.
VLDB 1992: 27-40
- [HN+93a]
- Peter J. Haas, Jeffrey F. Naughton, S. Seshadri, Arun N. Swami:
Fixed-Precision Estimation of Join Selectivity.
PODS 1993: 190-201
- [HN+93b]
- ...
- [HS92a]
- Peter J. Haas, Arun N. Swami:
Sequential Sampling Procedures for Query Size Estimation.
SIGMOD Conference 1992: 341-350
- [HS92b]
- ...
- [HOT88]
- Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja:
Statistical Estimators for Relational Algebra Expressions.
PODS 1988: 276-287
- [HOT89]
- Wen-Chi Hou, Gultekin Özsoyoglu, Baldeo K. Taneja:
Processing Aggregate Relational Queries with Hard Time Constraints.
SIGMOD Conference 1989: 68-77
- [HOD91]
- Wen-Chi Hou, Gultekin Özsoyoglu, Erdogan Dogdu:
Error-Constraint COUNT Query Evaluation in Relational Databases.
SIGMOD Conference 1991: 278-287
- [LN89]
- Richard J. Lipton, Jeffrey F. Naughton:
Estimating the Size of Generalized Transitive Closures.
VLDB 1989: 165-171
- [LN90]
- Richard J. Lipton, Jeffrey F. Naughton:
Query Size Estimation by Adaptive Sampling.
PODS 1990: 40-46
- [LNS90]
- Richard J. Lipton, Jeffrey F. Naughton, Donovan A. Schneider:
Practical Selectivity Estimation through Adaptive Sampling.
SIGMOD Conference 1990: 1-11
- [OR86]
- Frank Olken, Doron Rotem:
Simple Random Sampling from Relational Databases.
VLDB 1986: 160-169
- [OR89]
- Frank Olken, Doron Rotem:
Random Sampling from B+ Trees.
VLDB 1989: 269-277
- [ORX90]
- Frank Olken, Doron Rotem, Ping Xu:
Random Sampling from Hash Files.
SIGMOD Conference 1990: 375-386
- [Olk93]
- ...
- [Ses92]
- ...
Copyright © Fri Mar 12 17:19:57 2010
by Michael Ley (ley@uni-trier.de)