Efficient Snapshot Differential Algorithms for Data Warehousing.
Wilburt Labio, Hector Garcia-Molina:
Efficient Snapshot Differential Algorithms for Data Warehousing.
VLDB 1996: 63-74
@inproceedings{DBLP:conf/vldb/LabioG96,
author = {Wilburt Labio and
Hector Garcia-Molina},
editor = {T. M. Vijayaraman and
Alejandro P. Buchmann and
C. Mohan and
Nandlal L. Sarda},
title = {Efficient Snapshot Differential Algorithms for Data Warehousing},
booktitle = {VLDB'96, Proceedings of 22th International Conference on Very
Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India},
publisher = {Morgan Kaufmann},
year = {1996},
isbn = {1-55860-382-4},
pages = {63-74},
ee = {db/conf/vldb/LabioG96.html},
crossref = {DBLP:conf/vldb/96},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
Detecting and extracting modifications from information sources
is an integral part of data warehousing. For unsophisticated
sources, in practice it is often necessary to infer modifications
by periodically comparing snapshots of data from the source.
Although this snapshot differential problem is closely related
to traditional joins and outerjoins, there are significant differences,
which lead to simple new algorithms. In particular, we present algorithms
that perform (possibly lossy) compression of records.
We also present a window algorithm that works very well
if the snapshots are not "very different."
The algorithms are studied via analysis and an implementation of two of them;
the results illustrate the potential gains achievable with the new
algorithms.
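To make the snapshot differential problem concrete, here is a minimal Python sketch. It is not one of the paper's algorithms. It assumes each snapshot is a list of (key, value) records with unique keys and that the old snapshot's signatures fit in memory. The names `digest` and `snapshot_diff` are hypothetical. The sketch illustrates the general idea of lossy record compression: only a short hash of each old record is kept, so a hash collision could in principle cause an update to be missed.

```python
import hashlib


def digest(record, nbytes=8):
    """Compress a record to a short fixed-size signature (lossy)."""
    data = repr(record).encode("utf-8")
    return hashlib.sha256(data).digest()[:nbytes]


def snapshot_diff(old_rows, new_rows):
    """Return a list of (operation, key) pairs turning old_rows into new_rows.

    Each row is a (key, value) pair; keys are assumed unique per snapshot.
    """
    # Keep only key -> signature for the old snapshot, not the full records.
    old_sigs = {key: digest(value) for key, value in old_rows}
    changes = []
    for key, value in new_rows:
        old_sig = old_sigs.pop(key, None)
        if old_sig is None:
            changes.append(("insert", key))
        elif old_sig != digest(value):
            changes.append(("update", key))
    # Keys that never appeared in the new snapshot were deleted.
    changes.extend(("delete", key) for key in old_sigs)
    return changes


old = [(1, "alice"), (2, "bob"), (3, "carol")]
new = [(1, "alice"), (2, "robert"), (4, "dave")]
print(snapshot_diff(old, new))
# [('update', 2), ('insert', 4), ('delete', 3)]
```

Unlike a join, the differential must also report records that appear in only one snapshot, which is why the unmatched old keys are emitted as deletions at the end.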
Copyright © 1996 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the VLDB
copyright notice and the title of the publication and
its date appear, and notice is given that copying
is by the permission of the Very Large Data Base
Endowment. To copy otherwise, or to republish, requires
a fee and/or special permission from the Endowment.
Printed Edition
T. M. Vijayaraman, Alejandro P. Buchmann, C. Mohan, Nandlal L. Sarda (Eds.):
VLDB'96, Proceedings of 22th International Conference on Very Large Data Bases, September 3-6, 1996, Mumbai (Bombay), India.
Morgan Kaufmann 1996, ISBN 1-55860-382-4
References
- [AL80]
- Michel E. Adiba, Bruce G. Lindsay:
Database Snapshots.
VLDB 1980: 86-91
- [BDGM95]
- Sergey Brin, James Davis, Hector Garcia-Molina:
Copy Detection Mechanisms for Digital Documents.
SIGMOD Conference 1995: 398-409
- [BGMF88]
- Daniel Barbará, Hector Garcia-Molina, Bernardo Feijoo:
Exploiting Symmetries for Low-Cost Comparison of File Copies.
ICDCS 1988: 471-479
- [CJS+94]
- ...
- [CRGMW96]
- Sudarshan S. Chawathe, Anand Rajaraman, Hector Garcia-Molina, Jennifer Widom:
Change Detection in Hierarchically Structured Information.
SIGMOD Conference 1996: 493-504
- [FWJ86]
- ...
- [Gol95]
- ...
- [HC94]
- Laura M. Haas, Michael J. Carey, Miron Livny, Amit Shukla:
Seeking the Truth About ad hoc Join Costs.
VLDB J. 6(3): 241-256 (1997)
- [HGMW+95]
- Joachim Hammer, Hector Garcia-Molina, Jennifer Widom, Wilburt Labio, Yue Zhuge:
The Stanford Data Warehousing Project.
IEEE Data Eng. Bull. 18(2): 41-48 (1995)
- [HT77]
- James W. Hunt, Thomas G. Szymanski:
A Fast Algorithm for Computing Longest Common Subsequences.
Commun. ACM 20(5): 350-353 (1977)
- [IC94]
- ...
- [KR87]
- Bo Kähler, Oddvar Risnes:
Extending Logging for Database Snapshot Refresh.
VLDB 1987: 389-398
- [Lea86]
- Bruce G. Lindsay, Laura M. Haas, C. Mohan, Hamid Pirahesh, Paul F. Wilms:
A Snapshot Differential Refresh Algorithm.
SIGMOD Conference 1986: 53-60
- [LGM95]
- ...
- [LGM96]
- ...
- [Loh85]
- Guy M. Lohman, C. Mohan, Laura M. Haas, Dean Daniels, Bruce G. Lindsay, Patricia G. Selinger, Paul F. Wilms:
Query Processing in R*.
Query Processing in Database Systems 1985: 31-47
- [ME92]
- Priti Mishra, Margaret H. Eich:
Join Processing in Relational Databases.
ACM Comput. Surv. 24(1): 63-113 (1992)
- [MW94]
- Udi Manber, Sun Wu:
GLIMPSE: A Tool to Search Through Entire File Systems.
USENIX Winter 1994: 23-32
- [SGM95]
- Narayanan Shivakumar, Hector Garcia-Molina:
SCAM: A Copy Detection Mechanism for Digital Documents.
DL 1995: 0-
- [Sha86]
- Leonard D. Shapiro:
Join Processing in Database Systems with Large Main Memories.
ACM Trans. Database Syst. 11(3): 239-264 (1986)
- [Squ95]
- Cass Squire:
Data Extraction and Transformation for the Data Warehouse.
SIGMOD Conference 1995: 446-447
- [Ull89]
- Jeffrey D. Ullman:
Principles of Database and Knowledge-Base Systems, Volume II.
Computer Science Press 1989, ISBN 0-7167-8162-X
- [ZGMHW95]
- Yue Zhuge, Hector Garcia-Molina, Joachim Hammer, Jennifer Widom:
View Maintenance in a Warehousing Environment.
SIGMOD Conference 1995: 316-327