Searching Large Lexicons for Partially Specified Terms using Compressed Inverted Files.
Justin Zobel, Alistair Moffat, Ron Sacks-Davis:
Searching Large Lexicons for Partially Specified Terms using Compressed Inverted Files.
VLDB 1993: 290-301
@inproceedings{DBLP:conf/vldb/ZobelMS93,
author = {Justin Zobel and
Alistair Moffat and
Ron Sacks-Davis},
editor = {Rakesh Agrawal and
Se{\'a}n Baker and
David A. Bell},
title = {Searching Large Lexicons for Partially Specified Terms using
Compressed Inverted Files},
booktitle = {19th International Conference on Very Large Data Bases, August
24-27, 1993, Dublin, Ireland, Proceedings},
publisher = {Morgan Kaufmann},
year = {1993},
isbn = {1-55860-152-X},
pages = {290-301},
ee = {db/conf/vldb/ZobelMS93.html},
crossref = {DBLP:conf/vldb/93},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Abstract
There are many advantages to be gained by storing the lexicon of a full text database in main memory.
In this paper we describe how to use a compressed inverted file index to search such a lexicon for entries that match a pattern or partially specified term.
This method provides an effective compromise between speed and space, running orders of magnitude faster than brute force search, but requiring less memory than other pattern-matching data structures; indeed, in some cases requiring less memory than would be consumed by a single pointer to each string.
The pattern search method is based on text indexing techniques and is a successful adaptation of inverted files to main memory databases.
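The general idea of searching a lexicon through an inverted file can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not spell out the indexing unit or the compression code, so the use of character digrams with `$` boundary markers, the variable-byte coding of gaps, and all function names (`build_index`, `search`, `vbyte_encode`, `vbyte_decode`) are assumptions made for the example. Patterns are assumed to use `*` as the only wildcard.

```python
import re
from collections import defaultdict

N = 2  # gram length (assumed for this sketch)

def grams(fragment, n=N):
    """Return the set of n-grams in a fragment (empty if it is too short)."""
    return {fragment[i:i + n] for i in range(len(fragment) - n + 1)}

def vbyte_encode(numbers):
    """Variable-byte code: 7 data bits per byte, high bit set on the last byte."""
    out = bytearray()
    for x in numbers:
        while x >= 128:
            out.append(x & 0x7F)
            x >>= 7
        out.append(x | 0x80)
    return bytes(out)

def vbyte_decode(data):
    numbers, x, shift = [], 0, 0
    for b in data:
        if b & 0x80:  # last byte of this number
            numbers.append(x | ((b & 0x7F) << shift))
            x, shift = 0, 0
        else:
            x |= b << shift
            shift += 7
    return numbers

def build_index(lexicon, n=N):
    """Map each n-gram to a compressed list of gaps between term numbers."""
    postings = defaultdict(list)
    for term_no, term in enumerate(lexicon):
        for g in grams("$" + term + "$", n):  # "$" marks the term boundaries
            postings[g].append(term_no)       # term numbers arrive in ascending order
    index = {}
    for g, ids in postings.items():
        gaps = [ids[0]] + [b - a for a, b in zip(ids, ids[1:])]
        index[g] = vbyte_encode(gaps)
    return index

def search(pattern, lexicon, index, n=N):
    """Return lexicon entries matching a pattern in which "*" is a wildcard."""
    fragments = ("$" + pattern + "$").split("*")
    needed = set().union(*(grams(f, n) for f in fragments))
    candidates = None
    for g in needed:
        if g not in index:
            return []  # a required gram occurs in no term
        ids, total = set(), 0
        for gap in vbyte_decode(index[g]):
            total += gap  # undo the gap coding
            ids.add(total)
        candidates = ids if candidates is None else candidates & ids
    if candidates is None:
        candidates = range(len(lexicon))  # no usable gram: fall back to brute force
    # Sharing every gram does not guarantee a match, so verify each candidate.
    regex = re.compile(".*".join(re.escape(p) for p in pattern.split("*")))
    return [lexicon[i] for i in sorted(candidates) if regex.fullmatch(lexicon[i])]

lexicon = ["lab", "label", "labor", "slab", "table"]
index = build_index(lexicon)
print(search("lab*", lexicon, index))
print(search("*ab*e", lexicon, index))
```

In this sketch the index only narrows the lexicon to a candidate set; the final check against the pattern is still needed, because a term can contain every required gram without matching the pattern as a whole.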
Copyright © 1993 by the VLDB Endowment.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.
Printed Edition
Rakesh Agrawal, Seán Baker, David A. Bell (Eds.):
19th International Conference on Very Large Data Bases, August 24-27, 1993, Dublin, Ireland, Proceedings.
Morgan Kaufmann 1993, ISBN 1-55860-152-X
Copyright © Tue Mar 16 02:22:03 2010
by Michael Ley (ley@uni-trier.de)