This website is under development. If you come accross any issues, please report them to Konstantinos Kanellis (kkanellis@cs.wisc.edu) or Yannis Chronis (chronis@google.com).

Is Perfect Hashing Practical for OLAP Systems?

Authors:
Kevin P Gaffney, Jignesh M Patel
Abstract

A perfect hash function (PHF) maps a set of keys to a range of integers with no collisions. Compared to conventional hash methods, PHFs are attractive for their low space overhead and reduced control flow. Despite their advantages, there has been little investigation into the use of PHFs for online analytical processing (OLAP). This paper is an initial guide to practical perfect hashing for OLAP. We identify several promising applications for PHFs in OLAP and survey their current use in systems and research prototypes. We then evaluate existing PHF approaches and quantify their impact on query performance. Our results are encouraging: in a real OLAP system, PHFs achieve end-to-end speedups of 1.7X and 3.1X for join and aggregate queries, respectively. Nevertheless, there is room for improvement. Future approaches that simultaneously achieve low build time and high probe throughput could offer additional performance increases.