Is FPGA Useful for Hash Joins?
Abstract
Thanks to their fine-grained parallelism and energy efficiency, heterogeneous computing platforms featuring FPGAs are becoming increasingly common in data centers. The hash join is one of the most costly operators in database systems, and accelerating the entire hash join on discrete FPGA platforms has long been explored. More recently, coupled CPU-FPGA architectures have emerged that allow flexible and efficient task placement between the CPU and the FPGA by avoiding the high synchronization overhead of CPU-to-device data copies and the high latency of the on-board PCIe bus. However, the opportunities these architectures bring to hash joins remain under-explored. In this paper, we explore hash join acceleration on such a platform using the OpenCL high-level synthesis design methodology. We quantitatively analyze the performance of different workload placements between the CPU and the FPGA with a roofline model and propose the best design for current hardware. We also point out that memory bandwidth is currently the major obstacle to accelerating hash joins on the FPGA. Accordingly, we forecast the architectural features that future CPU-FPGA platforms will need for database applications.
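
As a rough illustration of the roofline reasoning referred to above, the following minimal sketch bounds the throughput of one join phase by the lower of a compute roof and a memory roof. The function names, the tuple-based units, and all numbers are assumptions chosen for illustration; they are not measurements or parameters from this paper.

```python
def memory_roof(mem_bandwidth_bytes_per_s: float, bytes_per_tuple: float) -> float:
    """Upper bound on tuples/s imposed by memory traffic alone."""
    return mem_bandwidth_bytes_per_s / bytes_per_tuple


def attainable_throughput(peak_tuples_per_s: float,
                          mem_bandwidth_bytes_per_s: float,
                          bytes_per_tuple: float) -> float:
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak_tuples_per_s,
               memory_roof(mem_bandwidth_bytes_per_s, bytes_per_tuple))


def is_memory_bound(peak_tuples_per_s: float,
                    mem_bandwidth_bytes_per_s: float,
                    bytes_per_tuple: float) -> bool:
    """True when the memory roof lies below the compute roof."""
    return memory_roof(mem_bandwidth_bytes_per_s, bytes_per_tuple) < peak_tuples_per_s


if __name__ == "__main__":
    # Hypothetical numbers: a pipeline that could process 1e9 tuples/s,
    # a link that delivers 8e9 bytes/s, and 16 bytes moved per tuple.
    peak, bandwidth, traffic = 1e9, 8e9, 16
    print(attainable_throughput(peak, bandwidth, traffic))
    print(is_memory_bound(peak, bandwidth, traffic))
```

Under these assumed numbers the memory roof lies below the compute roof, which is the kind of situation in which memory bandwidth, rather than FPGA logic, limits the join.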