
Volume 17, No. 12

Rock: Cleaning Data with both ML and Logic Rules

Authors:
Zian Bao, Binbin Bie, Wenfei Fan, Daji Li, Mengyun Li, Kaiwen Lin, Wei Lin, Peijie Liu, Peng Liu, Zhicong Lv, Mingliang Ouyang, Chenyang Sun, Shuai Tang, Yaoshu Wang, Qiyuan Wei, Xiangqian Wu, Min Xie, Jing Zhang, Zhao Runxiao, Jie Zhu, Yilin Zhu

Abstract

We demonstrate Rock, a system for cleaning relational data. Rock has three distinguishing features: (1) it extends logic rules by embedding machine learning models as predicates, so that it benefits from both ML and logic deduction; (2) it supports entity resolution, conflict resolution, timeliness deduction and missing data imputation in a unified process; and (3) it provides parallelly scalable algorithms for rule discovery, error detection and error correction, in both batch and incremental modes. We will demonstrate Rock's (a) easy-to-use interface, (b) scalability when cleaning large datasets, (c) accuracy in detecting and correcting errors across multiple tables, and (d) applications at banks and HR departments.
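
To make feature (1) concrete, the following is a minimal sketch of a logic rule whose precondition mixes an ML-style predicate with an ordinary equality predicate. It is not Rock's rule syntax or API: the `Rule` class, the predicate names and the sample rows are all hypothetical, and a string-similarity ratio from Python's standard `difflib` stands in for a trained model.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Predicate = Callable[[Row, Row], bool]


def ml_similar_name(t: Row, s: Row, threshold: float = 0.8) -> bool:
    """Hypothetical ML predicate: a similarity score stands in for a trained matcher."""
    score = SequenceMatcher(None, t["name"].lower(), s["name"].lower()).ratio()
    return score >= threshold


def same_phone(t: Row, s: Row) -> bool:
    """Ordinary logic predicate: exact equality on one attribute."""
    return t["phone"] == s["phone"]


def same_city(t: Row, s: Row) -> bool:
    return t["city"] == s["city"]


@dataclass
class Rule:
    """A rule of the assumed form: precondition predicates imply a consequence."""
    name: str
    precondition: List[Predicate]
    consequence: Predicate

    def violations(self, rows: List[Row]) -> List[Tuple[int, int]]:
        """Return index pairs that satisfy the precondition but not the consequence."""
        found = []
        for (i, t), (j, s) in combinations(enumerate(rows), 2):
            if all(p(t, s) for p in self.precondition) and not self.consequence(t, s):
                found.append((i, j))
        return found


# Illustrative data only; not taken from the paper.
rows = [
    {"name": "Jon Smith", "phone": "555-0101", "city": "Leeds"},
    {"name": "John Smith", "phone": "555-0101", "city": "Leads"},
    {"name": "Mary Jones", "phone": "555-0199", "city": "York"},
]

# "Similar names and the same phone imply the same city."
rule = Rule("same_person_same_city", [ml_similar_name, same_phone], same_city)
print(rule.violations(rows))
```

In this sketch the first two rows are flagged as a conflict because the similarity predicate and the phone equality both hold while the city values differ. Treating the model as just another predicate keeps the rule declarative, so the same pairwise check would apply unchanged if the stand-in were replaced by a real classifier.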

PVLDB is part of the VLDB Endowment Inc.
