
Volume 15, No. 12

OREO: Detection of Cherry-picked Generalizations

Authors:
Yin Lin (University of Michigan)*; Brit Youngmann (MIT); Yuval Moskovitch (University of Michigan); H. V. Jagadish (University of Michigan); Tova Milo (Tel Aviv University)

Abstract

Data analytics often makes sense of large datasets by generalization: aggregating from the detailed data to a more general context. Misleading generalizations can sometimes be drawn from a cherry-picked level of aggregation that obscures substantial subgroups opposing the generalization. Our goal is to detect and explain cherry-picked generalizations by refining the corresponding aggregate queries. We demonstrate OREO, a system that computes a support score for a given statement to quantify the quality of the generalization; that is, whether the aggregated result accurately reflects the data. To make the resulting score easier to interpret, the system also identifies significant counterexamples and alternative statements that better represent the data at hand. We will demonstrate the utility of OREO for investigating generalizations by interacting with VLDB’22 participants, who will use the OREO interface for statement validation and explanation.
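
The idea of checking a generalization by refining its aggregate query can be illustrated with a small sketch. It is not OREO's method: the abstract does not define the support score, so the `toy_support` function below is an assumed stand-in (the fraction of rows that fall in subgroups where the statement still holds), and the data, column layout, and function names are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (department, gender, salary). Not taken from the paper.
ROWS = [
    ("eng", "F", 120), ("eng", "F", 125), ("eng", "F", 130), ("eng", "M", 110),
    ("ops", "F", 50), ("ops", "M", 60), ("ops", "M", 62),
    ("hr", "F", 45), ("hr", "M", 55), ("hr", "M", 57),
]


def avg_by(rows, key_index, value_index=2):
    """Average the value column, grouped by the column at key_index."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row[value_index])
    return {key: mean(values) for key, values in groups.items()}


def statement_holds(rows):
    """Statement under test: the average F salary exceeds the average M salary."""
    avgs = avg_by(rows, 1)
    return "F" in avgs and "M" in avgs and avgs["F"] > avgs["M"]


def toy_support(rows, refine_index=0):
    """Assumed score: share of rows lying in subgroups where the statement holds.

    Also returns the subgroups that contradict the statement (counterexamples).
    """
    subgroups = defaultdict(list)
    for row in rows:
        subgroups[row[refine_index]].append(row)
    agreeing = sum(len(g) for g in subgroups.values() if statement_holds(g))
    counterexamples = sorted(k for k, g in subgroups.items() if not statement_holds(g))
    return agreeing / len(rows), counterexamples


if __name__ == "__main__":
    print("holds at the aggregate level:", statement_holds(ROWS))
    score, contradicting = toy_support(ROWS)
    print("toy support after refining by department:", score)
    print("contradicting departments:", contradicting)
```

In this toy data the statement holds at the aggregate level, yet two of the three departments contradict it once the query is refined by department. That is the kind of obscured subgroup the abstract describes.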

PVLDB is part of the VLDB Endowment Inc.
