Volume 15, No. 11

On Shapley Value in Data Assemblage Under Independent Utility

Authors:
Xuan Luo (Simon Fraser University)*, Jian Pei (Simon Fraser University), Zicun Cong (Simon Fraser University), Cheng Xu (Simon Fraser University)

Abstract

The success of data science and machine learning applications relies heavily on the availability of large amounts of data. In many applications, an organization may want to acquire data from many data owners. Facilities such as data marketplaces allow data owners to form coalitions that produce the data assemblage needed by data buyers. To encourage coalitions to produce data, it is critical to allocate revenue to data owners fairly, according to their contributions. Although Shapley fairness and its alternatives have been well explored in the literature to facilitate revenue allocation in data assemblage, computing the exact Shapley value for many data owners and large assembled data sets remains challenging due to the combinatorial nature of the Shapley value. In this paper, we explore the decomposability of utility in data assemblage by formulating the independent utility assumption. We argue that independent utility applies in many settings. Moreover, we identify interesting properties of independent utility and develop fast computation techniques for the exact Shapley value under independent utility. Our experimental results on benchmark data sets show that our new approach not only guarantees the exactness of the Shapley value, but also achieves faster computation by orders of magnitude.
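
To illustrate why exact Shapley computation is combinatorial, the following is a minimal sketch of the textbook definition, which enumerates every coalition for every data owner. It is not the paper's algorithm, and it does not implement the independent utility assumption as the paper formulates it. The owner names, the record sets, and the `coverage` utility are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def exact_shapley(players, utility):
    """Exact Shapley values by brute-force enumeration of coalitions.

    Evaluates the utility on every subset of the other players, so the
    cost grows exponentially with the number of players.
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Standard Shapley weight for coalitions of size k: k!(n-k-1)!/n!
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Hypothetical toy data: each owner holds a set of record identifiers.
owners = {"A": {1, 2}, "B": {2, 3}, "C": {4}}


def coverage(coalition):
    """Assumed toy utility: number of distinct records a coalition assembles."""
    records = set()
    for owner in coalition:
        records |= owners[owner]
    return len(records)


shapley = exact_shapley(list(owners), coverage)
print(shapley)
```

With three owners the loop evaluates only a handful of coalitions, but the number of utility evaluations doubles with each additional owner, which is the cost that the abstract describes as challenging.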

PVLDB is part of the VLDB Endowment Inc.
