Adaptive data transformations for QaaS
Abstract
In today’s data-driven landscape, organizations hold large amounts of semi-structured data from which they want quick insights. Query-as-a-Service (QaaS) systems are well suited to this use case, as the user can query the data in situ without spinning up or maintaining any dedicated infrastructure. However, these systems lack traditional database structures (e.g., indexes), which makes their cost and latency suboptimal in many cases. As a remedy, we show how QaaS can use simple data transformations (transcoding, chunking, sorting), executed by serverless functions (FaaS), to automatically modify input data based on data characteristics and the incoming queries, improving latency and cost. We envision a cloud service that acts as middleware between the user and QaaS to apply such transformations, where customers can opt in and specify their latency and cost budgets. Finally, we evaluate the trade-offs of these transformations.