Volume 15, No. 11

CodexDB: Synthesizing Code for Query Processing from Natural Language Instructions using GPT-3 Codex

Authors:
Immanuel Trummer (Cornell)*

Abstract

CodexDB enables users to customize SQL query processing via natural language instructions. It is a framework built on OpenAI's GPT-3 Codex model, which translates text into code. CodexDB decomposes complex SQL queries into a series of simple processing steps, each described in natural language. These steps are enriched with user-provided instructions and descriptions of database properties, and Codex translates the resulting text into query processing code. An early prototype of CodexDB generates correct code for up to 81% of queries on the WikiSQL benchmark and up to 62% on the SPIDER benchmark.
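To make the pipeline described in the abstract more concrete, the following is a minimal sketch of how a prompt might be assembled from processing steps, user instructions, and database properties. It is an illustration under stated assumptions, not CodexDB's actual implementation: the function name `build_prompt`, the prompt layout, and the example table are all hypothetical, and the call to the code-generating model is deliberately left out.

```python
from typing import Dict, List


def build_prompt(steps: List[str], instructions: str,
                 tables: Dict[str, List[str]]) -> str:
    """Assemble a text prompt for a code-generating model.

    Hypothetical layout: the abstract does not specify the prompt format.
    """
    lines = ['"""']
    # Describe database properties: one line per table, listing its columns.
    for table, columns in tables.items():
        lines.append(f"Table {table} with columns {', '.join(columns)}.")
    lines.append("Processing steps:")
    # Number the natural-language steps derived from the SQL query.
    for i, step in enumerate(steps, start=1):
        lines.append(f"{i}. {step}")
    # Append the user-provided customization as a final instruction.
    lines.append(f"User instructions: {instructions}")
    lines.append('"""')
    return "\n".join(lines)


# Example: steps a decomposition of
# "SELECT COUNT(*) FROM singers WHERE age > 30" might yield (assumed).
example_prompt = build_prompt(
    steps=[
        "Load data for table singers.",
        "Filter rows where age is greater than 30.",
        "Count the remaining rows.",
    ],
    instructions="Print a progress update after each step.",
    tables={"singers": ["name", "age"]},
)
print(example_prompt)
```

In this sketch the prompt is wrapped in a docstring-style comment so that a code model would continue with code implementing the listed steps; that choice is an assumption made for illustration.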

PVLDB is part of the VLDB Endowment Inc.

Privacy Policy