CIDR Proceedings

NL2SQL is a solved problem... Not!

Authors:

Avrilia Floratou, Fotis Psallidas, Fuheng Zhao, Shaleen Deep, Gunther Hagleither, Joyce Cahoon, Rana Alotaibi, Jordan Henkel, Abhik Singla, Alex van Grootel, Kai Deng, Katherine Lin, Marcos Campos, Venkatesh Emani, Vivek Pandit, Wenjing Wang, Carlo Curino

Abstract

The development of natural language (NL) interfaces for databases has been notably shaped by the rise of Large Language Models (LLMs), which provide an easy way to automate the translation of NL queries into structured SQL queries. While LLMs bring valuable technical advancements, this paper stresses that achieving Enterprise-Grade NL2SQL is still far from being resolved, necessitating extensive novel research in various domains. We present insights from two competing teams dedicated to delivering reliable enterprise-grade NL2SQL technology, shedding light on challenges faced in real-world applications, including handling complex schemata, dealing with ambiguity in natural language statements, and accounting for that ambiguity in our benchmarking methodologies and responsible AI considerations. While this paper may raise more questions than it answers, its aim is to act as a catalyst for a fruitful discussion on the topic. Additionally, it provides a practical pathway for the community to develop enterprise-grade NL2SQL solutions.