Panda: Performance Debugging for Databases using LLM Agents
Abstract
Debugging a performance issue in databases is notoriously hard. Wouldn’t it be convenient if there exists an oracle or a co-pilot for every database system which users can query in natural language (NL) — ‘what’s wrong?’, or even better— ‘How to fix it?’. Large Language Models (LLMs), like ChatGPT, seem to be a natural surrogate to this oracle given their ability to answer a wide range of questions by efficiently encoding vast amount of knowledge for e.g., a major chunk of the internet. However, prompting ChatGPT with database performance queries often results in ‘technically correct’ but highly ‘vague’ or ‘generic’ recommendations typically rendered useless and untrustworthy by experienced Database Engineers (DBEs).