UBC Theses and Dissertations

Workload-aware SQL query recommendation using retrieval-augmented generation

Soltan Aghai, Ehsan

Abstract

Writing effective SQL queries remains a major obstacle for non-expert users who need to explore and analyze data. While recent advances in deep learning have enabled limited forms of SQL query recommendation, existing systems typically focus on predicting partial query structures or rely on schema-specific features. In this thesis, we present a retrieval-augmented generation (RAG) framework for recommending full SQL queries based solely on previous user queries in a session. At the core of our approach is a transition-aware dual encoder trained to retrieve the most likely next query template by capturing both semantic similarity and structural transitions across queries. This retrieval is followed by a language model that generates the full SQL query, conditioned on the retrieved template and the recent query history. Our method requires no access to the database schema and adapts naturally to evolving workloads. Compared to traditional rule-based or collaborative recommendation systems, it offers a more flexible and interpretable solution that models user behavior over time. Empirical results show that our system produces contextually relevant queries, improving usability for users with limited SQL experience.
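
The two-stage flow described in the abstract (retrieve a likely next query template, then condition a language model on that template and the recent history) can be illustrated with a minimal Python sketch. This is not the thesis implementation. The trained transition-aware dual encoder is replaced here by a toy bag-of-tokens cosine similarity plus a hand-supplied transition bonus, and the language model call is left out, so the sketch stops at building a prompt. All names (templatize, embed, retrieve_next_template, build_prompt, the alpha weight, the prompt wording) are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def templatize(sql):
    """Crude template extraction (assumption): mask string and numeric literals."""
    sql = " ".join(sql.lower().split())
    sql = re.sub(r"'[^']*'", "<STR>", sql)
    sql = re.sub(r"\b\d+(\.\d+)?\b", "<NUM>", sql)
    return sql


def embed(text):
    """Toy stand-in for a learned encoder: a bag of lowercase tokens."""
    return Counter(re.findall(r"[a-z_<>*]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_next_template(history, templates, transitions, alpha=0.5):
    """Score each candidate by semantic similarity to the session history
    plus a weighted bonus for the (last template -> candidate) transition."""
    last = templatize(history[-1])
    context = embed(" ".join(history))

    def score(candidate):
        semantic = cosine(context, embed(candidate))
        structural = transitions.get((last, candidate), 0.0)
        return semantic + alpha * structural

    return max(templates, key=score)


def build_prompt(history, template, k=3):
    """Assemble the text a language model would be conditioned on.
    Only the last k session queries and the retrieved template are used;
    no database schema is included."""
    recent = "\n".join(history[-k:])
    return (
        "Recent queries in this session:\n"
        f"{recent}\n\n"
        f"Likely template for the next query:\n{template}\n\n"
        "Write the next full SQL query:"
    )


# Hypothetical session and template pool, for illustration only.
history = ["SELECT * FROM orders WHERE id = 5"]
templates = [
    "select * from orders where id = <NUM>",
    "select count(*) from orders where status = <STR>",
]
# Hypothetical transition strengths, e.g. estimated from past workloads.
transitions = {(templatize(history[-1]), templates[1]): 1.0}

chosen = retrieve_next_template(history, templates, transitions)
prompt = build_prompt(history, chosen)
```

In this toy setup the transition bonus lets a structurally likely follow-up outrank the template that is merely most similar to the last query, which is the intuition behind combining semantic similarity with structural transitions. In a real system both signals would come from the trained encoder rather than from token overlap and a lookup table.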

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International