QueryGPT-ADK

QueryGPT-ADK is an open-source, multi-agent system for natural language to SQL query generation and explanation. It leverages LLMs and vector search to help users convert natural language questions into SQL queries, explain them, and validate them, making database analytics accessible to everyone.

🚀 Demo

QueryGPT-ADK in action:

Enter a natural language question (e.g., "Show me the top 5 medicines by usage")
Authenticate via registration/login
See the generated SQL, explanation, and results
Handles rate limiting, account lockout, and error responses

➡️ Watch the demo video below to see QueryGPT-ADK in action:

➡️ See Usage Examples below for API calls and more!

Features

Natural Language to SQL: Converts user questions into SQL queries using LLM-based agents.
Multi-Agent Architecture: Modular agents for query formation, explanation, table/column selection, and validation.
Automatic Schema Extraction & Table Description:
- New agent (sql_table_describer_agent) for extracting and describing SQL table schemas, including sub-agents for table-level tasks.
- Includes a script to auto-generate JSON documentation of all tables, views, and enums from your database (generate_tables_json.py).
Robust SQL Query Validation: Ensures only valid SELECT statements are executed, supporting both MySQL and PostgreSQL.
Vector Search Integration: Uses Qdrant for semantic search over sample queries and table schemas.
Validation: Ensures only valid SELECT queries are generated and executed.
Production-Ready Security:
- Redis-backed rate limiting (per-IP, distributed)
- Account lockout after repeated failed logins (also Redis-backed)
- JWT authentication, password hashing, and best-practice error handling
Extensible & Modular:
- Models organized in models/, infra in infra/, and all middlewares in middleware/
- Easy to add new agents, tools, or infrastructure
Next.js UI:
- Built with Next.js and TypeScript
- Provides a user-friendly interface for entering queries and viewing generated SQL and results
- Interactive UI for entering and submitting natural language queries to various databases.
LLM Provider Flexibility: Supports Gemini (default), Ollama, OpenAI, or any LLM provider by configuring environment variables in .env. See .env.example for details.
Sample Data: Includes sample queries and table schemas for demonstration and testing.

Alembic Migrations, Linting, and Formatting

Database schema changes are managed using Alembic migrations. See alembic/ for migration scripts.
Code is formatted and linted using ruff. Run make lint and make format to check and auto-fix code style.

Changelog

2025-05-23

Added sql_table_describer_agent for automated SQL table schema extraction and description.
Introduced backend/scripts/generate_tables_json.py for generating up-to-date JSON documentation of all database tables, views, and enums.
Enhanced SQL query validation tools to support both MySQL and PostgreSQL, with improved error handling.
Updated configuration and helper utilities for multi-database and LLM support.

2025-05-20

PostgreSQL Support: Added support for PostgreSQL alongside MySQL for query validation.
SQL Validation Improvements: Cleaned up JSON formatting and simplified SQL syntax validation in query formation prompts.
Requirements Update: Specified psycopg2 package version as 2.9.10 in requirements.txt.
Model/Migration Cleanup: Removed LLMCredential model and related migrations.

Installation

Prerequisites

Python 3.9+
Qdrant vector database
Redis (for rate limiting and account lockout)
Next.js (for UI)
TypeScript (for UI)
(Optional) Gemini, Ollama, OpenAI, or any other LLM API (configurable via .env)

Setup

Clone the repository:

git clone https://github.com/Shrijeeth/QueryGPT-ADK.git
cd QueryGPT-ADK

Install dependencies (Backend):

cd backend
python -m pip install -r requirements.txt

Set up environment variables (Backend):

cd backend
cp .env.example .env

Edit .env and set your database and Redis URLs as needed. Example:

POSTGRES_DB_URL=postgresql+asyncpg://user:password@localhost:5432/querygpt
REDIS_URL=redis://localhost:6379/1

Generate up-to-date table and enum documentation (Backend):

cd backend
python -m scripts.generate_tables_json

This command will connect to your configured database and auto-generate the latest tables, views, and enums documentation in backend/scripts/data/tables.json. Before running this command, make sure your database is up and running and the connection string in .env is correct for validation database. You also need LLM provider and API key for this command to work. Set SYNTHETIC_DATA_LLM_MODEL and SYNTHETIC_DATA_LLM_API_KEY in .env for this to work.

Set up environment variables (Frontend):

cd frontend
cp .env.example .env

Edit .env and set your API base URL as needed. Example:

NEXT_PUBLIC_API_BASE_URL=http://localhost:8000

Set up the Frontend (Next.js UI):

cd frontend
npm install
npm run build
npm run dev

This will start the frontend at http://localhost:3000.

Clear Qdrant database (Optional):

cd backend
python scripts/clear_vector_db.py

Load db data into Qdrant:

cd backend
python scripts/add_table_schemas.py
python scripts/add_sample_queries_qdrant.py

(Optional) Use Your Own Data:

To use QueryGPT-ADK with your own data, replace the provided sample_queries.json and tables.json files in the backend/scripts/data/ directory with your own files. Make sure your files follow the same structure as the samples. Then, rerun the data loading scripts:

python scripts/add_table_schemas.py
python scripts/add_sample_queries_qdrant.py

Data File Structure

sample_queries.json

Contains a list of example queries for your database.
Each entry should have a description, query, and type.

Example:

{
  "data": [
    {
      "description": "Check current stock balance",
      "query": "SELECT * FROM stock_balance",
      "type": "SAMPLE_QUERY"
    },
    {
      "description": "Monthly purchase summary",
      "query": "SELECT ...",
      "type": "SAMPLE_QUERY"
    }
  ]
}

tables.json

Contains a list of tables (or views) and their schemas.
Each entry should have a name, description, schema, and type (usually TABLE).

Example:

{
  "data": [
    {
      "name": "medicines",
      "description": "Stores information about each medicine.",
      "schema": "CREATE TABLE medicines (id SERIAL PRIMARY KEY, name VARCHAR(255) NOT NULL, ...);",
      "type": "TABLE"
    },
    {
      "name": "stock_credit",
      "description": "When stock is added (purchase or restock).",
      "schema": "CREATE TABLE stock_credit (id SERIAL PRIMARY KEY, ...);",
      "type": "TABLE"
    }
  ]
}

Ensure your files follow the above structure for successful data loading and agent operation.

Usage

The application is designed to be integrated into a Google ADK, FastAPI, or Streamlit app (frontend coming soon). You can interact with the agents programmatically or extend the project for your use case.

Authentication & API Usage

The /query endpoint requires a Bearer token for authentication. Obtain a token from /token using username/password (see fake_users_db in auth.py for default credentials).

Example: Obtain Token and Query (using curl)

# Get access token
curl -X POST "http://localhost:8000/token" -d "username=testuser&password=secret"
# Use the returned access_token for authenticated query
curl -X POST "http://localhost:8000/query" -H "Authorization: Bearer <access_token>" -H "Content-Type: application/json" -d '{"query": "Show me the top 5 medicines by usage"}'

Usage Examples

1. Register a New User

curl -X POST "http://localhost:8000/register" \
  -H "Content-Type: application/json" \
  -d '{"username": "alice", "password": "secret123", "email": "[email protected]", "full_name": "Alice Smith"}'

2. Log In and Get a JWT Token

curl -X POST "http://localhost:8000/token" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=alice&password=secret123"
# Response: {"access_token": "<JWT>", "token_type": "bearer"}

3. Run a Query (Authenticated)

curl -X POST "http://localhost:8000/query" \
  -H "Authorization: Bearer <JWT>" \
  -H "Content-Type: application/json" \
  -d '{"query": "Show me the top 5 medicines by usage"}'

4. Example Error Responses

Rate Limit Exceeded:
```
{"detail": "Too Many Requests"}
```

Account Lockout:

{"detail": "Account locked due to too many failed login attempts. Try again in 299 seconds."}

Unauthorized:

{"detail": "Incorrect username or password"}

5. Interactive API Docs

Visit http://localhost:8000/docs for Swagger UI (auto-generated, try endpoints in browser)

6. Alembic Migrations & Linting

Generate a new migration after model changes:

cd backend
alembic revision --autogenerate -m "describe_change"
alembic upgrade head

Lint and format code:
```
make lint
make format
```

7. Running with Redis & Database

Make sure Redis and your database are running and accessible (see .env for connection strings).
For distributed deployments, Redis ensures rate limiting and lockout work across all app instances.

8. API Extensibility

Add new endpoints, agents, or tools in backend/ as needed. See middleware/ and infra/ for scalable patterns.

Launching ADK Web UI

Launch the ADK Web UI

From the parent directory of your agent project (e.g. QueryGPT-ADK/):
```
cd backend
adk web agents
```
- Open the URL provided in the terminal (usually http://localhost:8000 or http://127.0.0.1:8000) in your browser.
- In the top-left corner, select your agent (e.g., query_agent) from the dropdown.
- Chat with your agent using the textbox, inspect function calls, and view responses.
Run your Agent from the CLI

You can run your agent directly in the terminal:
```
cd backend
adk run agents/query_agent
```
Use Ctrl+C to exit the agent from the terminal.
Start a Local API Server

To expose your agent as a FastAPI server for local development and integration:
```
cd backend
adk api_server agents
```
This will start a FastAPI server (by default on port 8000), allowing you to send HTTP requests to your agent.

Programmatic Interaction in Python

You can also invoke your agent programmatically:

from agents.query_agent.agent import root_agent

response = root_agent.run(user_input="Show me the top 5 medicines by usage")
print(response)

Agents Overview

query_agent: Delegates tasks to sub-agents for query formation, explanation, table, and column selection.
query_explanation_agent: Finds and explains similar queries related to the user input.
table_agent: Identifies relevant tables based on the user input.
column_selection_agent: Selects appropriate columns for the query.
query_formation_agent: Generates SQL queries based on the user input and available tables and columns.
query_validation_agent: Validates the generated SQL queries for correctness and safety.

Tools

get_similar_queries_tool: Retrieves similar queries from the Qdrant database based on the user input.
get_similar_tables_tool: Retrieves similar tables from the Qdrant database based on the user input.
validate_sql_query_tool: Validates the generated SQL query.

Roadmap

Next.js UI (In Progress)

Build a modern Next.js web app for natural-language-to-SQL interactions. (Completed)
Provide a user-friendly interface for entering queries and viewing generated SQL and results. (Completed)
Display results in interactive, filterable tables. (To Do)
Add authentication, query history, and user profile management. (In Progress)
Integrate responsive design and accessibility best practices. (To Do)
Enable seamless communication with the FastAPI backend via REST API. (Completed)

FastAPI Backend (In Progress)

Develop REST API endpoints for query submission, execution, and validation. (Completed)
Implement CORS, rate limiting, and token-based authentication. (Completed)
Establish database connectivity and connection pooling. (Completed)
Provide interactive OpenAPI documentation via Swagger UI. (Completed)
Write unit and integration tests for API endpoints. (To Do)

BYOK & Model Provider Support (Completed)

Add support for BYOK (Bring Your Own Key/Model) to allow users to plug in their preferred LLM provider (e.g., OpenAI, Gemini, Ollama, Azure, local models, etc.).
Provide a configuration interface for selecting and managing model providers.
Ensure compatibility with multiple model APIs and authentication mechanisms.
Add documentation and examples for integrating new model providers.

Importing Custom Databases (To Do)

Build a flexible importer to support connecting and importing schemas from various custom databases (e.g., PostgreSQL, SQLite, Oracle, etc.).
Provide a UI and/or CLI for users to add new database connections.
Auto-detect and map table schemas and relationships from imported databases.
Validate and test queries on imported/custom databases.

Importing Custom Vector Databases (To Do)

Add support for connecting to and importing from various vector databases (e.g., Pinecone, Weaviate, Milvus, Chroma, etc.).
Provide configuration and interface for users to add/manage vector DB connections.
Enable schema and collection import for semantic search and retrieval.
Ensure integration of custom vector DBs with agentic workflow and query tools.

Generate Synthetic Data for a Database (In Progress)

Add support for generating synthetic data for a given database.
Provide a UI and/or CLI for users to generate synthetic data for their own database.
Create an agent that can generate synthetic data queries for a given database.
Add scripts to generate synthetic data and fetch schema for a given database.

Testing (To Do)

Write test cases for agentic workflow (end-to-end and agent interaction tests).

Contribution & Extending Agents

Contributions are welcome! The agent system is modular. New agents and tools can be added by following the existing pattern in backend/agents/. Please open issues or submit pull requests for bug fixes, new features, or improvements.

Fork the repository.
Create your feature branch (git checkout -b feature/your-feature).
Commit your changes (git commit -am 'Add new feature').
Push to the branch (git push origin feature/your-feature).
Open a Pull Request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 66 Commits
.github/workflows		.github/workflows
backend		backend
frontend		frontend
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
docker-compose.yml		docker-compose.yml

License

Shrijeeth/QueryGPT-ADK

Folders and files

Latest commit

History

Repository files navigation

QueryGPT-ADK

🚀 Demo

Features

Alembic Migrations, Linting, and Formatting

Changelog

2025-05-23

2025-05-20

Installation

Prerequisites

Setup

Data File Structure

Usage

Authentication & API Usage

Usage Examples

1. Register a New User

2. Log In and Get a JWT Token

3. Run a Query (Authenticated)

4. Example Error Responses

5. Interactive API Docs

6. Alembic Migrations & Linting

7. Running with Redis & Database

8. API Extensibility

Launching ADK Web UI

Agents Overview

Tools

Roadmap

Next.js UI (In Progress)

FastAPI Backend (In Progress)

BYOK & Model Provider Support (Completed)

Importing Custom Databases (To Do)

Importing Custom Vector Databases (To Do)

Generate Synthetic Data for a Database (In Progress)

Testing (To Do)

Contribution & Extending Agents

License

Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages