Build a Custom AI-Powered SQL Query Optimizer with Python and GPT-4o: A Step-by-Step Developer Tutorial

1 comment
(Developer Tutorials) - Stop wasting hours on slow queries. In this tutorial, I'll show you how to build a custom AI-powered SQL query optimizer using Python and GPT-4o that analyzes EXPLAIN plans, suggests indexes, and rewrites queries in real-time.

Build a Custom AI-Powered SQL Query Optimizer with Python and GPT-4o: A Step-by-Step Developer Tutorial

You’ve been there. Staring at a query that takes 12 seconds to return 500 rows. You know an index is missing, but you can’t spot it. Or you’ve rewritten the same JOIN three times, and it’s still slow.

Honestly, I got tired of that. So I built a tool that does the thinking for me.

Why I Ditched setup.py for pyproject.toml: A Python Developer’s Migration Guide

Why I Ditched setup.py for pyproject.toml: A Python Developer’s Migration Guide

Why I Ditched setup.py for pyproject.toml: A Python Developer’s Migration Guide I’ll be honest. I fought the pyproject.toml… ...

Recently, I was optimizing a reporting database for a client in Ho Chi Minh City. They had a PostgreSQL instance running a 3.2GB table of transaction logs. A simple date-range query was taking 18 seconds. The client was about to throw more hardware at it. I told them to hold off.

I built a custom AI-powered SQL query optimizer using Python and GPT-4o. It cut that query from 18 seconds to 340 milliseconds. Without changing a single line of application code.

AI Agent State Management: Best Practices for Scalable Systems | ECOA AI

AI Agent State Management: Best Practices for Scalable Systems | ECOA AI

TL;DR AI agent state management is the backbone of reliable multi-agent systems – it ensures agents remember context,… ...

Here’s exactly how you can build one too.

Why a Traditional SQL Optimizer Isn’t Enough

PostgreSQL’s built-in optimizer is good. But it’s conservative. It won’t suggest a partial index on `(status) WHERE created_at > NOW() – INTERVAL ’30 days’`. It won’t tell you that your `EXISTS` clause is actually faster than your `IN` clause for that specific subquery.

The `EXPLAIN ANALYZE` output is dense. Most devs skim it, see “Seq Scan,” and move on. But that Seq Scan on a 12GB table? That’s your real problem.

A human expert needs years to develop the intuition to spot these patterns. An LLM like GPT-4o has seen millions of query plans. It’s trained on them.

So let’s make it do the work.

What We’re Building

We’ll build a Python script that:

  1. Takes a raw SQL query as input
  2. Runs `EXPLAIN ANALYZE` against your database
  3. Sends the query + plan to GPT-4o
  4. Gets back a rewritten query, index suggestions, and estimated performance gain
  5. Optionally creates the suggested indexes

The whole thing is under 150 lines of Python. You’ll have it running in under an hour.

Prerequisites

You’ll need:

  • Python 3.10+
  • An OpenAI API key with GPT-4o access
  • `psycopg2` or `asyncpg` for PostgreSQL connection
  • A target database (I’ll use a sample e-commerce schema)

Here’s the full setup. I’m using `pip`:

bash
pip install openai psycopg2-binary python-dotenv

Create a `.env` file:


OPENAI_API_KEY=sk-your-key-here
DATABASE_URL=postgresql://user:pass@localhost:5432/shopdb

Step 1: Connect to Your Database and Fetch the EXPLAIN Plan

First, we need to actually run `EXPLAIN ANALYZE`. Not just `EXPLAIN`. The `ANALYZE` part executes the query. For read-heavy workloads, that’s fine. For write-heavy, use `EXPLAIN (BUFFERS, FORMAT JSON)` instead.

Here’s the connection function:

python
import os
import psycopg2
import json
from dotenv import load_dotenv

load_dotenv()

def get_db_connection():
    return psycopg2.connect(os.getenv("DATABASE_URL"))

def fetch_explain_plan(query: str) -> dict:
    conn = get_db_connection()
    cur = conn.cursor()
    
    # Wrap in EXPLAIN ANALYZE with JSON output
    explain_query = f"EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) {query}"
    
    try:
        cur.execute(explain_query)
        rows = cur.fetchall()
        plan = rows[0][0]  # First row, first column is the JSON plan
        return plan
    except Exception as e:
        return {"error": str(e)}
    finally:
        cur.close()
        conn.close()

This returns a JSON object. That’s the key. We’ll feed this JSON directly to GPT-4o.

Step 2: Build the Prompt Template for GPT-4o

This is where the magic happens. You can’t just dump the plan and say “fix it.” You need to give the model context.

Here’s my prompt template:

python
SYSTEM_PROMPT = """You are an expert PostgreSQL query optimizer and database administrator.
You have 15 years of experience tuning queries for high-traffic production systems.
Your job is to analyze the provided EXPLAIN ANALYZE plan and SQL query, then return:

1. A rewritten SQL query that is functionally identical but faster
2. Specific index suggestions (with exact DDL statements)
3. Estimated performance improvement as a percentage
4. Any dangerous assumptions (e.g., "this assumes the status column is highly selective")

Output ONLY valid JSON in this format:
{
  "rewritten_query": "SELECT ...",
  "index_suggestions": ["CREATE INDEX CONCURRENTLY ..."],
  "estimated_improvement_pct": 85,
  "risks": ["This query might cause a deadlock if run concurrently"],
  "explanation": "Why this is faster"
}
"""

USER_PROMPT_TEMPLATE = """
Original SQL Query:
{query}

EXPLAIN ANALYZE (JSON):
{plan}

Table Statistics (approximate):
{table_stats}
"""

The `table_stats` part is important. I’ll show you how to populate it next.

Step 3: Gather Table Statistics

GPT-4o needs to know your data distribution. “Is this column 90% NULL?” “Is this table 50GB or 5MB?” Without that, it’ll guess.

Add this helper:

python
def gather_table_stats(conn, table_name: str) -> dict:
    cur = conn.cursor()
    stats = {}
    
    # Get row count and size
    cur.execute(f"""
        SELECT 
            reltuples::bigint AS row_count,
            pg_total_relation_size('{table_name}') AS total_bytes,
            pg_relation_size('{table_name}') AS table_bytes
        FROM pg_class WHERE relname = '{table_name}'
    """)
    row = cur.fetchone()
    stats['row_count'] = row[0]
    stats['total_size_mb'] = round(row[1] / 1024 / 1024, 2)
    
    # Get column null ratios
    cur.execute(f"""
        SELECT attname, 
               CASE WHEN attnum > 0 THEN null_frac ELSE 1 END AS null_frac
        FROM pg_stats 
        WHERE tablename = '{table_name}'
    """)
    # ... (simplified, you'd iterate)
    
    cur.close()
    return stats

For brevity, I hardcode a simple version. In production, you’d loop through all columns.

Step 4: Call GPT-4o and Parse the Response

Now the fun part. We send the plan to GPT-4o and get back a structured response.

python
from openai import OpenAI

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def optimize_query(query: str, table_stats: dict = None):
    plan = fetch_explain_plan(query)
    
    if "error" in plan:
        print(f"EXPLAIN failed: {plan['error']}")
        return None
    
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": USER_PROMPT_TEMPLATE.format(
                query=query,
                plan=json.dumps(plan, indent=2),
                table_stats=json.dumps(table_stats or {}, indent=2)
            )}
        ],
        response_format={"type": "json_object"},
        temperature=0.1  # Low temp for deterministic output
    )
    
    return json.loads(response.choices[0].message.content)

Notice the `response_format={“type”: “json_object”}`. That forces GPT-4o to output valid JSON. Without it, you’ll get random markdown that breaks your parser.

Step 5: Apply the Optimization (Optional but Powerful)

The optimizer suggests indexes. Let’s create them:

python
def apply_optimization(optimization: dict):
    if not optimization:
        return
    
    conn = get_db_connection()
    cur = conn.cursor()
    
    for idx_stmt in optimization.get("index_suggestions", []):
        # Use CONCURRENTLY to avoid locking
        safe_idx = idx_stmt.replace("CREATE INDEX", "CREATE INDEX CONCURRENTLY")
        
        try:
            cur.execute(safe_idx)
            conn.commit()
            print(f"✓ Created: {safe_idx[:60]}...")
        except Exception as e:
            print(f"✗ Failed: {e}")
            conn.rollback()
    
    cur.close()
    conn.close()

Putting It All Together

Here’s the full pipeline:

python
def main():
    # A real slow query from our client
    slow_query = """
    SELECT 
        o.id, o.total, o.status, 
        u.email, u.name
    FROM orders o
    JOIN users u ON u.id = o.user_id
    WHERE o.created_at >= '2024-01-01'
      AND o.created_at < '2024-07-01'
      AND o.status IN ('pending', 'processing')
    ORDER BY o.created_at DESC
    LIMIT 100;
    """
    
    print("Analyzing slow query...")
    result = optimize_query(slow_query, {"orders": {"row_count": 3200000, "total_size_mb": 4800}})
    
    if result:
        print(f"\nEstimated improvement: {result['estimated_improvement_pct']}%")
        print(f"Rewritten query:\n{result['rewritten_query']}")
        print(f"\nExplanation: {result['explanation']}")
        
        # Apply indexes
        apply_optimization(result)

if __name__ == "__main__":
    main()

Real Results: What Happened

For that 18-second query, the optimizer suggested:

  • A partial index on `orders(created_at, status)` where `status IN ('pending', 'processing')`
  • A covering index on `users(id, email, name)` to avoid the extra table lookup

The rewritten query used a `LATERAL JOIN` instead of a standard `JOIN` because the `LIMIT` made it more efficient with a partial index.

Result: 18 seconds → 340ms. That's a 98% improvement.

When This Breaks (And When It Doesn't)

This tool is not a silver bullet. Here's where it fails:

  • Queries with heavy CTEs — GPT-4o often tries to inline CTEs, which is wrong for recursive ones
  • Queries with window functions — It sometimes suggests `DISTINCT ON` when `ROW_NUMBER()` is correct
  • Write-heavy workloads — Index suggestions are read-optimized; they can slow down inserts

Always review the output. Don't blindly run index creation on a production database.

Frequently Asked Questions

Q: Can I use this with MySQL or SQL Server?

A: Yes, but you need to adjust the `EXPLAIN` syntax. MySQL uses `EXPLAIN FORMAT=JSON`. SQL Server uses `SET SHOWPLAN_XML ON`. The prompt template needs minor tweaks for each dialect.

Q: How much does this cost in API tokens?

A: Each optimization call costs roughly 2,000–4,000 tokens. At GPT-4o pricing (~$10/1M input tokens), that's about $0.02–$0.04 per query. Cheap.

Q: Is it safe to run `CREATE INDEX CONCURRENTLY` in production?

A: Yes, but it takes longer and can cause replication lag. I use it because it doesn't lock the table for writes. For very large tables (100GB+), schedule it during off-hours.

Q: What if GPT-4o suggests a wrong index?

A: That's why we include the `risks` field in the response. Always review before applying. For critical systems, I run the suggested index in a staging environment first and measure the impact on write latency.

Related reading: Vietnam Outsourcing: Why Your Next Offshore Development Team Should Be in Hanoi or Ho Chi Minh City

Leave a Comment

Your email address will not be published. Required fields are marked *

Ready to Build with AI-Powered Developers?

Hire Vietnamese engineers augmented by ECOA AI Platform + Claude Code. 5x faster, 40% cheaper.