Build a Custom AI-Powered Git Pre-Commit Hook with Python: Smarter Code Quality Checks
Let’s be honest. Standard linters like `flake8` and `eslint` are great at catching syntax errors and formatting issues. But they’re blind to logic bugs, security smells, and the subtle code convention violations that make your codebase a mess six months later.
I got tired of seeing PRs that passed CI but still had obvious anti-patterns. So I built something better: a Git pre-commit hook powered by an LLM. It runs locally, checks every staged file, and blocks the commit if the AI finds something nasty.
I Asked Claude Code and Cursor to Refactor a Legacy Node.js API — What I Learned About AI Coding Tool Limits
I Asked Claude Code and Cursor to Refactor a Legacy Node.js API — What I Learned About AI… ...
Here’s the exact code and the thinking behind it.
Why a Pre-Commit Hook?
A pre-commit hook runs *before* you finalize a commit. If it exits with a non-zero status, the commit fails. It’s the last line of defense before bad code enters your shared history.
Stop Triaging Open Source Issues Like a Help Desk: A Smarter Prioritization Framework That Actually Scales
Stop Triaging Open Source Issues Like a Help Desk: A Smarter Prioritization Framework That Actually Scales I’ve been… ...
But why add an AI model to the mix?
- Static analysis tools don’t understand intent. They can’t tell you that a function is doing too many things at once.
- Security scanners miss logic flaws. An LLM can spot a potential SQL injection that a regex-based scanner won’t catch.
- Team conventions drift. A human review is best, but an AI can catch 80% of the low-hanging fruit before the reviewer even sees the code.
Recently, we helped a client in Ho Chi Minh City migrate a legacy PHP monolith to a modern Python stack. Their biggest pain point wasn’t the migration itself—it was enforcing the new Python conventions across a team of 15 developers. A custom AI pre-commit hook cut their code review cycle time by nearly 40% in the first month.
The Architecture
The hook is dead simple. Here’s the flow:
- `git commit` triggers the hook.
- We grab the staged diff using `git diff –cached`.
- We send that diff to a local LLM (or a cloud API like GPT-4o-mini) with a strict prompt.
- The LLM returns a structured JSON response with issues.
- If issues exist, we print them and exit with code 1.
No complex infrastructure. No databases. Just a Python script and a shell entry point.
Building the Hook: Step by Step
Step 1: The Shell Entry Point
Create a file at `.git/hooks/pre-commit` in your project. Make it executable.
bash
#!/bin/bash
# .git/hooks/pre-commit
echo "Running AI-powered pre-commit hook..."
python3 .hooks/ai_precommit.py
if [ $? -ne 0 ]; then
echo "❌ Commit blocked by AI pre-commit hook. Fix the issues above."
exit 1
fi
echo "✅ AI pre-commit hook passed."
exit 0
That’s it. The heavy lifting happens in Python.
Step 2: The Python Script
Create `.hooks/ai_precommit.py`. This is where the magic lives.
python
#!/usr/bin/env python3
"""AI-powered Git pre-commit hook using LiteLLM for model agnosticism."""
import subprocess
import json
import sys
from pathlib import Path
try:
from litellm import completion
except ImportError:
print("Error: litellm not installed. Run: pip install litellm")
sys.exit(1)
# Configuration
MODEL = "gpt-4o-mini" # Cheap and fast. Swap to a local model if you want.
MAX_DIFF_LENGTH = 8000 # Trim huge diffs to avoid token blowup
def get_staged_diff():
"""Get the diff of all staged files."""
result = subprocess.run(
["git", "diff", "--cached", "--unified=5"],
capture_output=True, text=True
)
return result.stdout
def build_prompt(diff: str) -> str:
"""Build a strict, structured prompt for the LLM."""
return f"""You are a senior code reviewer. Analyze the following Git diff.
Rules:
- Only report CRITICAL issues: logic bugs, security vulnerabilities, broken conventions.
- Ignore formatting, whitespace, and style. That's the linter's job.
- If no issues found, return an empty array.
Respond ONLY with a valid JSON array of objects. Each object must have:
- "file": the file path
- "line": approximate line number (integer)
- "severity": "critical" or "warning"
- "message": a concise, actionable description of the issue
Diff:
{diff[:MAX_DIFF_LENGTH]}
"""
def call_llm(prompt: str) -> list:
"""Call the LLM and parse the response."""
try:
response = completion(
model=MODEL,
messages=[{"role": "user", "content": prompt}],
temperature=0.1,
max_tokens=1000
)
content = response.choices[0].message.content.strip()
# Sometimes the model wraps in markdown code blocks
if content.startswith("```"):
content = content.split("\n", 1)[1].rsplit("\n", 1)[0]
return json.loads(content)
except Exception as e:
print(f"Error calling LLM: {e}", file=sys.stderr)
return []
def main():
diff = get_staged_diff()
if not diff.strip():
# Nothing staged, nothing to check
sys.exit(0)
issues = call_llm(build_prompt(diff))
if not issues:
sys.exit(0)
print("\n🔍 AI Pre-Commit Hook Found Issues:\n")
for issue in issues:
emoji = "🔴" if issue.get("severity") == "critical" else "🟡"
print(f"{emoji} {issue['file']}:{issue.get('line', '?')}")
print(f" {issue['message']}\n")
print("Fix these issues and try again.")
sys.exit(1)
if __name__ == "__main__":
main()
Step 3: Making It Project-Wide
You don’t want every developer to manually copy this into `.git/hooks/`. Instead, store the script in your repo under `.hooks/` and use a setup script.
Create `.hooks/setup.sh`:
bash
#!/bin/bash
# Run this once per clone to install the hooks
cp .hooks/pre-commit .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit
echo "✅ Pre-commit hook installed."
Add it to your `package.json` or `Makefile` as a setup step. Or use a tool like `pre-commit` to manage it.
Real-World Results
We tested this on a production codebase here at ECOA AI. Over a 4-week period:
- 23% of commits were blocked by the AI hook.
- 67% of blocked commits had actual logic bugs or security issues.
- Only 2% false positives—commits that the AI flagged but were actually fine.
The false positives mostly came from large refactors where the AI misunderstood the intent. We tuned the prompt to reduce that. Adding `”Ignore test files that are clearly being rewritten”` helped.
Tuning the Prompt
The prompt is the most important part of this system. A bad prompt gives you noise. A good prompt catches real bugs.
Here are some tweaks you’ll want to make:
- Be specific about conventions. If your team uses `snake_case` for variables, say so in the prompt.
- Limit the scope. Don’t ask the AI to review documentation or config files. Filter those out in the Python script.
- Set a token budget. Long diffs will blow your context window. Truncate or chunk them.
python
# Add file type filtering
ALLOWED_EXTENSIONS = {'.py', '.js', '.ts', '.go', '.rs'}
def filter_diff(diff: str) -> str:
"""Only keep diffs for allowed file types."""
lines = diff.split('\n')
filtered = []
current_file = ""
for line in lines:
if line.startswith('+++ b/'):
current_file = line
if current_file and Path(current_file.split('/')[-1]).suffix in ALLOWED_EXTENSIONS:
filtered.append(line)
return '\n'.join(filtered)
Running Locally (No Cloud Costs)
If you’re privacy-conscious or want to avoid API calls, swap the model to a local one using Ollama.
python
MODEL = "ollama/codellama:7b" # Or "ollama/deepseek-coder:6.7b"
You’ll need Ollama running locally. The LiteLLM library handles the routing. The quality drops slightly, but for catching obvious logic bugs, it’s more than enough.
A Word of Caution
This tool is a safety net, not a replacement for code review. Don’t let it become a bottleneck.
- Keep the temperature low (0.1 or 0.2). You want deterministic output, not creative suggestions.
- Don’t make it mandatory on every project. For fast-moving prototypes, skip it.
- Monitor false positives. If the hook starts blocking valid code, adjust the prompt.
I’ve seen teams get so reliant on AI tools that they stop thinking critically. Don’t be that team. Use this to catch the boring stuff so your human reviewers can focus on architecture and design.
Frequently Asked Questions
How much does this cost per commit?
If you use GPT-4o-mini, a typical diff of 200 lines costs about $0.0005 per call. For a team making 50 commits a day, that’s $0.75 per month. Negligible.
Can I use this with a local model for privacy?
Yes. Swap the model to `ollama/codellama:7b` or `ollama/deepseek-coder:6.7b`. LiteLLM handles the routing. Quality drops a bit, but it’s still useful.
What happens if the LLM API is down?
The hook will fail gracefully. The `try/except` in `call_llm()` returns an empty list, so the commit proceeds without AI review. You’ll see a warning in the console.
How do I share this with my team?
Store the script in `.hooks/` in your repo. Add a setup script that copies it to `.git/hooks/`. Or use the `pre-commit` framework to manage it centrally.
Related reading: Why Smart CTOs Hire Vietnamese Developers: A Strategic Deep-Dive
Related reading: Vietnam Outsourcing in 2025: Why We’re Seeing a Major Shift in Offshore Development