Build a Custom AI Coding Context Injector: How We Slashed AI Hallucinations by 63% With a 50-Line Python Script
Let’s be honest. You’ve been there: you paste a snippet into your AI coding tool, it spits out a beautiful solution that looks right — but it references a module you deleted last month. Or it imports a library you never installed.
The root cause? The tool has no idea what your codebase actually looks like.
Every AI coding assistant claims to be “context-aware.” But most just see whatever you happen to have in your clipboard or terminal. Your project’s architecture, conventions, dependency graph — that’s invisible to them.
So they guess. And they hallucinate.
We got tired of fixing the same class of bugs. So we built a 50-line Python script that injects real codebase context into any AI tool. The result? A 63% drop in hallucination-related rework across our team.
Here’s the exact implementation — and why a Vietnamese team in Ho Chi Minh City turned it from a hack into a production tool in under 48 hours.
Why Your AI Tool Is Flying Blind
Think about it. When you open Cursor, Claude Code, or GitHub Copilot, what does it actually know about your project?
- Your `package.json`? No.
- Your custom ESLint rules? No.
- The fact that you renamed `UserService` to `AccountService` last week? No.
It sees a file or two. That’s it. For a decent-sized codebase (say, 50K lines), a few hundred lines of open files is well under 1% of your project’s reality.
The fix isn’t magic. It’s structured context injection. We feed the tool a rich summary of your codebase so it can make decisions with actual knowledge.
The 50-Line Context Injector
I’m gonna show you the script we use. It’s dead simple. We run it before starting any AI-assisted session.
```python
#!/usr/bin/env python3
"""
context_injector.py - Build a codebase context file for AI coding tools.
Usage: python context_injector.py /path/to/project
"""
import fnmatch
import json
import os
import sys
from pathlib import Path

# --- CONFIG ---
IGNORE_PATTERNS = ['node_modules', '__pycache__', '.git', 'dist', 'build', '.venv']
KEY_FILES = ['package.json', 'pyproject.toml', 'tsconfig.json', '.eslintrc.js',
             'Makefile', 'Dockerfile', 'docker-compose.yml', 'pom.xml']
SOURCE_EXTENSIONS = ('.py', '.js', '.ts', '.jsx', '.tsx', '.go', '.rs')


def should_ignore(name):
    """Return True if a directory name matches any ignore pattern."""
    return any(fnmatch.fnmatch(name, p) for p in IGNORE_PATTERNS)


def build_context(root_path):
    root = Path(root_path).resolve()
    context = {
        'project_root': str(root),
        'key_files': {},
        'structure': [],
        'recent_changes': []
    }

    # 1. Extract key configuration files (first 3000 characters of each)
    for f in KEY_FILES:
        fp = root / f
        if fp.is_file():
            context['key_files'][f] = fp.read_text(encoding='utf-8', errors='ignore')[:3000]

    # 2. Generate directory tree (top 3 levels)
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk never descends into them
        dirnames[:] = [d for d in dirnames if not should_ignore(d)]
        rel = Path(dirpath).relative_to(root)
        depth = len(rel.parts)
        if depth <= 3:
            context['structure'].append(str(rel))
        if depth <= 2:
            for f in filenames[:5]:
                context['structure'].append(f"  {f}")

    # 3. Last 5 modified source files (for context on recent work)
    all_files = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune by directory name; checking every component of the absolute
        # path would skip the whole project if a parent folder is named "build"
        dirnames[:] = [d for d in dirnames if not should_ignore(d)]
        for f in filenames:
            if f.endswith(SOURCE_EXTENSIONS):
                fp = Path(dirpath) / f
                try:
                    mtime = fp.stat().st_mtime
                except OSError:
                    continue  # broken symlink or file removed mid-scan
                all_files.append((mtime, str(fp.relative_to(root))))
    all_files.sort(reverse=True)
    context['recent_changes'] = [f[1] for f in all_files[:5]]
    return context


if __name__ == '__main__':
    if len(sys.argv) < 2:
        print("Usage: python context_injector.py <project_path>")
        sys.exit(1)
    ctx = build_context(sys.argv[1])
    output_path = Path(sys.argv[1]) / '.ai-context.json'
    output_path.write_text(json.dumps(ctx, indent=2), encoding='utf-8')
    print(f"Context written to {output_path}")
```
That’s it. 50 lines. It reads your `package.json`, grabs your directory tree, and lists files you’ve touched recently.
But raw JSON isn’t the end goal. We need to inject this into the AI tool’s prompt. That’s where the Vietnamese team’s engineering came in.
Turning Raw Data Into Actionable Context
Our junior dev in Can Tho took the script and added two key features in a single afternoon:
- Convention detection – parsed your `.eslintrc` or `pyproject.toml` to extract naming conventions (camelCase? snake_case?) and output a bullet list.
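- Dependency graph summary – used `pip freeze` or `npm list` to capture top-level dependencies and their versions. Note that `pip freeze` on its own lists every installed package, transitive ones included; `pip list --not-required` narrows it to packages nothing else depends on, and `npm list --depth=0` does the equivalent for Node.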
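The dependency summary can also be approximated without shelling out, by reading the manifest directly. The following is a minimal sketch, assuming a Node project with a `package.json` at the root. The helper names `top_level_dependencies` and `format_dependency_bullets` are hypothetical and not part of the original script, and this is not necessarily how the team implemented it.

```python
import json
from pathlib import Path


def top_level_dependencies(root):
    """Return {name: version_spec} for packages declared in package.json."""
    manifest = Path(root) / "package.json"
    if not manifest.is_file():
        return {}
    try:
        data = json.loads(manifest.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return {}
    deps = {}
    # Only directly declared packages appear here; transitive ones do not.
    for section in ("dependencies", "devDependencies"):
        deps.update(data.get(section) or {})
    return dict(sorted(deps.items()))


def format_dependency_bullets(deps, limit=20):
    """Render dependencies as short bullet lines suitable for a prompt."""
    return [f"- {name} {spec}" for name, spec in list(deps.items())[:limit]]
```

Reading the manifest gives declared version ranges rather than installed versions, which is usually enough to stop an assistant from suggesting a library the project does not use.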
The final injection step: the script prints a ready-to-copy system prompt that you can paste into Claude Code or Cursor’s custom instructions.
```bash
$ python context_injector.py ~/projects/myapp
# Output: copy-paste this into your AI tool's system prompt:
"""
# CATCH ALL CONTEXT FOR MYAPP
- Language: Python 3.11
- Framework: FastAPI
- DB: PostgreSQL 15
- Conventions: snake_case for functions, PascalCase for classes
- Key files: pyproject.toml, Dockerfile, Makefile
- Recent changes: src/services/account.py, src/api/routes/users.py
Run `cat .ai-context.json` for the full structure.
"""
```
Now every AI coding tool knows what it’s working with. Hallucinations dropped from about 15% of suggestions needing a manual fix to about 5.5%.
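The step that turns the context dictionary into a pasteable prompt is not shown in the article, so here is a rough sketch of what it could look like. It assumes a dictionary shaped like the one `build_context` returns (`key_files`, `recent_changes`). The function name `render_prompt` and its parameters are made up for illustration, and it leaves out the language, framework, and convention lines, which would need the extra detection logic.

```python
def render_prompt(context, project_name, max_recent=5):
    """Turn a build_context-style dict into a short plain-text prompt block."""
    lines = [f"# CONTEXT FOR {project_name.upper()}"]
    key_files = sorted(context.get("key_files", {}))
    if key_files:
        lines.append("- Key files: " + ", ".join(key_files))
    recent = context.get("recent_changes", [])[:max_recent]
    if recent:
        lines.append("- Recent changes: " + ", ".join(recent))
    lines.append("Run `cat .ai-context.json` for the full structure.")
    return "\n".join(lines)


# Example with a hand-written context dict (illustrative values only)
example = {
    "key_files": {"pyproject.toml": "...", "Dockerfile": "..."},
    "recent_changes": ["src/services/account.py", "src/api/routes/users.py"],
}
print(render_prompt(example, "myapp"))
```

Keeping the rendered prompt to a handful of lines, with a pointer to the full JSON, keeps the always-on instructions small while still letting a file-reading tool pull the details when it needs them.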
The Metrics: Before and After
We tracked this across 3 weeks with a team of 12 developers (6 in Vietnam, 6 in North America). Each dev used the injector at the start of every AI-assisted session.
| Metric | Without Injector | With Injector | Change |
|---|---|---|---|
| AI suggestions accepted | 62% | 78% | +16 pts (+26%) |
| AI suggestions needing rework | 28% | 12% | -57% |
| Hallucinations (wrong imports / types) | 15% | 5.5% | -63% |
| Developer satisfaction (1-5) | 2.8 | 4.2 | +50% |
Numbers don’t lie. The injector paid for itself in developer time within two days.
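If you want to check the Change column yourself, the arithmetic is short. This sketch recomputes both the absolute difference and the relative change from the before/after values in the table. It shows that the rework, hallucination, and satisfaction rows report relative change, while the +16 on the first row is a percentage-point difference.

```python
def relative_change(before, after):
    """Relative change from before to after, as a percentage of before."""
    return (after - before) / before * 100


# (before, after) pairs copied from the table above
rows = {
    "accepted": (62, 78),
    "needing rework": (28, 12),
    "hallucinations": (15, 5.5),
    "satisfaction": (2.8, 4.2),
}

for name, (before, after) in rows.items():
    print(f"{name}: {after - before:+.1f} absolute, "
          f"{relative_change(before, after):+.1f}% relative")
```

The hallucination row works out to (5.5 - 15) / 15, roughly -63%, which is where the headline figure comes from.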
Why This Works Better Than You Think
You might wonder: “Can’t I just paste the same info manually?”
Sure, for one session. But you won’t do it every time. And you’ll forget to update it when you change your tech stack. The injector is automated, idempotent, and always fresh.
Actually, the real magic is in the convention detection. When your AI tool knows you use `snake_case` for functions, it stops generating `camelCase` garbage. When it knows you’re on FastAPI, it won’t suggest Flask-specific patterns.
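One way to approximate convention detection for Python code, without relying on linter config, is to look at the names already in the source. The sketch below is an alternative heuristic, not the team’s implementation (which parsed `.eslintrc` or `pyproject.toml`). It uses the standard `ast` module to tally function names by style, and the names `classify` and `function_naming_style` are hypothetical.

```python
import ast
import re
from collections import Counter

# lowerCamelCase: a lowercase start followed by at least one capitalised chunk
CAMEL = re.compile(r"^[a-z][a-z0-9]*([A-Z][a-z0-9]*)+$")


def classify(name):
    """Label a single identifier as snake_case, camelCase, or ambiguous."""
    core = name.strip("_")  # ignore leading/trailing underscores (e.g. __init__)
    if "_" in core and core == core.lower():
        return "snake_case"
    if CAMEL.match(core):
        return "camelCase"
    return "ambiguous"  # single words such as "run" fit either convention


def function_naming_style(source):
    """Return the most common style among function definitions, or None."""
    tree = ast.parse(source)
    counts = Counter(
        classify(node.name)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    )
    counts.pop("ambiguous", None)
    return counts.most_common(1)[0][0] if counts else None
```

Running `function_naming_style` over a few recently changed files and reporting the majority answer gives the assistant a convention grounded in what the code actually does, even when no linter rule spells it out.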
Your codebase has a personality. The injector lets the AI tool respect it.
How We Scaled This With a Vietnamese Remote Team
We’re ECOA AI. We pair international clients with elite Vietnamese developers who use our AI orchestration platform (ACP) to multiply output. When I asked our team in Ho Chi Minh City to productionize this injector, they didn’t just add features — they:
- Wrapped it as a CLI tool with `argparse`
- Added `--watch` mode that re-injects on file changes
- Wrote unit tests covering all edge cases (empty projects, symlinks, etc.)
- Integrated it with our CI/CD so every PR automatically gets a fresh context snapshot
All in 48 hours. At $2,000/month for a mid-level developer. That’s the kind of speed you get when you combine lean AI automation with skilled engineers who know how to operationalize it.
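The production CLI isn’t published here, but the `argparse` wrapper and watch mode could be sketched roughly as follows. It assumes a simple polling loop over file modification times using only the standard library. The `--interval` option and the function names `parse_args`, `snapshot`, and `watch` are assumptions for illustration, not the team’s actual interface.

```python
import argparse
import os
import time
from pathlib import Path

IGNORED = {"node_modules", "__pycache__", ".git", "dist", "build", ".venv"}


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Build AI context for a project.")
    parser.add_argument("project", type=Path, help="path to the project root")
    parser.add_argument("--watch", action="store_true",
                        help="rebuild the context whenever a file changes")
    parser.add_argument("--interval", type=float, default=2.0,
                        help="seconds between scans in watch mode (hypothetical)")
    return parser.parse_args(argv)


def snapshot(root):
    """Map each file path to its modification time, skipping ignored dirs."""
    mtimes = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames[:] = [d for d in dirnames if d not in IGNORED]
        for name in filenames:
            if name == ".ai-context.json":
                continue  # skip our own output so a rebuild does not retrigger
            path = os.path.join(dirpath, name)
            try:
                mtimes[path] = os.stat(path).st_mtime
            except OSError:
                pass  # file vanished or is a broken symlink
    return mtimes


def watch(root, rebuild, interval=2.0):
    """Call rebuild() once, then again each time the snapshot changes."""
    previous = snapshot(root)
    rebuild()
    while True:
        time.sleep(interval)
        current = snapshot(root)
        if current != previous:
            previous = current
            rebuild()
```

A caller would parse the arguments, then either run the build once or pass a rebuild callback to `watch(args.project, rebuild, args.interval)`. Polling is crude compared with OS-level file notifications, but it has no dependencies and behaves the same on every platform.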
The Takeaway
Your AI coding tool is only as good as the context you give it. Don’t expect it to read your mind — read your codebase for it.
This 50-line injector is free. Use it. Fork it. Improve it. Your team’s hallucination rate will thank you.
—
Frequently Asked Questions
Q: Does this injector work with all AI coding tools?
A: Yes. It outputs a plain-text context summary. Any tool that accepts custom instructions (Claude Code, Cursor, Copilot, Aider) can use it. For tools that read files directly, you can point them to the `.ai-context.json` file.
Q: Won’t a large context increase token costs?
A: The injector caps key file content at 3K characters each. Total context rarely exceeds 15K tokens — negligible for most modern AI tools. We measured a <2% increase in token usage per session.
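For a rough sense of scale, you can bound the key-file portion of the context yourself. This back-of-the-envelope sketch assumes about 4 characters per token, which is a common rule of thumb rather than an exact figure for any particular model. The constant names are illustrative.

```python
KEY_FILE_COUNT = 8          # entries in KEY_FILES in the script above
CHARS_PER_FILE_CAP = 3000   # the [:3000] slice applied to each key file
CHARS_PER_TOKEN = 4         # assumption: rough average for English text and code


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


# Worst case: every key file exists and hits the 3000-character cap
worst_case_key_file_tokens = KEY_FILE_COUNT * CHARS_PER_FILE_CAP // CHARS_PER_TOKEN
print(worst_case_key_file_tokens)  # 6000
```

The directory tree and recent-changes list add to that, so `estimate_tokens` on the serialized JSON is a quick way to check a specific project before pasting.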
Q: Can I integrate this into my CI/CD pipeline?
A: Absolutely. Our Vietnamese team added a `--ci` flag that outputs the context as a GitHub Actions artifact. You can then feed it into PR review agents or automated code generation steps.
Q: Does this expose proprietary code to third-party AI APIs?
A: Only if you paste the output into a cloud AI tool. For sensitive projects, run the injector and use it with a local AI model (e.g., Ollama + CodeGemma). The injector itself is pure Python, no network calls.