We Didn’t Just Automate PRs — We Built a 3-Stage Triage Pipeline That Handles 500+ Repos
Let’s be honest. If you maintain more than a handful of open-source repos, you know the pain. PRs pile up. Issues get buried. Contributors get frustrated and leave.
We were there, managing nearly 500 repos for a client (a large open-source foundation) with a distributed team in Ho Chi Minh City and Can Tho. Manual triage was killing us: a single senior dev was spending 4+ hours a day just sorting through incoming PRs.
So we stopped reacting. We built a 3-stage automated triage pipeline.
It works. Here’s exactly how we did it.
Stage 1: The Webhook Listener (Don’t Poll, Listen)
The first mistake most teams make? Polling the GitHub API every 5 minutes. That’s wasteful. You’re hammering the API for no reason.
Instead, we set up a lightweight FastAPI webhook receiver. Every `pull_request` event hits our endpoint instantly.
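```python
# webhook_receiver.py
import hashlib
import hmac
import json
import os

import redis
from fastapi import FastAPI, HTTPException, Request

app = FastAPI()
SECRET = os.environ["GITHUB_WEBHOOK_SECRET"]
# Assumption: Redis is reachable at REDIS_URL (defaults to a local instance)
redis_client = redis.Redis.from_url(os.environ.get("REDIS_URL", "redis://localhost:6379/0"))


def extract_pr_data(payload: dict) -> dict:
    # Placeholder helper: keep only a few fields for the later stages
    pr = payload["pull_request"]
    return {"repo": payload["repository"]["full_name"], "number": pr["number"], "head_sha": pr["head"]["sha"]}


@app.post("/webhook")
async def handle_webhook(request: Request):
    body = await request.body()
    signature = request.headers.get("x-hub-signature-256", "")

    # Verify the signature against the raw request body
    expected = "sha256=" + hmac.new(SECRET.encode(), body, hashlib.sha256).hexdigest()
    # Compare as bytes: compare_digest raises on non-ASCII str input
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        raise HTTPException(403, "Invalid signature")

    payload = json.loads(body)
    # Other event types can also carry an "opened" action, so check the event header too
    is_pr_event = request.headers.get("x-github-event") == "pull_request"
    if is_pr_event and payload.get("action") in ["opened", "synchronize"]:
        pr_data = extract_pr_data(payload)
        # Push to Redis queue for processing
        redis_client.lpush("pr_queue", json.dumps(pr_data))
    return {"status": "ok"}
```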
Why this matters: We reduced API call volume by 94% compared to polling. The webhook fires the moment a PR is opened — no delays.
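If you want to exercise the signature check before pointing a real webhook at the endpoint, you can sign a payload yourself the same way the receiver expects. This is a minimal sketch using only the standard library. The helper names `sign_payload` and `is_valid_signature`, the secret, and the payload are made up for illustration.

```python
import hashlib
import hmac
import json


def sign_payload(secret: str, body: bytes) -> str:
    """Build an X-Hub-Signature-256 style value for the given raw body."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Constant-time comparison of the received header against the expected value."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(header_value.encode(), expected.encode())


# Hypothetical test values, not real credentials or a real delivery
secret = "test-secret"
body = json.dumps({"action": "opened", "number": 1}).encode()
header = sign_payload(secret, body)
```

The signature covers the raw bytes, so even a whitespace change in the body invalidates it. That is why the receiver hashes `request.body()` rather than a re-serialized JSON object.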
Stage 2: The Classification Engine (Don’t Guess, Analyze)
This is where the real work happens. Our Vietnamese team built a classification engine that runs as a GitHub Action. It analyzes each PR across 3 dimensions:
- Code diff size — Is this a typo fix or a 5000-line refactor?
- File types modified — Are they touching core logic or just docs?
- Changed code patterns — Do they match known bug patterns?
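To make those three dimensions concrete, here is a rough sketch of how they could map to triage labels. It is not our production engine. The thresholds, file extensions, label names, and regex patterns are all illustrative assumptions, and `PRStats` and `classify` are hypothetical names.

```python
import re
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Everything below is an assumed example value, not a tuned setting
SMALL_DIFF_MAX_LINES = 20
LARGE_DIFF_MIN_LINES = 1000
DOC_EXTENSIONS = {".md", ".rst", ".txt"}
SUSPECT_PATTERNS = {
    "bare-except": re.compile(r"^\s*except\s*:"),
    "debug-print": re.compile(r"^\s*print\("),
}


@dataclass
class PRStats:
    lines_changed: int                      # added + deleted lines
    paths: list[str]                        # files touched by the PR
    added_lines: list[str] = field(default_factory=list)


def classify(stats: PRStats) -> set[str]:
    labels = set()

    # Dimension 1: diff size
    if stats.lines_changed <= SMALL_DIFF_MAX_LINES:
        labels.add("size/small")
    elif stats.lines_changed >= LARGE_DIFF_MIN_LINES:
        labels.add("size/large")
    else:
        labels.add("size/medium")

    # Dimension 2: file types (an empty path list is treated as code, to be safe)
    docs_only = bool(stats.paths) and all(
        PurePosixPath(p).suffix.lower() in DOC_EXTENSIONS for p in stats.paths
    )
    labels.add("docs-only" if docs_only else "touches-code")

    # Dimension 3: added lines that match a known-suspect pattern
    for name, pattern in SUSPECT_PATTERNS.items():
        if any(pattern.search(line) for line in stats.added_lines):
            labels.add(f"pattern/{name}")

    return labels
```

A typo fix in a README would come out as `size/small` plus `docs-only`, while a 5000-line change to core files that adds a bare `except:` would get `size/large`, `touches-code`, and `pattern/bare-except`.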
The engine uses a lightweight Python script with `git diff` parsing. No AI model needed for the basic stuff.
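For a sense of what the `git diff` parsing can look like, here is a minimal sketch built on `git diff --numstat`, which prints one tab-separated `added deleted path` row per file. The function names and the sample output are assumptions for illustration, and renamed files (which git reports with `=>` in the path) are left as-is.

```python
import subprocess


def parse_numstat(output: str) -> list[tuple[str, int, int]]:
    """Parse `git diff --numstat` text into (path, added, deleted) tuples."""
    rows = []
    for line in output.splitlines():
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        added, deleted, path = parts
        # Binary files are reported as "-" for both counts; treat them as zero lines
        rows.append((
            path,
            0 if added == "-" else int(added),
            0 if deleted == "-" else int(deleted),
        ))
    return rows


def diff_stats(base: str, head: str, repo_dir: str = ".") -> list[tuple[str, int, int]]:
    """Run git in repo_dir and parse per-file line counts between two refs."""
    result = subprocess.run(
        ["git", "diff", "--numstat", f"{base}...{head}"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_numstat(result.stdout)


# Made-up sample output, so the parser can be tried without a repository
sample = "12\t3\tsrc/core.py\n1\t1\tREADME.md\n-\t-\tassets/logo.png\n"
rows = parse_numstat(sample)
total_changed = sum(added + deleted for _, added, deleted in rows)
```

Keeping the parser separate from the `subprocess` call makes it easy to unit test against captured output.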