How We Migrated a 500K-Line Monolith to Microservices in 8 Weeks with a Vietnamese Team
Let me be blunt: monolith-to-microservices migrations are usually a disaster.
I’ve seen teams spend 18 months and $2M+ only to roll back. The complexity kills you. The coordination overhead eats your budget. And somewhere around month six, everyone starts questioning their life choices.
But last year, we did something different. A US-based SaaS client came to us with a 500,000-line Ruby on Rails monolith that was buckling under its own weight. Deployments took 45 minutes. A single broken test could take down the entire platform. Their CTO told me, “We’re spending 60% of our engineering time just keeping the lights on.”
We migrated the entire thing to a 12-service microservices architecture in 8 weeks.
Here’s the playbook.
The Problem: A Monolith That Was Eating the Company Alive
The client’s platform handled B2B payment processing. Think Stripe, but for enterprise supply chains. The codebase had grown organically over 6 years. By the time they called us, it had:
- 500,000+ lines of Ruby (Rails 5.2, still on Ruby 2.6)
- 47 database migrations that had never been cleaned up
- A single PostgreSQL database with 230+ tables
- Deploy cycle: 45 minutes, twice a week, with a 30% chance of rollback
- Test suite: 8,000+ tests taking 90 minutes to run
The breaking point came when a junior dev accidentally pushed a migration that locked the `invoices` table for 12 minutes during peak hours. They lost $40,000 in transaction volume that day.
They needed out. Fast.
Why We Chose Vietnam (And Not India or Eastern Europe)
The client had tried outsourcing before. They had a team in India that delivered code, but the communication lag was brutal. “We’d explain a requirement on Monday, get something completely different on Thursday, and spend Friday fixing it,” their VP of Engineering told me.
We proposed a different model: a dedicated team of 8 Vietnamese engineers from our Ho Chi Minh City hub, all working US timezone hours (9 PM to 6 AM Vietnam time). Every single one was vetted for:
- English fluency (C1 level minimum)
- 5+ years of Ruby/Rails experience
- Prior microservices migration experience (non-negotiable)
- Familiarity with our ECOA AI Platform ACP for workflow orchestration
Why Vietnam? Three reasons:
- Timezone coverage for the US West Coast: Vietnam is 15 hours ahead of Pacific Standard Time (14 during daylight saving), so work handed off at 6 PM PST could come back as completed PRs by morning.
- Engineering rigor: Vietnamese CS programs emphasize fundamentals. These engineers didn’t just know Rails—they understood distributed systems, database internals, and API design.
- Cost efficiency: At $2,000/month per mid-level developer, the entire 8-person team cost less than 3 senior engineers in San Francisco.
The Strategy: Strangler Fig + AI-Assisted Refactoring
We didn’t try to rewrite the monolith from scratch. That’s suicide. Instead, we used the Strangler Fig pattern—gradually extracting services while keeping the monolith running.
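To make the pattern concrete, here is a minimal sketch of the routing decision at the heart of a Strangler Fig migration. The route table, the upstream names, and the `upstream_for` helper are hypothetical illustrations, not the client's gateway configuration. In a real setup this decision lives in the API gateway rather than in application code.

```ruby
# Hypothetical route table: path prefixes that have already been
# extracted map to a new service; everything else stays on the monolith.
EXTRACTED_ROUTES = {
  '/auth/' => 'auth-service',
  '/payments/' => 'payment-service'
}.freeze

MONOLITH = 'monolith'

# Returns the name of the upstream that should serve the given path.
def upstream_for(path, routes = EXTRACTED_ROUTES)
  prefix = routes.keys.find { |candidate| path.start_with?(candidate) }
  prefix ? routes[prefix] : MONOLITH
end

puts upstream_for('/auth/login')   # auth-service
puts upstream_for('/invoices/42')  # monolith
```

Each extraction adds one entry to the table, and rolling a service back means removing its entry, which is what keeps the monolith usable throughout the migration.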
Here’s the high-level plan:
| Week | Focus | Key Deliverables |
|---|---|---|
| 1-2 | Discovery & domain mapping | Bounded context map, dependency graph, API contracts |
| 3-4 | Extract auth & user service | Auth service live, monolith routes to new service |
| 5-6 | Extract payment processing | Payment service with event-driven architecture |
| 7-8 | Extract reporting & notifications | Reporting service, notification service, cutover |
But here’s the secret weapon: we used AI agents to automate the grunt work.
How ECOA AI Platform ACP Accelerated the Migration
Our team used the ECOA AI Platform ACP (Agent Coordination Platform) to create specialized AI agents that handled repetitive migration tasks:
- Code Analyzer Agent: Scanned the monolith’s codebase and generated a dependency graph. Found 14 circular dependencies that would have broken the migration.
- Test Migration Agent: Automatically rewrote RSpec tests to work with the new service boundaries. Handled 3,200 tests in 4 days.
- API Gateway Config Agent: Generated Kong gateway configurations for routing traffic between the monolith and new services.
Honestly, without these agents, we estimate we would have needed 12-14 engineers and 16 weeks. The AI agents handled about 40% of the boilerplate work.
Week 1-2: The Discovery Phase That Saved Us
Most migration failures happen because teams don’t understand their own codebase. We spent the first two weeks doing nothing but analysis.
What we found surprised everyone:
- 23% of the codebase was dead code (controllers, views, and models that hadn’t been touched in 2+ years)
- The monolith had 6 different payment processing flows, but only 2 were actually used in production
- There was a “secret” background job that ran every 15 minutes and recalculated invoice totals—nobody on the current team knew it existed
We used the Code Analyzer Agent to generate a bounded context map that showed exactly which parts of the monolith belonged to which domain. This became our migration blueprint.
Key lesson: Never start a migration without a complete dependency graph. You will miss something, and it will break in production at 2 AM on a Saturday.
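As an illustration of what "finding a circular dependency" means in practice, here is a small sketch that walks a dependency map depth-first and reports the first cycle it finds. The module names and the `DEPENDENCIES` hash are made up for the example, and this is not how the Code Analyzer Agent is implemented; it only shows the kind of check a dependency graph makes possible.

```ruby
# Hypothetical dependency map: each key depends on the modules listed.
DEPENDENCIES = {
  'Invoice' => ['Payment'],
  'Payment' => %w[User Invoice],
  'User' => [],
  'Report' => ['Invoice']
}.freeze

# Depth-first visit. Returns the cycle as an array of names, or nil.
def visit(graph, node, state, path)
  return nil if state[node] == :done
  # Reaching a node that is still being visited means we looped back.
  return path.drop(path.index(node)) + [node] if state[node] == :visiting

  state[node] = :visiting
  path.push(node)
  graph.fetch(node, []).each do |dependency|
    cycle = visit(graph, dependency, state, path)
    return cycle if cycle
  end
  path.pop
  state[node] = :done
  nil
end

# Returns the first dependency cycle found in the graph, or nil if none.
def find_cycle(graph)
  state = {}
  graph.each_key do |node|
    cycle = visit(graph, node, state, [])
    return cycle if cycle
  end
  nil
end

puts find_cycle(DEPENDENCIES).join(' -> ')  # Invoice -> Payment -> Invoice
```

A cycle like this one means neither module can be extracted on its own, so it has to be broken (or both sides moved together) before the service boundary is drawn.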
Week 3-4: Extracting the Auth Service
We started with authentication because it had the fewest dependencies. The monolith used Devise with a custom JWT implementation. We extracted this into a standalone Go service (yes, Go—not Rails).
Why Go? The auth service needed to handle 5,000+ requests per second during peak hours. Rails couldn’t do that without significant caching infrastructure. Go handled it with 2 replicas and 256MB RAM each.
The migration flow:
1. Deploy the new auth service alongside the monolith
2. Configure Kong to route `/auth/*` requests to the new service
3. Run both systems in parallel for 48 hours
4. Compare response times and error rates
5. Cut over traffic completely
The parallel run caught 3 edge cases where the new service handled tokens differently. We fixed them before they hit production.
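A parallel run is only useful if the two systems' responses are compared field by field. The sketch below shows one way to do that, assuming responses have been captured as hashes. The field names (`token_ttl`, `request_id`, `served_by`) and the `response_diff` helper are hypothetical and stand in for whatever the real payloads contain.

```ruby
# Fields expected to differ on every request (hypothetical names),
# so they are excluded from the comparison.
VOLATILE_FIELDS = %w[request_id served_by].freeze

# Returns the fields whose values differ between the two responses.
def response_diff(monolith_response, service_response, ignore = VOLATILE_FIELDS)
  keys = (monolith_response.keys | service_response.keys) - ignore
  keys.reject { |key| monolith_response[key] == service_response[key] }
end

legacy = { 'status' => 200, 'token_ttl' => 3600, 'request_id' => 'a1', 'served_by' => 'monolith' }
candidate = { 'status' => 200, 'token_ttl' => 900, 'request_id' => 'b7', 'served_by' => 'auth-service' }

puts response_diff(legacy, candidate).inspect  # ["token_ttl"]
```

Logging every non-empty diff during the parallel window is the kind of check that surfaces token-handling mismatches before the cutover.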
Week 5-6: The Payment Processing Nightmare
This was the hard part. The monolith’s payment processing logic was spread across 14 different models, 6 controllers, and 4 background job classes. It was a mess.
We used the Event-Driven Architecture pattern here. Instead of trying to replicate the monolith’s synchronous flow, we designed an event bus using RabbitMQ:
```ruby
# Example: Payment processed event
class PaymentProcessedEvent
  include EventBus::Event

  attributes :payment_id, :amount, :currency, :timestamp
  topic 'payments.processed'
end
```
The new payment service would emit events. Other services (invoicing, notifications, reporting) would subscribe to those events. This decoupled everything beautifully.
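To show the decoupling in isolation, here is a self-contained sketch of publish/subscribe using an in-memory bus. `InMemoryBus` is a hypothetical stand-in for RabbitMQ written for this example: it has no durability, acknowledgements, or retries, and the subscriber bodies are placeholders for the real invoicing and notification services.

```ruby
# Minimal in-memory publish/subscribe bus, used only to illustrate
# how a publisher stays unaware of who consumes its events.
class InMemoryBus
  def initialize
    @subscribers = Hash.new { |hash, topic| hash[topic] = [] }
  end

  # Registers a handler block for a topic.
  def subscribe(topic, &handler)
    @subscribers[topic] << handler
  end

  # Delivers the payload to every handler registered for the topic.
  def publish(topic, payload)
    @subscribers[topic].each { |handler| handler.call(payload) }
  end
end

bus = InMemoryBus.new
log = []

# Two independent consumers of the same event.
bus.subscribe('payments.processed') { |event| log << "invoice for payment #{event[:payment_id]} marked paid" }
bus.subscribe('payments.processed') { |event| log << "receipt sent for #{event[:amount]} #{event[:currency]}" }

bus.publish('payments.processed', { payment_id: 101, amount: 250, currency: 'USD' })
puts log
```

Adding a third consumer, such as reporting, is one more `subscribe` call and requires no change to the publisher.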
But we hit a wall: The monolith had a synchronous callback chain that updated 5 different tables in a single transaction. Breaking that into events meant we lost atomicity.
The solution? We used a Saga pattern with compensating transactions:
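```ruby
class PaymentSaga
  def execute(payment)
    PaymentService.charge(payment)
    InvoiceService.mark_as_paid(payment.invoice_id)
    NotificationService.send_receipt(payment.user_id, payment.amount)
  rescue PaymentFailedError
    # Compensating transaction: undo the invoice update, then re-raise
    # so the caller knows the payment did not complete.
    # Assumes revert_paid is safe to call even if the invoice was never marked paid.
    InvoiceService.revert_paid(payment.invoice_id)
    raise
  end
end
```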
This added complexity, but it also made the system more resilient. If the notification service was down, the payment still went through. The notification would be retried later.
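For readers new to sagas, the general mechanism can be sketched as a list of steps, each paired with a compensation, where a failure triggers the compensations of the already completed steps in reverse order. The `Saga` class and the lambdas below are hypothetical and written for illustration; they stand in for real service calls and are not the client's implementation.

```ruby
# Runs steps in order. If one fails, runs the compensations of the
# steps that already completed, in reverse order, then re-raises.
class Saga
  Step = Struct.new(:name, :action, :compensation)

  def initialize
    @steps = []
  end

  def step(name, action:, compensation: nil)
    @steps << Step.new(name, action, compensation)
    self
  end

  def run
    completed = []
    @steps.each do |current|
      current.action.call
      completed << current
    end
    true
  rescue StandardError
    # The failed step never completed, so only earlier steps are undone.
    completed.reverse_each { |done| done.compensation&.call }
    raise
  end
end

log = []
saga = Saga.new
saga.step('charge',
          action: -> { log << 'charged' },
          compensation: -> { log << 'refunded' })
saga.step('mark_invoice_paid',
          action: -> { raise 'invoice service unavailable' },
          compensation: -> { log << 'invoice reverted' })

begin
  saga.run
rescue RuntimeError => e
  log << "saga failed: #{e.message}"
end
puts log
```

Non-critical steps such as sending a receipt are usually better kept out of the saga and handed to a retry queue, so that their failure does not undo a successful payment.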
Week 7-8: Reporting and the Final Cutover
The reporting service was surprisingly easy. The monolith generated reports using raw SQL queries that took 30+ seconds to run. We moved this to a dedicated read replica with materialized views. Reports now load in under 2 seconds.
The final cutover happened on a Thursday night. We:
1. Stopped all traffic to the monolith
2. Ran a data consistency check (all 230 tables matched)
3. Pointed DNS to the new API gateway
4. Monitored error rates for 4 hours
Result: Zero downtime. Zero data loss. The client’s team didn’t even notice the switch.
The Numbers That Matter
After 8 weeks, here’s what we delivered:
| Metric | Before | After | Improvement |
|---|---|---|---|
| Deployment time | 45 minutes | 3 minutes | 93% faster |
| Test suite runtime | 90 minutes | 12 minutes | 87% faster |
| Requests per second | 1,200 | 8,500 | 7x improvement |
| Monthly infrastructure cost | $18,000 | $9,200 | 49% reduction |
| Developer productivity (story points/week) | 24 | 68 | 2.8x increase |
The client’s CTO sent me a Slack message at 2 AM after the cutover: “I just deployed a hotfix in 4 minutes. I almost cried.”
What We Learned (The Hard Way)
1. Don’t trust your code coverage numbers. The monolith had 87% code coverage, but the tests were mostly integration tests that tested everything together. When we extracted services, those tests broke because they depended on the monolith’s global state. We had to rewrite 40% of the test suite.
2. AI agents are great for boilerplate, terrible for architecture decisions. Our Test Migration Agent could rewrite tests, but it couldn’t decide which service a test belonged to. That required human judgment.
3. Vietnamese engineers are underrated. The team we assembled in Ho Chi Minh City was world-class. They didn’t just execute—they contributed to the architecture. One of them suggested using RabbitMQ over Kafka because the client’s throughput didn’t warrant Kafka’s complexity. He was right.
4. Communication is the real bottleneck. We used a daily 15-minute standup at 9 AM PST (midnight Vietnam time). Every engineer had to explain what they did in English. The first week was rough. By week 4, they were debating API design patterns like native speakers.
Why This Model Works
The client got a world-class migration team for a fraction of the cost of hiring locally. They paid:
- 8 Vietnamese engineers: $16,000/month total
- ECOA AI Platform ACP licensing: $3,000/month
- Project management (US-based): $8,000/month
Total: $27,000/month, or roughly $54,000 over the two-month engagement, for a team that delivered in 8 weeks what most US agencies would quote 6 months and $200,000+ for.
More importantly, they kept the team. After the migration, 6 of the 8 engineers stayed on as their dedicated development team. They’re now building new features on the microservices architecture.
Frequently Asked Questions
Q: How did you handle data consistency during the migration?
We ran both the monolith and new services in parallel for 48-72 hours per service. A data comparison script ran every 15 minutes, comparing records in the monolith’s database against the new service’s database. Any discrepancies triggered an alert and automatic rollback of that service’s traffic.
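A comparison like this can be sketched with per-record checksums. The example below assumes records have already been loaded from each database as arrays of hashes keyed by `:id`; the `checksums` and `discrepancies` helpers and the sample data are hypothetical, and a real script would read rows in batches and feed the result into alerting.

```ruby
require 'digest'

# Builds an id => checksum map. Keys are sorted first so that column
# order does not affect the checksum.
def checksums(records)
  records.to_h do |record|
    canonical = record.sort_by { |key, _| key.to_s }
                      .map { |key, value| "#{key}=#{value}" }
                      .join('|')
    [record[:id], Digest::SHA256.hexdigest(canonical)]
  end
end

# Returns ids that are missing on either side or whose contents differ.
def discrepancies(monolith_records, service_records)
  left = checksums(monolith_records)
  right = checksums(service_records)
  (left.keys | right.keys).reject { |id| left[id] == right[id] }
end

monolith = [{ id: 1, amount: 100 }, { id: 2, amount: 250 }, { id: 3, amount: 75 }]
service = [{ id: 1, amount: 100 }, { id: 2, amount: 205 }]

puts discrepancies(monolith, service).inspect  # [2, 3]
```

Record 2 is flagged because its contents differ and record 3 because it is missing from the new service, which are the two failure modes a parallel run needs to catch.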
Q: What was the biggest technical challenge?
Breaking the monolith’s transactional integrity. The old code assumed that if one database write failed, everything rolled back. With microservices, we had to implement sagas and compensating transactions. This added about 2 weeks to the timeline.
Q: Did you use any specific tools for the migration?
Yes. We used the ECOA AI Platform ACP for code analysis and test migration. For the actual infrastructure, we used Kong as the API gateway, RabbitMQ for event streaming, and Kubernetes (EKS) for deployment. The Go auth service was deployed on AWS Fargate.
Q: How did you ensure the Vietnamese team understood the business domain?
We spent the first week doing domain modeling workshops. Every engineer was assigned a specific business domain (payments, invoicing, users, etc.) and became the “expert” for that domain. They also had direct Slack access to the client’s product managers. No intermediaries.