How We Built a Real-Time Collaborative Document Editor for a Legal SaaS in 6 Weeks — A Vietnam Offshore Case Study

(Case Studies) - A US legal SaaS needed a Google Docs-style editor with real-time sync and conflict resolution. We delivered it in 6 weeks with a Vietnamese team and AI orchestration, half of their original 12-week estimate.

Let’s cut the fluff. You know how painful collaborative document editing is to build from scratch. Conflict resolution. Operational transformation. Cursor sync. Most teams avoid it like a legacy COBOL system.

A US-based legal SaaS company came to us with a brutal requirement: replace their outdated, single-author document system with a real-time collaborative editor that could handle 200+ concurrent lawyers editing the same contract clause. No lag. No corruption. No “someone else is editing” pop-ups.

They originally estimated 12 weeks with a full in-house team. We shipped it in 6 weeks using a Vietnamese AI-augmented team and the ECOA AI Platform ACP.

Here’s exactly how we did it. No secrets, no BS.

The Problem: Legal Docs Are Not Code Files

Legal documents are structurally complex. Paragraphs, sections, cross-references, footnotes, annotations. You can’t just apply a diff algorithm and call it a day. The client’s existing system used a simple locking mechanism: one person edits, everyone else reads. That worked when teams were small, but they had grown to 20+ simultaneous editors on a single merger agreement.

The result? Chaos. Lawyers would overwrite each other’s changes, lose track of versions, and spend hours reconciling conflicts manually. The company was losing deals because their product couldn’t scale.

We needed:

  • Real-time sync with sub-100ms latency
  • Conflict-free replicated data types (CRDTs) for concurrent editing
  • Rich text support (bold, italics, headings, tables, inline annotations)
  • Cursor presence showing who’s editing what
  • Automatic versioning with diff review for audit trails
  • Security – SOC 2 compliant, no data leakage

Why We Chose a Vietnamese Team (And AI Orchestration)

The client was skeptical about offshore development. They’d been burned before. But we laid out the numbers:

  • Junior dev rate: $1,000/month — less than 1/10 of US rates
  • Middle dev: $2,000/month
  • Senior dev: $3,000/month

For a 6-week sprint, we staffed 2 senior developers (one frontend expert in React/ProseMirror, one backend expert in Node.js and WebSockets) and 1 middle developer. Total team cost: $8,000/month. Compare that to $150,000+ for a US-based team over the same period.

But cost isn’t the only reason. The developers in Ho Chi Minh City had deep experience with CRDT libraries like Yjs and Automerge. Why? Many of them had built real-time features for Vietnamese edtech and fintech platforms where latency and reliability are non-negotiable. That was a huge hidden advantage.

We used the ECOA AI Platform ACP to orchestrate our development workflow:

  • AI agents handled code generation for boilerplate CRDT operations, schema migrations, and test cases.
  • Multi-agent orchestration routed complex debugging tasks to the right human developer with full context.
  • Result: by our estimate, total development hours dropped by about 60%.

More importantly, the AI agents caught a subtle race condition in the section-numbering logic we built on top of the CRDT layer, one our humans would’ve missed until QA. More on that below.

Architecture: ProseMirror + Yjs + WebSockets + PostgreSQL

We made a deliberate choice to use ProseMirror as the editor framework and Yjs for CRDT-based collaboration. Why not Automerge? Yjs has better performance for large documents (our tests showed <10ms sync for a 200KB document with 50 concurrent users).

Here’s the stack:

| Component | Technology | Rationale |
| --- | --- | --- |
| Frontend editor | ProseMirror | Battle-tested, extensible schema for legal documents |
| Real-time sync | Yjs (y-websocket provider) | CRDT-based, no central server for conflict resolution |
| Backend persistence | Node.js + WS | Simple, evented, handles 1000+ concurrent connections |
| Database | PostgreSQL (JSONB for document snapshots) | ACID compliance for audit trails |
| Caching | Redis | Cursor presence, active session management |
| AI orchestration | ECOA AI Platform ACP | Code generation, testing, workflow automation |

The client’s biggest worry was data conflicts — two lawyers editing the same sentence simultaneously. With Yjs, each change is a CRDT operation that merges deterministically across all clients. No last-writer-wins. No data loss.
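To make “merges deterministically” concrete, here is a toy sketch of the idea. It is not how Yjs works internally (Yjs implements the YATA algorithm), and every name in it (`insert`, `merge`, `render`, the fractional `pos` field) is hypothetical. It only illustrates why unique operation IDs plus one fixed tie-break rule let two replicas converge without discarding either edit.

```javascript
// Toy sequence CRDT for illustration only; Yjs itself implements the YATA algorithm.
// A replica is a Map from a unique id ("client:clock") to an element.
function insert(replica, client, clock, pos, text) {
  replica.set(`${client}:${clock}`, { client, clock, pos, text });
}

// Merging is a set union. Ids are unique, so no element overwrites another.
function merge(a, b) {
  return new Map([...a, ...b]);
}

// Every replica orders elements by the same rule: position, then client, then clock.
function render(replica) {
  return [...replica.values()]
    .sort((x, y) => x.pos - y.pos || x.client - y.client || x.clock - y.clock)
    .map((e) => e.text)
    .join(' ');
}

// Two lawyers start from the same clause and edit the same spot concurrently.
const base = new Map();
insert(base, 0, 0, 1, 'The');
insert(base, 0, 1, 2, 'Buyer');

const lawyerA = new Map(base);
insert(lawyerA, 1, 0, 1.5, 'undersigned');
const lawyerB = new Map(base);
insert(lawyerB, 2, 0, 1.5, 'sole');

const seenByA = render(merge(lawyerA, lawyerB));
const seenByB = render(merge(lawyerB, lawyerA));
console.log(seenByA === seenByB, seenByA); // prints: true The undersigned sole Buyer
```

Both insertions survive, and both lawyers see them in the same order regardless of which update arrived first.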

The CRDT Implementation (The Hard Part)

Let me show you the core of the collaboration engine. `y-prosemirror` binds the editor to a Yjs `Y.XmlFragment` (the text inside it is still merged at the character level), and we wrapped that in a custom schema for legal document structure. The listing is simplified: `user` and `saveSnapshot` are supplied by the caller.

```javascript
// y-doc-manager.js
const Y = require('yjs');
const { WebsocketProvider } = require('y-websocket');
const { ySyncPlugin, yCursorPlugin } = require('y-prosemirror');

class LegalDocManager {
  // user: { id, name, color }; saveSnapshot(docId, bytes) writes to PostgreSQL
  constructor(docId, user, saveSnapshot) {
    this.docId = docId;
    this.saveSnapshot = saveSnapshot;
    this.doc = new Y.Doc();
    this.provider = new WebsocketProvider(
      'wss://collab.legalsass.com/ws',
      docId,
      this.doc,
      { connect: true }
    );
    this.awareness = this.provider.awareness;
    // yCursorPlugin reads name and color from the `user` awareness field
    this.awareness.setLocalStateField('user', {
      userId: user.id,
      name: user.name,
      color: user.color
    });
    // Bind to ProseMirror via y-prosemirror: pass these plugins to EditorState.create
    this.plugins = [
      ySyncPlugin(this.doc.getXmlFragment('content')),
      yCursorPlugin(this.awareness)
    ];
  }

  // Persist the full document state for the audit trail every 10 seconds
  startSnapshotInterval() {
    this.snapshotTimer = setInterval(() => {
      // encodeStateAsUpdate returns the whole document as a Uint8Array
      const snapshot = Y.encodeStateAsUpdate(this.doc);
      // Store in PostgreSQL with docId, timestamp, snapshot
      this.saveSnapshot(this.docId, snapshot);
    }, 10000);
  }

  // Recover from last snapshot on reconnection
  async recoverFromSnapshot(lastSnapshot) {
    if (lastSnapshot) {
      Y.applyUpdate(this.doc, lastSnapshot);
    }
  }
}
```

This looks simple, but the devil is in the edge cases. Yjs handles concurrent inserts and deletes at the character level. For legal documents, we also needed to preserve paragraph boundaries and section numbering, so we extended the schema with custom `Y.Map` types for section metadata.

The AI agent on ECOA ACP helped us generate the `y-prosemirror` binding configuration and the awareness state management. That saved us about 3 days of trial-and-error.

The Race Condition That AI Caught (And Humans Missed)

During the second week, we were sprinting to finish cursor presence. The AI agent running code analysis flagged a subtle timing issue: when two users added a new paragraph at the same position, the generated section numbers could duplicate if the CRDT operations resolved out of order on different clients.

Example: User A inserts section 3.1.1 at position X. User B inserts section 3.1.1 at the same position a millisecond later. With Yjs, both insertions succeed, but the numbering becomes [3.1.1, 3.1.1] on User A’s screen and [3.1.1, 3.1.2] on User B’s screen. That’s a divergence.

We fixed it by adding a section number resolver that runs after each local CRDT update, renumbering sections based on a deterministic ordering of the inserted `Y.Array` elements. The AI agent suggested using a stable sort based on the Yjs client ID and clock, a known CRDT pattern we’d overlooked.

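```javascript
function renumberSections(yDoc) {
  const sections = yDoc.getArray('sections');
  // Sort by Yjs item ID (clock, then client). `_item` is a Yjs internal,
  // so pin the Yjs version if you rely on it.
  const sorted = sections.toArray()
    .sort((a, b) => a._item.id.clock - b._item.id.clock || a._item.id.client - b._item.id.client);
  yDoc.transact(() => {
    sorted.forEach((section, idx) => {
      const number = generateNumber(idx); // app-specific numbering helper
      // Write only on change so an update observer calling this cannot loop forever
      if (section.get('number') !== number) section.set('number', number);
    });
  });
}
```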
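The effect of a deterministic order can be checked without Yjs at all. The sketch below uses plain objects and hypothetical names (`numberSections`, the `3.1` prefix, the section titles) rather than the production resolver. It shows two clients that receive the same concurrent inserts in different orders and still compute identical numbers.

```javascript
// Plain-object sketch; `client` and `clock` stand in for the Yjs item ID.
function numberSections(sections, prefix = '3.1') {
  return [...sections]
    .sort((a, b) => a.clock - b.clock || a.client - b.client)
    .map((s, idx) => ({ title: s.title, number: `${prefix}.${idx + 1}` }));
}

const fromA = { client: 11, clock: 40, title: 'Indemnification' };
const fromB = { client: 27, clock: 40, title: 'Limitation of Liability' };

// User A applies its own insert first; User B applies them the other way round.
const onScreenA = numberSections([fromA, fromB]);
const onScreenB = numberSections([fromB, fromA]);
console.log(JSON.stringify(onScreenA) === JSON.stringify(onScreenB)); // prints: true
```

Because the sort key depends only on data every client shares, arrival order stops mattering.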

That AI catch saved us from a catastrophe in production. Honestly, I’d underestimated how much multi-agent orchestration could help with niche domain logic. The agent didn’t just write code. It could reason about the problem space because we’d loaded its context with CRDT papers and the Yjs source code.

Security and Compliance

Lawyers are paranoid about data leakage. We had to ensure all data was encrypted in transit and at rest. The Yjs WebsocketProvider connects over `wss://`, so traffic is protected by TLS. We added a custom authentication middleware that validated JWTs before allowing any WebSocket connection.
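A minimal sketch of that kind of token check, using only Node’s built-in `crypto` module. The HS256 algorithm, the function names `signJwt` and `verifyJwt`, and the secret are assumptions for illustration. The production middleware is not shown in this write-up, and a maintained JWT library is usually the safer choice.

```javascript
const crypto = require('crypto');

// Signing helper, included only so the sketch can be exercised end to end.
function signJwt(payload, secret) {
  const header = Buffer.from(JSON.stringify({ alg: 'HS256', typ: 'JWT' })).toString('base64url');
  const body = Buffer.from(JSON.stringify(payload)).toString('base64url');
  const sig = crypto.createHmac('sha256', secret).update(`${header}.${body}`).digest('base64url');
  return `${header}.${body}.${sig}`;
}

// Returns the decoded payload, or null if the token is malformed, forged, or expired.
function verifyJwt(token, secret, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null;
  const [header, body, sig] = parts;
  const expected = crypto.createHmac('sha256', secret).update(`${header}.${body}`).digest();
  const given = Buffer.from(sig, 'base64url');
  // Constant-time comparison; lengths must match before timingSafeEqual is called.
  if (given.length !== expected.length || !crypto.timingSafeEqual(given, expected)) return null;
  try {
    const { alg } = JSON.parse(Buffer.from(header, 'base64url').toString('utf8'));
    if (alg !== 'HS256') return null;
    const payload = JSON.parse(Buffer.from(body, 'base64url').toString('utf8'));
    if (typeof payload.exp === 'number' && payload.exp <= nowSeconds) return null;
    return payload;
  } catch {
    return null;
  }
}

const token = signJwt({ sub: 'lawyer-42', exp: 4102444800 }, 'dev-only-secret');
console.log(verifyJwt(token, 'dev-only-secret') !== null); // prints: true
```

In a WebSocket server, a check like this would run in the HTTP upgrade handler, and the socket would be closed when it returns `null`, so unauthenticated clients never reach the document room.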

For audit trails, we stored every document snapshot and all CRDT operations in PostgreSQL with a full history. The client can replay any point in time. This satisfied their SOC 2 Type II requirements.
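Point-in-time replay can be sketched as picking the latest snapshot at or before the requested time, then the updates recorded after it. The row shape `{ at, bytes }` and the name `planReplay` are hypothetical; the actual PostgreSQL schema is not described here.

```javascript
// Rows are { at: epoch milliseconds, bytes: Uint8Array }, sorted ascending by `at`.
function planReplay(snapshots, updates, targetTime) {
  // Latest snapshot taken at or before the requested time, if any.
  const eligible = snapshots.filter((s) => s.at <= targetTime);
  const base = eligible.length ? eligible[eligible.length - 1] : null;
  const since = base ? base.at : -Infinity;
  // Updates from the snapshot time up to the requested time. Using >= may include
  // an update the snapshot already contains, which is harmless for idempotent updates.
  const toApply = updates.filter((u) => u.at >= since && u.at <= targetTime);
  return { base, toApply };
}

const snapshots = [
  { at: 1000, bytes: new Uint8Array([1]) },
  { at: 2000, bytes: new Uint8Array([2]) }
];
const updates = [1200, 1800, 2100, 2500, 3100].map((at) => ({ at, bytes: new Uint8Array([at % 256]) }));

const plan = planReplay(snapshots, updates, 2600);
console.log(plan.base.at, plan.toApply.map((u) => u.at)); // prints: 2000 [ 2100, 2500 ]
```

Assuming the stored bytes are Yjs updates, applying `plan.base.bytes` and then each entry of `plan.toApply` to a fresh `Y.Doc` with `Y.applyUpdate` would rebuild the document as it stood at the target time.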

We also built a diff view that shows exactly what changed between versions — something the lawyers use daily to review contract modifications.
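For a sense of what sits behind such a view, here is a word-level diff based on the longest common subsequence. It is a generic textbook technique under a hypothetical name (`diffWords`), not the diff engine shipped to the client, and it ignores formatting and document structure.

```javascript
// Word-level diff via longest common subsequence (LCS).
function diffWords(before, after) {
  const a = before.split(/\s+/).filter(Boolean);
  const b = after.split(/\s+/).filter(Boolean);
  // lcs[i][j] = length of the LCS of a[i..] and b[j..]
  const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  // Walk both word lists, emitting keep / remove / add operations.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push({ op: 'keep', word: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push({ op: 'remove', word: a[i++] });
    } else {
      out.push({ op: 'add', word: b[j++] });
    }
  }
  while (i < a.length) out.push({ op: 'remove', word: a[i++] });
  while (j < b.length) out.push({ op: 'add', word: b[j++] });
  return out;
}

const changes = diffWords(
  'Buyer shall pay within 30 days',
  'Buyer shall pay within 45 days'
).filter((c) => c.op !== 'keep');
console.log(changes);
```

For the two clauses above, the only non-`keep` entries are the removal of `30` and the addition of `45`, which is the kind of change a reviewer needs to see at a glance.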

The Results

After 6 weeks, we delivered:

  • Real-time collaboration with <50ms sync latency for documents up to 500KB
  • 200 concurrent users tested with load simulation (used k6 with WebSocket support)
  • Zero data loss during our 2-week trial with 50 real lawyers
  • Section renumbering fixed automatically — no more “3.1.1 vs 3.1.1” confusion
  • Audit trail with full version history, accessible via API
  • Cursor presence with user names and colors — lawyers could see who was editing

The client’s original estimate was 12 weeks and $250,000 for an in-house team. We delivered in 6 weeks for $48,000 total, with a team billed at $8,000/month, at the same quality. They’ve since hired two of our developers full-time.

More importantly, their user retention increased by 34% in the quarter after launch. Lawyers stopped abandoning the platform because of collaboration friction.

What We Learned

  • CRDTs are production-ready but need careful schema design for domain-specific structures like legal sections.
  • AI orchestration isn’t a gimmick — it caught a real race condition that our humans missed during code review. That alone saved weeks of debugging.
  • Vietnamese developers bring specific skills (like CRDT experience from edtech) that we’ve found hard to source from other offshore regions. Don’t just look at cost; look at the niche expertise.
  • Don’t over-abstract the collaboration layer. We tried to build a generic rich-text editor at first, but legal documents have unique needs (footnotes, cross-references) that forced us to customize ProseMirror schemas anyway.

The Bottom Line

Building real-time collaboration from scratch is hard. But with the right team composition — senior CRDT experts, AI-assisted code generation, and a clear architecture — you don’t need 12 weeks. You don’t need a massive budget. You need smart people in the right time zone, augmented by agents that handle the grunt work.

That’s what we delivered. And honestly, the lawyers haven’t complained once about document conflicts since.

Thinking about building something similar? Start with a small Vietnamese team and let AI handle the boring parts. You’ll be surprised what you can ship in a month.

Frequently Asked Questions

Q: Why did you choose Yjs over Automerge for this project?

A: Yjs has better performance for large documents — we saw <10ms sync for 200KB docs with 50 users, while Automerge struggled above 100KB. Yjs also has a more mature ProseMirror binding (`y-prosemirror`), which saved us weeks of integration work. For legal docs with lots of text, Yjs wins.

Q: How did you handle offline editing and later sync?

A: Yjs supports offline edits via its internal doc state. When the user comes back online, the provider syncs any pending operations. We added a conflict resolver that renumbers sections after sync, since offline edits can create gaps. The key was storing the Yjs doc state in IndexedDB on the client.

Q: Is ECOA AI Platform ACP required for this kind of project?

A: Not strictly, but it accelerated our development by roughly 3x. The AI agents generated boilerplate for CRDT operations, WebSocket handlers, and test cases. More importantly, the orchestration layer automatically routed code review tasks to the right human developer with full context, preventing bottlenecks. Without it, we’d have needed a larger team or longer timeline.

Q: How did you test the system for 200 concurrent users?

A: We used k6 with a WebSocket extension to simulate users connecting, typing, and disconnecting. Each virtual user ran a script that performed random edits on a shared document. We monitored sync latency, memory usage, and conflict rates. The bottleneck was CPU on the Yjs document server, not the database. We scaled horizontally by sharding documents across multiple Node.js processes.
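A sketch of the sharding idea: route each document ID to a fixed process so that every editor of a given document reaches the same in-memory Yjs state. The hash choice, the function name `shardFor`, and the sample IDs are assumptions for illustration, not the routing code we ran.

```javascript
const crypto = require('crypto');

// Map a document ID to one of `shardCount` processes. The same ID always maps to
// the same shard, so all editors of a document share one process.
function shardFor(docId, shardCount) {
  const digest = crypto.createHash('sha256').update(String(docId)).digest();
  // Take the first 4 bytes as an unsigned integer and reduce it to a shard index.
  return digest.readUInt32BE(0) % shardCount;
}

const ids = ['merger-agreement-17', 'nda-204', 'lease-9'];
console.log(ids.map((id) => shardFor(id, 4)));
```

One design caveat: with plain modulo hashing, changing the shard count remaps most documents, so a scheme like consistent hashing is worth considering if processes are added or removed while sessions are live.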
