How We Built a Real-Time Data Enrichment Pipeline for a Martech Startup in 3 Weeks — A Vietnam Offshore Case Study

Let me be blunt: most “real-time” pipelines are lies.

They’re batch jobs running on a cron schedule, pretending to be real-time. I’ve seen it a hundred times.

Generative Engine Optimization (GEO): How to Optimize Your Brand for AI Search Engines

Generative Engine Optimization (GEO) is reshaping how brands get discovered in the AI era. This guide explains how… ...

But this time was different. A Martech startup in San Francisco came to us with a real problem. They had 50 million customer records sitting in a mix of PostgreSQL, MongoDB, and S3 buckets. Every time a user visited their site, they needed to enrich that profile with firmographic data, social signals, and behavioral scores.

The old way? A nightly batch job that took 6 hours to complete.

Hire Vietnamese Developers: The Smart Strategy for Scaling Tech Teams in 2025

Educational rigor — Vietnam consistently ranks in the top 5 of the International Math Olympiad. The curriculum emphasizes… ...

By the time the data was fresh, it was already stale.

Here’s the exact architecture we built with a team of 4 senior Vietnamese engineers from our Ho Chi Minh City hub. We cut that 6-hour window to 18 minutes. No exaggeration.

The Problem: Stale Data Was Costing Them Revenue

The startup was losing deals because their sales team was looking at 6-hour-old data.

A prospect would visit their pricing page. The enrichment engine would fire. But the data would arrive 6 hours later, long after the prospect had moved on.

The numbers were brutal:

34% of enrichment jobs failed silently due to API rate limits
60% of their S3 data was orphaned (no one knew it existed)
Average enrichment latency: 6 hours

They needed real-time. Not “near real-time.” Real-time.

Why We Chose a Vietnamese Team for This

Look, I could have hired locally in SF for $200k/year per engineer. But that’s not how we work.

We hired from Can Tho and Ho Chi Minh City. Why?

Because Vietnamese engineers don’t just write code. They *optimize*. They’ve been through the wringer with legacy systems. They know what production looks like.

More importantly, they’re cost-effective. Our senior developers cost $3,000/month. That’s a fraction of what you’d pay in the US. And they’re vetted through the ECOA AI Platform ACP, which gives them a 5x efficiency boost.

But enough about the team. Let’s talk about the pipeline.

The Architecture: Kafka + Flink + Redis + PostgreSQL

Here’s what we built:


User Event -> Kafka -> Flink Stream Processor -> Redis Cache -> PostgreSQL (enriched) -> API Response

Wait, that’s too simple. Let me break it down.

We used:

Kafka as the event backbone (handles 10,000 events/second easily)
Flink for stream processing (because it’s stateful and handles exactly-once semantics)
Redis for caching (we cached firmographic lookups to avoid hitting APIs)
PostgreSQL as the source of truth

The key insight? We didn’t enrich everything.

We enriched *what mattered*.

The Enrichment Logic

python
def enrich_event(event):
    # Check cache first
    cache_key = f"enrich:{event['domain']}"
    cached = redis.get(cache_key)
    if cached:
        return cached
    
    # If not cached, hit the API
    data = fetch_firmographic_data(event['domain'])
    if data.get('status') == 'rate_limited':
        # Back off and retry
        time.sleep(2)
        data = fetch_firmographic_data(event['domain'])
    
    # Cache for 1 hour
    redis.setex(cache_key, 3600, json.dumps(data))
    
    return data

Simple, right? But it works.

The cache hit rate was 78%. That means 78% of our enrichment requests never hit the external API. We saved thousands of dollars in API costs.

The Results: 18 Minutes vs 6 Hours

Here’s what we tracked:

Metric	Before	After
Processing time	6 hours	18 minutes
API failure rate	34%	2%
Cache hit rate	0%	78%
Cost per month	$12,000	$3,500

We cut costs by 70%. That’s not a typo.

The Hard Part: Exactly-Once Semantics

The hardest part was ensuring exactly-once processing. Flink handles this natively, but you need to configure it right.

Here’s the config that saved us:

yaml
pipeline:
  checkpointing:
    interval: 60000  # 1 minute
    mode: EXACTLY_ONCE
  state:
    backend: rocksdb
    ttl: 3600000  # 1 hour

Without this, we’d have duplicate events. And duplicate events mean bad data.

What We Learned

Three things:

Cache everything you can. We cached firmographic data, social signals, and behavioral scores. The cache hit rate saved us.

Don’t over-engineer. We started with a simple Kafka -> Flink pipeline. It worked. We added complexity only when needed.

Vietnamese engineers are fast. Our team in Ho Chi Minh City built this in 3 weeks. A US-based team would have taken 6.

The Real Secret: ECOA AI Platform ACP

Our developers use the ECOA AI Platform ACP. It’s an AI agent orchestration platform that automates 80% of the boilerplate.

Here’s how it works:

The ACP generates the Kafka producer code
It handles the Flink stream processing logic
It even writes the Redis caching layer

Our developers just review the code. They don’t write it from scratch.

That’s how we achieved 5x efficiency.

Frequently Asked Questions

Q: How do you handle data consistency in a real-time pipeline?

A: Use Flink’s exactly-once semantics with checkpointing. We configured checkpointing every 60 seconds. If a node fails, Flink restarts from the last checkpoint. No data loss.

Q: What’s the biggest mistake companies make with real-time enrichment?

A: Not caching. Most companies hit the API for every single event. That’s expensive and slow. Cache firmographic data for 1 hour. You’ll save 70% of API costs.

Q: How do you scale this pipeline?

A: Add more Kafka partitions. Flink scales horizontally. We started with 3 partitions. Now we have 12. Each partition handles 1,000 events/second.

Q: Why hire Vietnamese developers for this?

A: Cost. A senior Flink developer in the US costs $200k/year. In Vietnam, it’s $36k/year. And they’re just as good. We’ve been doing this for 5 years. No quality issues.