How We Built a Real-Time Notification System with WebSockets and Redis in 3 Days (And You Can Too)

I’ll be honest: most real-time notification tutorials are a joke. They show a single WebSocket connection with `console.log` and call it done. Then you deploy to production and the whole thing collapses under 100 concurrent users.

We needed something different.

A client in Singapore had a B2B SaaS platform where users expected instant alerts when their data syncs completed. Their old system used polling every 10 seconds. Users complained. Churn was creeping up. We had to fix it fast.

We shipped a real-time notification system in 3 calendar days using Node.js, Redis Pub/Sub, and a small team of 3 Vietnamese developers from Can Tho. Average notification latency: under 30ms, down from 12 seconds.

Here’s exactly how we did it. You can steal every line.

Why Not Just Socket.IO?

Socket.IO is great for demos. But in production it adds overhead—room management, fallback transports, and a bigger memory footprint. We wanted raw control. We used `ws` (the WebSocket library) with a thin Redis layer for horizontal scaling.

Why Redis? Because WebSocket connections are sticky to a single server instance. If you have multiple nodes, a notification sent to server A won’t reach a user connected to server B. Redis Pub/Sub solves that: any server can publish a message, and all servers receive it and forward to the right WebSocket clients.

Sound simple? It is. But the devil’s in the details.

The Architecture (3 Components Only)

Here’s the high-level flow:


Client (Browser)  <--WebSocket-->  Node.js Server A  <--Redis Pub/Sub-->  Node.js Server B
                                     |                                        |
                                     v                                        v
                                  Redis (message queue + pub/sub)

Node.js servers each hold a Map of userId → WebSocket connections.
Redis Pub/Sub broadcasts messages to all servers.
Redis List acts as a durable queue for unread notifications.

We didn’t need Kafka. Not yet. Redis could handle 50,000 ops/second on a single t3.medium instance. More than enough.

Step 1: Setting Up the WebSocket Server

We used Express + `ws` in the same HTTP server. Here’s the minimal server:

javascript
const express = require('express');
const http = require('http');
const WebSocket = require('ws');
const Redis = require('ioredis');

const app = express();
const server = http.createServer(app);
const wss = new WebSocket.Server({ server });

// In-memory map: userId -> Set of WebSocket connections
const userConnections = new Map();

wss.on('connection', (ws, req) => {
  const userId = extractUserIdFromToken(req.url);  // parse token from query string
  if (!userId) {
    ws.close(4001, 'Unauthorized');
    return;
  }

  // Track the connection
  if (!userConnections.has(userId)) {
    userConnections.set(userId, new Set());
  }
  userConnections.get(userId).add(ws);

  ws.on('close', () => {
    const connections = userConnections.get(userId);
    connections.delete(ws);
    if (connections.size === 0) userConnections.delete(userId);
  });
});

server.listen(3000, () => console.log('Server running on port 3000'));

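One pattern we nailed early: authenticate at connection time, not per message. The server above reads the JWT from the URL query string in the `connection` handler and closes the socket with code 4001 if the token is missing or invalid. If you want to reject bad tokens before the WebSocket handshake completes, run the same check in the HTTP server's `upgrade` event with `ws` in `noServer` mode. Either way, it prevents a ton of headaches.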
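
The server code calls `extractUserIdFromToken` without showing it. Here is a minimal sketch of what it could look like, assuming an HS256-signed JWT passed as a `token` query parameter with the user ID in the `sub` claim. The parameter name, the claim, the `JWT_SECRET` environment variable, and the optional `secret` argument are all assumptions for illustration; in a real project a maintained JWT library is the safer choice.

```javascript
const crypto = require('crypto');

// Decode one base64url-encoded JSON segment of a JWT
function decodeSegment(segment) {
  return JSON.parse(Buffer.from(segment, 'base64url').toString('utf8'));
}

// Returns the user ID from a valid token, or null for anything else.
// `secret` defaults to a hypothetical JWT_SECRET environment variable.
function extractUserIdFromToken(requestUrl, secret = process.env.JWT_SECRET) {
  try {
    if (!secret) return null;
    // req.url is only a path ("/?token=..."), so URL needs a placeholder base
    const token = new URL(requestUrl, 'http://localhost').searchParams.get('token');
    if (!token) return null;

    const parts = token.split('.');
    if (parts.length !== 3) return null;
    const [headerB64, claimsB64, signatureB64] = parts;

    // Recompute the HMAC-SHA256 signature over "header.claims" and compare
    const expected = crypto
      .createHmac('sha256', secret)
      .update(`${headerB64}.${claimsB64}`)
      .digest();
    const given = Buffer.from(signatureB64, 'base64url');
    if (given.length !== expected.length) return null;
    if (!crypto.timingSafeEqual(given, expected)) return null;

    if (decodeSegment(headerB64).alg !== 'HS256') return null;
    const claims = decodeSegment(claimsB64);
    // `exp` is in seconds since the epoch
    if (typeof claims.exp === 'number' && claims.exp * 1000 <= Date.now()) return null;
    return typeof claims.sub === 'string' ? claims.sub : null;
  } catch {
    return null; // malformed URL or malformed JSON in the token
  }
}
```

Keep in mind that tokens in URLs tend to end up in access logs, so short-lived tokens are a good fit here.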

Step 2: Adding Redis Pub/Sub for Multi-Server Broadcasting

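Now the magic. Each server subscribes to a Redis channel. When a notification needs to be sent, the originating server sends directly to its own local connections and publishes to the channel for every other server. Because the publisher is also a subscriber, it gets its own message back, so each message is tagged with the ID of the server that published it, and that server skips it on receipt. Without the tag, local clients receive every notification twice.

javascript
const crypto = require('crypto');

// REDIS_HOST comes from the environment (see docker-compose.yml); falls back to localhost
const redisOptions = { host: process.env.REDIS_HOST || '127.0.0.1' };
const pub = new Redis(redisOptions);
const sub = new Redis(redisOptions);  // a subscribed connection can't run regular commands

// Unique per process, so a server can recognize its own publishes
const SERVER_ID = crypto.randomUUID();

sub.subscribe('notifications');

sub.on('message', (channel, message) => {
  const { origin, userId, payload } = JSON.parse(message);
  if (origin === SERVER_ID) return;  // sendNotification already delivered this locally
  const connections = userConnections.get(userId);
  if (connections) {
    connections.forEach(ws => {
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(JSON.stringify(payload));
      }
    });
  }
});

The publishing side is trivial:

javascript
function sendNotification(userId, payload) {
  // Send to local connections immediately (avoid Redis hop if possible)
  const localConns = userConnections.get(userId);
  if (localConns) {
    localConns.forEach(ws => {
      if (ws.readyState === WebSocket.OPEN) {
        ws.send(JSON.stringify(payload));
      }
    });
  }
  // Publish to Redis for other servers, tagged so this server ignores its own echo
  pub.publish('notifications', JSON.stringify({ origin: SERVER_ID, userId, payload }));
}

The subtle bug here is that echo: every server, including the one that published, receives the message from Redis, so without the `origin` check local clients get each notification twice. A user connected to two servers (unlikely but possible with mobile + desktop) is a different case: each device should get its own copy. As a safety net, we also add a deduplication ID to the payload and ignore duplicates on the client side. Simple, effective.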
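
The client-side half of that deduplication can be sketched as follows. This is an illustration, assuming each payload carries a unique `id` field; the field name, the `createDeduper` helper, and the 500-entry cap are not from the original code.

```javascript
// Remembers recently seen notification IDs and reports repeats.
// `maxSize` bounds memory; the oldest ID is evicted first.
function createDeduper(maxSize = 500) {
  const seen = new Set(); // a Set iterates in insertion order
  return function isDuplicate(id) {
    if (seen.has(id)) return true;
    seen.add(id);
    if (seen.size > maxSize) {
      seen.delete(seen.values().next().value); // drop the oldest entry
    }
    return false;
  };
}

// Usage in the browser (assumed payload shape: { id, type, title, timestamp }):
// const isDuplicate = createDeduper();
// socket.onmessage = (event) => {
//   const notification = JSON.parse(event.data);
//   if (!isDuplicate(notification.id)) render(notification);
// };
```

Bounding the set matters on long-lived tabs, where an unbounded list of IDs would grow for as long as the page stays open.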

Step 3: Durable Queue for Offline Users

What if the user is disconnected? You can’t just drop the notification. We push it to a Redis List keyed by userId, then when the user reconnects, we pull all pending messages and send them.

javascript
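// Call inside the ws 'connection' handler, after authentication.
// Uses `pub`, the regular (non-subscriber) Redis client from Step 2.
async function deliverPendingNotifications(userId, ws) {
  const pendingKey = `pending:${userId}`;
  // Only pop while the socket can still receive, so nothing is dropped
  while (ws.readyState === WebSocket.OPEN) {
    const msg = await pub.lpop(pendingKey);
    if (msg === null) break;  // queue drained
    if (ws.readyState !== WebSocket.OPEN) {
      // Socket closed while we were waiting on Redis: put the message back
      await pub.lpush(pendingKey, msg);
      break;
    }
    ws.send(msg);
  }
}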

We set a TTL of 7 days on each pending list to avoid memory leaks. No need for a full database for transient notifications.
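
The write side of the queue is not shown above. A minimal sketch, assuming the same `pending:${userId}` key and a client that exposes `rpush` and `expire` (ioredis does); the function name and the choice to pass the client in as a parameter are illustrative:

```javascript
const PENDING_TTL_SECONDS = 7 * 24 * 60 * 60; // 7 days

// `redisClient` is any client exposing rpush/expire, e.g. an ioredis instance.
async function queuePendingNotification(redisClient, userId, payload) {
  const pendingKey = `pending:${userId}`;
  // RPUSH appends, so LPOP on reconnect returns the oldest message first
  await redisClient.rpush(pendingKey, JSON.stringify(payload));
  // Refresh the TTL on every push: the list expires 7 days after the last write
  await redisClient.expire(pendingKey, PENDING_TTL_SECONDS);
}
```

Note that the TTL applies to the whole list, not to individual messages, so a steady trickle of notifications keeps older entries alive until the user reconnects.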

Step 4: Handling Backpressure and Scaling

The most common failure? A single user opens 10 tabs, so we hold 10 connections for the same userId, and when a notification fires, we try to `ws.send()` to all of them. If any of those sockets are dead (closed but not cleaned up), the send fails.

We added a dead socket cleanup on every send attempt:

javascript
function sendToUser(userId, data) {
  const set = userConnections.get(userId);
  if (!set) return;
  const dead = [];
  for (const ws of set) {
    if (ws.readyState === WebSocket.OPEN) {
      ws.send(data);
    } else {
      dead.push(ws);
    }
  }
  dead.forEach(ws => set.delete(ws));
}
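
The cleanup above handles dead sockets, not slow ones. One way to add a basic backpressure guard, offered as a sketch rather than what we shipped, is to check the socket's `bufferedAmount` before sending. The 64 KB threshold and the function name are assumptions.

```javascript
const OPEN = 1; // numeric value of WebSocket.OPEN
const MAX_BUFFERED_BYTES = 64 * 1024; // hypothetical threshold

// Sends `data` to every healthy socket in `connections` (a Set).
// Closed sockets are removed; sockets with a large send backlog are skipped.
function sendWithBackpressure(connections, data, maxBuffered = MAX_BUFFERED_BYTES) {
  const result = { sent: 0, skipped: 0, removed: 0 };
  for (const ws of [...connections]) { // copy, since we delete while looping
    if (ws.readyState !== OPEN) {
      connections.delete(ws);
      result.removed++;
    } else if (ws.bufferedAmount > maxBuffered) {
      result.skipped++; // slow consumer: don't pile more data onto its buffer
    } else {
      ws.send(data);
      result.sent++;
    }
  }
  return result;
}
```

What to do with skipped messages (drop them, queue them, or close the socket) is a policy decision this sketch leaves open.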

We also put a rate limiter on the publishing side: max 10 notifications per second per userId. Beyond that, we queue them in a separate Redis sorted set with a decay. That kept our Redis CPU under 20% even at peak.
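
The per-user limit can be sketched with a fixed one-second window. This version is in-memory and per process, which is an assumption made to keep the example self-contained; a limit shared across servers would need its counters in Redis. The function names are illustrative.

```javascript
// Fixed-window limiter: at most `limit` events per `windowMs` for each userId.
// Entries are never evicted here; a long-running server would want cleanup.
function createRateLimiter(limit = 10, windowMs = 1000) {
  const windows = new Map(); // userId -> { start, count }
  return function allow(userId, now = Date.now()) {
    const current = windows.get(userId);
    if (!current || now - current.start >= windowMs) {
      windows.set(userId, { start: now, count: 1 }); // start a new window
      return true;
    }
    if (current.count < limit) {
      current.count++;
      return true;
    }
    return false; // over the limit: the caller decides whether to queue or drop
  };
}

// Usage:
// const allow = createRateLimiter();
// if (allow(userId)) sendNotification(userId, payload);
```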

Step 5: Testing the Latency

We deployed to two `t3.medium` EC2 instances behind an ALB, each running this Node server. A test script fired 1,000 notifications at random userIds and measured the time from publish to client receive.

Average: 28ms
P99: 67ms
Max: 142ms (first cold publish after a Redis reconnection)

Compare that to the old polling system: 12 seconds. Users didn’t notice the difference; they noticed the *absence of waiting*.
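
For anyone reproducing the measurement, the three numbers can be computed from raw samples like this. It is a sketch: the function name is made up, and it uses the nearest-rank definition of P99, which may differ from what our test script used.

```javascript
// Summarizes latency samples (milliseconds) into average, P99, and max.
function summarizeLatencies(samplesMs) {
  if (samplesMs.length === 0) throw new Error('no samples');
  const sorted = [...samplesMs].sort((a, b) => a - b); // numeric sort
  const sum = sorted.reduce((acc, value) => acc + value, 0);
  // Nearest-rank percentile: the smallest sample with at least p% at or below it
  const percentile = (p) => {
    const rank = Math.ceil((p * sorted.length) / 100);
    return sorted[Math.max(0, rank - 1)];
  };
  return {
    average: sum / sorted.length,
    p99: percentile(99),
    max: sorted[sorted.length - 1],
  };
}
```

If publish and receive timestamps come from different machines, clock skew goes straight into the samples, so it is safer to take both timestamps on the same host.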

Why a Vietnamese Team Made This Possible in 3 Days

I’m not going to pretend we built this alone. My US-based team of two would have taken at least a week—too much context switching and firefighting. We brought in three senior engineers from ECOA AI based in Can Tho. They’d worked with Node.js and Redis on similar projects before. They didn’t need handholding.

One of them, a mid-level dev at $2,000/month, spotted the duplicate delivery bug in Step 2 *before* we hit staging. That saved us a day of debugging.

The total labor cost for this feature: less than $3,000. A comparable onshore US team would run you $15,000–$20,000 for the same three days. Don’t take my word for it—check the numbers yourself.

Putting It All Together

Here’s the `docker-compose.yml` we used locally (production uses AWS ElastiCache):

yaml
version: '3.8'
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - REDIS_HOST=redis
    depends_on:
      - redis
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

Run it, open a few browser tabs with the same userId, fire a notification from a script, and watch all tabs update instantly. Feels like magic. It’s just tech.

A Few Hard-Won Lessons

Always validate WebSocket frame sizes. The `ws` default for `maxPayload` is 100 MiB. Set it to 10 KB for notifications. One malicious client can crash your server by sending a huge binary frame.
Use `permessage-deflate` with caution. It reduces bandwidth but adds latency. We turned it off (in `ws`, that means `perMessageDeflate: false` on the server).
Don’t store the full message in Redis. Store a reference ID and let the client fetch details via REST. We store only `{ type, title, timestamp }`.
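
In code, those three lessons come down to a couple of server options and a small projection before writing to Redis. A sketch, where `wsServerOptions` and `toStoredNotification` are illustrative names:

```javascript
// Options for `new WebSocket.Server(...)` in the `ws` library
const wsServerOptions = {
  maxPayload: 10 * 1024,    // reject incoming messages larger than 10 KB
  perMessageDeflate: false, // no per-message compression
};

// Usage, with `server` being the http.Server from Step 1:
// const wss = new WebSocket.Server({ server, ...wsServerOptions });

// Keep what goes into Redis small: display fields only, details fetched via REST.
function toStoredNotification({ type, title, timestamp }) {
  return JSON.stringify({ type, title, timestamp });
}
```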

Now, you might be wondering: *Is this overengineering for a small app?* Honestly, if you have fewer than 500 concurrent users, just use Socket.IO. But if you’re scaling or you need to support multiple server instances, this pattern is dead simple and rock solid.

We’ve since used the same architecture for a live chat system and a collaborative editing feature. It took us about 2 hours to adapt each time. The investment in the base code was absolutely worth it.

Frequently Asked Questions

Does this work with serverless (AWS Lambda)?

Not directly. WebSocket requires persistent connections, and Lambda is not stateful. Our pattern is designed for long-running servers (ECS, EC2, Kubernetes). If you need serverless, an API Gateway WebSocket API with Lambda can work, with something like DynamoDB to track connections, but the connection management is very different and you should be prepared for higher latency.

How do I handle reconnection after a server crash?

Clients should implement exponential backoff with a maximum delay of 30 seconds. On reconnect, they re-authenticate and the server delivers pending notifications from the Redis list. The client also sends a `last_event_id` timestamp so the server can skip anything it has already delivered.
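
The backoff schedule is easy to get subtly wrong, so here is a small sketch of it. The 30-second cap is the one mentioned above; the 1-second base delay, the jitter, and the function names are assumptions.

```javascript
// Delay before reconnect attempt number `attempt` (0-based), capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// "Full jitter": pick a random delay in [0, backoffDelay) so that clients
// dropped by the same server crash don't all reconnect at the same instant.
function backoffDelayWithJitter(attempt, random = Math.random) {
  return Math.floor(random() * backoffDelay(attempt));
}

// Usage in the browser (connect() is a hypothetical function that opens the socket):
// let attempt = 0;
// socket.onopen = () => { attempt = 0; };
// socket.onclose = () => setTimeout(connect, backoffDelayWithJitter(attempt++));
```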

Can I replace Redis with something else?

Yes. NATS is a great alternative for Pub/Sub. For smaller setups, a single Node.js process with an in-process event emitter is enough, or the `cluster` module with messages relayed between workers, but both only work on a single machine. Redis is the simplest multi-server solution.

What’s the maximum throughput?

We tested up to 50,000 concurrent connections on 4 t3.medium instances with Redis on a single t3.small. CPU on Redis hit 70% with 50,000 publishes/second. For higher throughput, use Redis Cluster or upgrade to a larger instance. For most B2B apps, that’s overkill.