
Architecture & Patterns

May 22, 2025

12 min read

Streaming API Design Patterns: SSE, WebSockets, gRPC, and Webhooks

How to expose streaming data to downstream applications using SSE, WebSockets, gRPC streaming, webhooks, and long polling — with code examples and trade-offs.

You have built a streaming pipeline. Data flows from your databases through CDC, gets processed, and lands in your destinations. Now an application team asks: “How do we consume this stream in our app?”

This is the last-mile problem of streaming architecture. The internal pipeline is running, but applications — dashboards, mobile apps, microservices — need a way to receive updates in real time. The API pattern you choose determines latency, scalability, complexity, and how well your system handles failures.

This guide covers the five main patterns for exposing streaming data to downstream consumers, with code examples and guidance on when to use each.

Pattern 1: Server-Sent Events (SSE)

SSE is a one-way streaming protocol built on HTTP. The server sends a stream of events to the client over a long-lived HTTP connection. The client cannot send data back on the same connection.

How It Works

The client makes a regular HTTP GET request with Accept: text/event-stream. The server responds with Content-Type: text/event-stream and keeps the connection open, writing events as they occur.

Client                          Server
  │                                │
  │── GET /events (Accept: text/   │
  │   event-stream) ──────────>    │
  │                                │
  │<── 200 OK                      │
  │    Content-Type: text/         │
  │    event-stream                │
  │                                │
  │<── data: {"user": "alice"}     │
  │                                │
  │<── data: {"user": "bob"}       │
  │                                │
  │    (connection stays open)     │

Server Implementation (Node.js)

const express = require('express');
const app = express();

// In-memory client tracking
const clients = new Set();

app.get('/events', (req, res) => {
  // SSE headers
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
    'X-Accel-Buffering': 'no',  // Disable nginx buffering
  });

  // Send initial connection event
  res.write(`event: connected\ndata: ${JSON.stringify({ status: 'ok' })}\n\n`);

  clients.add(res);

  // Handle client disconnect
  req.on('close', () => {
    clients.delete(res);
  });
});

// Broadcast an event to all connected clients
function broadcast(eventType, data) {
  const payload = `event: ${eventType}\nid: ${data.id}\ndata: ${JSON.stringify(data)}\n\n`;
  for (const client of clients) {
    client.write(payload);
  }
}

// Example: forward CDC events to SSE clients
kafkaConsumer.on('message', (message) => {
  const event = JSON.parse(message.value);
  broadcast('change', {
    id: `${message.offset}`,
    table: event.source.table,
    operation: event.op,
    after: event.after,
  });
});

Client Implementation (Browser)

const eventSource = new EventSource('/events');

eventSource.addEventListener('change', (event) => {
  const data = JSON.parse(event.data);
  console.log(`${data.operation} on ${data.table}:`, data.after);
});

// Automatic reconnection is built in
// The browser sends Last-Event-ID header on reconnect
eventSource.addEventListener('error', (event) => {
  console.log('Connection lost, reconnecting...');
  // EventSource auto-reconnects with exponential backoff
});
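The browser resends Last-Event-ID, but the server code above never acts on it. Honoring it requires retaining recent events somewhere the server can replay from. A minimal in-memory buffer is enough for a sketch (`EventBuffer` is a hypothetical helper; in production, seeking a Kafka offset serves the same purpose):

```javascript
// Keep the last `capacity` events so reconnecting clients can catch up.
// Sketch only: a real deployment would replay from durable storage
// (e.g. a Kafka topic) rather than process memory.
class EventBuffer {
  constructor(capacity = 1000) {
    this.capacity = capacity;
    this.events = []; // [{ id, payload }], oldest first
  }

  push(id, payload) {
    this.events.push({ id, payload });
    if (this.events.length > this.capacity) this.events.shift();
  }

  // Return every event after the given id, or the whole buffer if the
  // id is unknown (evicted, or the client never received an event).
  since(lastEventId) {
    const idx = this.events.findIndex((e) => e.id === lastEventId);
    return idx === -1 ? this.events.slice() : this.events.slice(idx + 1);
  }
}
```

In the `/events` handler, before adding the client to the broadcast set, the server would read `req.headers['last-event-id']` and write `buffer.since(lastId)` to the response.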

Key Properties

Property          Details
Direction         Server → Client only
Protocol          HTTP/1.1 or HTTP/2
Reconnection      Built-in (browser sends Last-Event-ID)
Data format       Text (typically JSON in data: fields)
Browser support   All modern browsers
Proxy-friendly    Yes — standard HTTP

When to Use SSE

  • Live dashboards showing database changes
  • Notification feeds
  • Log streaming UIs
  • Any case where data flows one way and you want simplicity

SSE is underrated. For most real-time UI use cases, you do not need bidirectional communication, and SSE gives you automatic reconnection for free.

Pattern 2: WebSockets

WebSockets provide full-duplex, bidirectional communication over a single TCP connection. After an HTTP upgrade handshake, the connection switches to the WebSocket protocol.

Server Implementation (Node.js)

const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8080 });

const subscriptions = new Map(); // client -> Set of topics

wss.on('connection', (ws) => {
  subscriptions.set(ws, new Set());

  ws.on('message', (raw) => {
    const msg = JSON.parse(raw);

    if (msg.type === 'subscribe') {
      subscriptions.get(ws).add(msg.topic);
      ws.send(JSON.stringify({ type: 'subscribed', topic: msg.topic }));
    }

    if (msg.type === 'unsubscribe') {
      subscriptions.get(ws).delete(msg.topic);
    }
  });

  ws.on('close', () => {
    subscriptions.delete(ws);
  });

  // Heartbeat to detect stale connections
  ws.isAlive = true;
  ws.on('pong', () => { ws.isAlive = true; });
});

// Ping all clients every 30 seconds
setInterval(() => {
  wss.clients.forEach((ws) => {
    if (!ws.isAlive) return ws.terminate();
    ws.isAlive = false;
    ws.ping();
  });
}, 30000);

// Forward events to subscribed clients
function publishEvent(topic, event) {
  wss.clients.forEach((ws) => {
    if (ws.readyState === WebSocket.OPEN && subscriptions.get(ws)?.has(topic)) {
      ws.send(JSON.stringify({ topic, event }));
    }
  });
}

Client with Reconnection

class ReconnectingWebSocket {
  constructor(url, options = {}) {
    this.url = url;
    this.maxRetries = options.maxRetries || 10;
    this.baseDelay = options.baseDelay || 1000;
    this.retries = 0;
    this.lastEventId = null;
    this.handlers = new Map();
    this.connect();
  }

  connect() {
    const url = this.lastEventId
      ? `${this.url}?resumeFrom=${this.lastEventId}`
      : this.url;

    this.ws = new WebSocket(url);

    this.ws.onopen = () => {
      this.retries = 0;
      // Re-subscribe to topics after reconnection
      for (const topic of this.handlers.keys()) {
        this.ws.send(JSON.stringify({ type: 'subscribe', topic }));
      }
    };

    this.ws.onmessage = (msg) => {
      const data = JSON.parse(msg.data);
      if (data.event?.id) this.lastEventId = data.event.id;
      const handler = this.handlers.get(data.topic);
      if (handler) handler(data.event);
    };

    this.ws.onclose = () => this.reconnect();
    this.ws.onerror = () => this.ws.close();
  }

  reconnect() {
    if (this.retries >= this.maxRetries) return;
    const delay = this.baseDelay * Math.pow(2, this.retries);
    this.retries++;
    setTimeout(() => this.connect(), delay);
  }

  subscribe(topic, handler) {
    this.handlers.set(topic, handler);
    if (this.ws.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify({ type: 'subscribe', topic }));
    }
  }
}
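One refinement worth making to `reconnect` above: the delay is deterministic, so if a server restart drops thousands of clients at once, they all retry on the same schedule and arrive back in synchronized waves. Adding random jitter spreads the reconnections out. A sketch of "full jitter" backoff (`jitteredDelay` is a hypothetical helper):

```javascript
// Full-jitter backoff: pick a uniformly random delay between 0 and the
// capped exponential bound, so simultaneous disconnects don't produce
// a synchronized reconnect stampede.
function jitteredDelay(retries, baseDelay = 1000, maxDelay = 30000) {
  const cap = Math.min(maxDelay, baseDelay * Math.pow(2, retries));
  return Math.random() * cap;
}
```

It drops in for the fixed delay in `reconnect()`: `const delay = jitteredDelay(this.retries, this.baseDelay);`.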

Scaling WebSocket Connections

WebSocket connections are stateful — each connection is tied to a specific server process. This creates a scaling challenge:

  1. Sticky sessions: Use a load balancer with session affinity so reconnections hit the same server. Works for small deployments, but creates hot spots.

  2. Pub/sub backplane: Put Redis Pub/Sub or Kafka behind the WebSocket servers. Any server can serve any client because events are distributed through the shared backplane.

Client A ──> WS Server 1 ──┐
Client B ──> WS Server 2 ──┼──> Redis Pub/Sub <── Event Producer
Client C ──> WS Server 1 ──┘

  3. Connection limits: A single Node.js process handles 10,000-50,000 concurrent connections depending on message rate and payload size. Beyond that, add more processes.

When to Use WebSockets

  • Bidirectional communication (chat, collaborative editing)
  • Cases where clients need to send control messages (subscribe/unsubscribe, filters)
  • When you need the lowest possible latency and are willing to manage connection state

Pattern 3: gRPC Server Streaming

gRPC streaming uses HTTP/2 multiplexing and Protocol Buffers for efficient binary communication between services. It is primarily for internal service-to-service communication rather than browser-facing APIs.

Protocol Definition

syntax = "proto3";

service EventStream {
  // Client sends a subscription request, server streams events
  rpc Subscribe(SubscribeRequest) returns (stream ChangeEvent);
}

message SubscribeRequest {
  string topic = 1;
  string start_from = 2;  // Event ID to resume from
}

message ChangeEvent {
  string id = 1;
  string table = 2;
  string operation = 3;   // INSERT, UPDATE, DELETE
  bytes payload = 4;       // Serialized row data
  int64 timestamp = 5;
}

Server Implementation (Python)

import grpc
from concurrent import futures
import event_stream_pb2
import event_stream_pb2_grpc

class EventStreamServicer(event_stream_pb2_grpc.EventStreamServicer):
    def Subscribe(self, request, context):
        topic = request.topic
        start_from = request.start_from

        # Create a Kafka consumer for this client's stream
        consumer = create_consumer(topic, start_from)

        try:
            for message in consumer:
                if context.is_active():
                    event = event_stream_pb2.ChangeEvent(
                        id=str(message.offset),
                        table=message.key.decode(),
                        operation=message.headers.get('op', 'UNKNOWN'),
                        payload=message.value,
                        timestamp=message.timestamp,
                    )
                    yield event
                else:
                    break  # Client disconnected
        finally:
            consumer.close()

server = grpc.server(futures.ThreadPoolExecutor(max_workers=50))
event_stream_pb2_grpc.add_EventStreamServicer_to_server(
    EventStreamServicer(), server
)
server.add_insecure_port('[::]:50051')
server.start()
server.wait_for_termination()  # Block the main thread while serving

Key Properties

Property          Details
Direction         Unary, server-streaming, client-streaming, or bidirectional
Protocol          HTTP/2
Serialization     Protocol Buffers (binary, strongly typed)
Flow control      Built-in HTTP/2 flow control
Browser support   Requires gRPC-Web proxy
Backpressure      Native — receiver controls flow via HTTP/2 window updates

When to Use gRPC Streaming

  • Internal microservice communication
  • When you need strong typing and schema enforcement
  • High-throughput, low-latency service-to-service data distribution
  • When Protocol Buffers’ binary encoding matters for bandwidth

Pattern 4: Webhooks

Webhooks invert the connection model. Instead of clients connecting to your server, your server sends HTTP POST requests to URLs registered by the client.

Architecture

Event Source ──> Webhook Dispatcher ──> POST https://client-a.com/webhook
                                   ──> POST https://client-b.com/webhook

Dispatcher Implementation

import httpx
import asyncio
from dataclasses import dataclass

@dataclass
class WebhookSubscription:
    url: str
    secret: str
    events: list[str]

class WebhookDispatcher:
    def __init__(self, max_retries=5):
        self.subscriptions: list[WebhookSubscription] = []
        self.max_retries = max_retries
        self.client = httpx.AsyncClient(timeout=10.0)

    async def dispatch(self, event_type: str, payload: dict):
        tasks = []
        for sub in self.subscriptions:
            if event_type in sub.events:
                tasks.append(self.deliver(sub, event_type, payload))
        await asyncio.gather(*tasks, return_exceptions=True)

    async def deliver(self, sub: WebhookSubscription, event_type: str, payload: dict):
        import hmac, hashlib, json

        body = json.dumps(payload)
        signature = hmac.new(
            sub.secret.encode(), body.encode(), hashlib.sha256
        ).hexdigest()

        headers = {
            'Content-Type': 'application/json',
            'X-Event-Type': event_type,
            'X-Signature': f'sha256={signature}',
        }

        for attempt in range(self.max_retries):
            try:
                response = await self.client.post(sub.url, content=body, headers=headers)
                if response.status_code < 300:
                    return  # Success
                if response.status_code >= 500:
                    raise Exception(f"Server error: {response.status_code}")
                return  # 3xx/4xx: not retryable, give up immediately
            except Exception:
                if attempt == self.max_retries - 1:
                    break  # Retries exhausted; log or dead-letter the event
                delay = min(2 ** attempt, 60)  # Exponential backoff, cap at 60s
                await asyncio.sleep(delay)

Webhook Challenges

No backpressure: The sender has no way to know if the receiver is overwhelmed until requests start failing. Rate limiting on the sender side is essential.

Delivery guarantees: Webhooks are at-least-once at best. If the receiver processes the event but crashes before sending a 200 response, the sender retries, and the receiver gets a duplicate. Receivers must be idempotent.

Ordering: With concurrent delivery and retries, events can arrive out of order. Include a sequence number or timestamp so receivers can detect and handle reordering.
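The duplicate and ordering problems compose into one receiver-side pattern: track processed event IDs, and track the highest sequence number applied. A sketch (`WebhookReceiver` is hypothetical; a durable store would replace the in-memory state in production):

```javascript
// Idempotent, order-aware webhook handling. Sketch only: the processed
// set and sequence counter would live in a durable store in production.
class WebhookReceiver {
  constructor() {
    this.processed = new Set(); // event IDs already handled
    this.lastSeq = -1;          // highest sequence number applied
    this.applied = [];          // stand-in for real side effects
  }

  handle(event) {
    if (this.processed.has(event.id)) return 'duplicate'; // retry replay
    this.processed.add(event.id);
    if (event.seq <= this.lastSeq) return 'stale';        // arrived out of order
    this.lastSeq = event.seq;
    this.applied.push(event);
    return 'applied';
  }
}
```

Note that discarding stale events assumes last-write-wins semantics; if every event must be applied, the receiver would buffer out-of-order events and apply them in sequence instead.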

When to Use Webhooks

  • Integrating with external systems that you do not control
  • Low-frequency events (order placed, user signed up, pipeline failed)
  • Cases where the receiver cannot maintain a persistent connection

Pattern 5: Long Polling

Long polling is the simplest approach — the client makes an HTTP request, and the server holds the request open until new data is available or a timeout expires.

// Client-side long polling
async function longPoll(cursor) {
  try {
    const response = await fetch(`/api/events?cursor=${cursor}&timeout=30`);
    const data = await response.json();

    for (const event of data.events) {
      processEvent(event);
    }

    // Immediately poll again with the new cursor
    longPoll(data.nextCursor);
  } catch (error) {
    // Retry with backoff
    setTimeout(() => longPoll(cursor), 5000);
  }
}

longPoll('latest');

Long polling works everywhere (any HTTP client), requires no special infrastructure, and naturally handles backpressure (the client controls the polling rate). The downsides: higher latency than persistent connections, more HTTP overhead, and each poll cycle has a reconnection cost.
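On the server, each poll reduces to: return everything after the client's cursor immediately, or hold the request until something arrives or the timeout fires. The cursor arithmetic alone, separated from the HTTP handling, is a few lines (a sketch over a hypothetical append-only event log where an event's index is its cursor):

```javascript
// Given an append-only log where each event's index is its cursor,
// return the events after `cursor` plus the cursor for the next poll.
// 'latest' means "start from now": return nothing, point at the tail.
function readSince(log, cursor) {
  const start = cursor === 'latest' ? log.length : Number(cursor);
  return { events: log.slice(start), nextCursor: String(log.length) };
}
```

The HTTP handler would call `readSince` once, and if the result is empty, wait for new appends (or the timeout) before responding.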

When to Use Long Polling

  • Environments where WebSockets and SSE are blocked (corporate proxies)
  • Simple integrations where sub-second latency is not required
  • As a fallback when other protocols are not available

Choosing the Right Pattern

Factor              SSE              WebSocket         gRPC           Webhook                Long Poll
Direction           Server→Client    Bidirectional     Any            Server→Client          Client→Server
Latency             Low              Lowest            Low            Medium                 Medium-High
Backpressure        None (TCP)       Manual            Native         None                   Natural
Reconnection        Built-in         Manual            Manual         N/A                    Built-in
Browser support     Native           Native            Needs proxy    N/A                    Native
Proxy-friendly      Yes              Sometimes         Needs HTTP/2   Yes                    Yes
Scaling difficulty  Low              Medium            Medium         Low                    Low
Best for            Dashboards       Interactive apps  Microservices  External integrations  Fallback

Decision Guide

  1. Is the client a browser showing real-time data? Start with SSE. Only move to WebSockets if you need the client to send frequent messages back.

  2. Is this internal service-to-service? Use gRPC streaming for strong typing and flow control.

  3. Is the receiver an external system? Use webhooks. They are the standard for cross-organization event delivery.

  4. Is the client unable to hold a persistent connection? Use long polling.

  5. Do you need bidirectional communication with a browser? WebSockets.

Connection Management Best Practices

Regardless of which pattern you choose, these apply universally:

Heartbeats: Send periodic pings to detect dead connections. SSE can send comment lines (: ping), WebSockets have native ping/pong frames, gRPC has keepalive pings.
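For SSE, the heartbeat is just a comment frame, which EventSource clients ignore but which keeps proxies and load balancers from closing an idle stream. A sketch against the `clients` set from the SSE server earlier:

```javascript
// Lines starting with ':' are SSE comments: ignored by EventSource,
// but enough traffic to keep intermediaries from timing out the stream.
function heartbeatFrame() {
  return ': ping\n\n';
}

// Usage: setInterval(() => {
//   for (const client of clients) client.write(heartbeatFrame());
// }, 15000);
```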

Cursor-based resumption: Every event should have an ID. When a client reconnects, it sends the last ID it received. The server resumes from that point. This is critical for data completeness during reconnections.

Graceful degradation: If the streaming connection fails, fall back to periodic REST API calls. Do not let a broken stream leave the client with stale data.

Authentication on reconnect: Re-authenticate on every reconnection. Tokens may have expired during the disconnection period. For WebSockets, pass the token in the initial HTTP upgrade or as the first message.

Buffering at the source: If a client disconnects, buffer recent events so it can catch up on reconnection without replaying the entire stream. A Kafka-backed architecture naturally provides this — events are retained in the topic and clients can seek to any offset.


Ready to build real-time APIs on top of streaming data? Streamkap handles the upstream pipeline — CDC, processing, and delivery to your streaming infrastructure — so you can focus on the API layer that serves your applications. Start a free trial or learn more about the platform.