
Engineering

February 25, 2026

11 min read

CQRS and Stream Processing: Separating Reads and Writes at Scale

How to implement CQRS using CDC and stream processing. Build optimized read models from write-side changes in real time with Kafka and Flink.

TL;DR:

  • CQRS separates the write model (optimized for transactions) from read models (optimized for queries).
  • CDC provides the bridge - capturing writes and streaming them to build read-side projections.
  • Stream processing with Flink transforms and enriches the change stream before materializing read models.
  • This pattern scales reads independently from writes and supports multiple read model formats.

Most applications start with a single database that handles both reads and writes. You insert a row, then query it back from the same table. It works well enough at small scale, but the cracks appear as traffic grows and query patterns diverge from your write schema.

Your transactional tables are normalized for write correctness - foreign keys, constraints, minimal redundancy. But your read queries want something different: denormalized views, pre-computed aggregates, full-text search indexes. You end up adding read replicas, caching layers, and materialized views as afterthoughts, each one a band-aid on a structural mismatch.

CQRS (Command Query Responsibility Segregation) addresses this mismatch directly. Instead of forcing one data model to serve two fundamentally different access patterns, you separate them. The write side stays optimized for transactions. The read side gets its own models, purpose-built for whatever queries you need to serve. And the bridge between them? That is where CDC and stream processing come in.

The Core Idea Behind CQRS

The term was coined by Greg Young, building on Bertrand Meyer’s Command Query Separation (CQS) principle. CQS says that a method should either change state or return a result, never both. CQRS takes this further and applies it at the architectural level: the model you write to and the model you read from are separate systems.

On the write side, you have your transactional database - PostgreSQL, MySQL, MongoDB, whatever fits your domain. Your application issues commands (create order, update inventory, cancel subscription) that modify this database. The schema is normalized, enforces constraints, and prioritizes write throughput and data integrity.

On the read side, you have one or more stores optimized for specific query patterns. An Elasticsearch cluster for full-text search. A Redis cache for low-latency key lookups. A denormalized PostgreSQL schema for analytical dashboards. A ClickHouse instance for OLAP queries. Each store holds a projection of the write-side data, shaped exactly for the queries it needs to serve.

The key question is: how do changes on the write side get propagated to the read side? This is where the pattern gets interesting - and where many teams get stuck.

CDC as the Bridge Between Write and Read

The traditional approach to syncing read models is dual writes - your application writes to the primary database and also writes to each read store. This is fragile. If the write to Elasticsearch succeeds but the write to Redis fails, your read models are inconsistent. You need distributed transactions or complex retry logic, and you have coupled your application to every downstream store.

Change Data Capture (CDC) solves this cleanly. Instead of making your application aware of read-side stores, you capture changes directly from the database’s transaction log. PostgreSQL has its write-ahead log (WAL). MySQL has the binlog. MongoDB has the oplog. These logs already exist for replication and crash recovery - CDC just reads them and publishes the changes as a stream of events.

The flow looks like this:

  1. Application writes to PostgreSQL (the write side).
  2. CDC captures the change from the WAL and publishes it to Kafka.
  3. Stream processors consume from Kafka, transform the events, and write to read-side stores.
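For orientation, a change event captured from the WAL carries the row state before and after the change, plus metadata about where in the log it came from. Below is a simplified sketch of a Debezium-style envelope as a Python dict - real events also include a schema section and richer source metadata, and the exact shape depends on your connector and serialization format:

```python
# Simplified sketch of a Debezium-style CDC event for an UPDATE on `orders`.
# Field names follow Debezium conventions; values here are illustrative.
change_event = {
    "op": "u",                       # c = insert, u = update, d = delete
    "ts_ms": 1740480000000,          # when the connector processed the change
    "source": {"table": "orders", "lsn": 987654},  # position in the WAL
    "before": {"order_id": 42, "status": "pending",   "total": 99.90},
    "after":  {"order_id": 42, "status": "confirmed", "total": 99.90},
}

def is_status_change(event):
    """True when an update actually changed the `status` column."""
    before = event.get("before") or {}
    after = event.get("after") or {}
    return event["op"] == "u" and before.get("status") != after.get("status")
```

Having both `before` and `after` images is what makes downstream filtering and aggregation possible - a consumer can tell not just that a row changed, but how.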

Your application only talks to one database. It has no knowledge of Elasticsearch, Redis, or any other read store. The CDC pipeline handles propagation asynchronously, and because it reads from the transaction log, it captures every committed change - including those made by batch jobs, migrations, or direct SQL updates that bypass your application layer.

This is the pattern that Streamkap is built around. Streamkap handles the CDC capture from databases like PostgreSQL, MySQL, and MongoDB, streams changes through Kafka, and delivers them to downstream destinations - all without requiring your application to change how it writes data.

Stream Processing: Transforming Changes Into Read Models

Raw CDC events are rarely in the shape you need for a read model. A CDC event from your orders table contains the columns of that table - but your search index might need data joined from orders, customers, and products. Your analytics view might need aggregations. Your cache might need a flattened, denormalized document.

This is where stream processing with Apache Flink fits in. Flink sits between Kafka (where CDC events land) and your read-side stores, transforming the raw change stream into whatever shape each read model requires.

Joining Across Streams

One of the most common transformations is joining data from multiple tables. In a traditional database, you would write a JOIN query. In a streaming context, Flink maintains state for each stream and joins them as events arrive.

For example, say you have CDC streams for orders, customers, and order_items. Flink can join these three streams to produce a denormalized order document that includes customer name, email, and all line items - exactly what your Elasticsearch index needs for a search result.

-- Flink SQL example: joining orders with customer and line-item data
SELECT
  o.order_id,
  o.status,
  o.created_at,
  c.name AS customer_name,
  c.email AS customer_email,
  oi.product_id,
  oi.quantity,
  oi.price
FROM orders_cdc o
JOIN customers_cdc c ON o.customer_id = c.customer_id
JOIN order_items_cdc oi ON o.order_id = oi.order_id;

Filtering and Enrichment

Not every change needs to reach every read model. Flink lets you filter events before they hit the destination. Maybe your Redis cache only needs active orders, or your analytics store only cares about orders above a certain value. You can also enrich events with static reference data - adding product category names, region codes, or currency conversions.
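To make the shape of that step concrete, here is a plain-Python sketch of the filter-and-enrich logic a Flink job would apply per event - not Flink API code; the `CATEGORY_BY_PRODUCT` reference map and the event fields are hypothetical:

```python
# Hypothetical static reference data, e.g. loaded from a products table.
CATEGORY_BY_PRODUCT = {101: "books", 202: "electronics"}

def filter_and_enrich(events):
    """Keep only active orders and attach a category name to each one."""
    for event in events:
        row = event["after"]
        if row["status"] != "active":
            continue  # drop events this read model does not care about
        yield {**row, "category": CATEGORY_BY_PRODUCT.get(row["product_id"], "unknown")}

events = [
    {"after": {"order_id": 1, "status": "active",    "product_id": 101}},
    {"after": {"order_id": 2, "status": "cancelled", "product_id": 202}},
]
enriched = list(filter_and_enrich(events))
```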

Aggregation

Some read models need pre-computed aggregates. Instead of your dashboard running a COUNT(*) GROUP BY status query against millions of rows every time it loads, Flink can maintain running counts and push the aggregated result to a read store. The dashboard then reads a single row instead of scanning the full table.
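In spirit, the Flink job keeps keyed state and updates it on every change event. A minimal Python analogue of a running count by status - note that updates must retract the old row's contribution before adding the new one, which is exactly why CDC events carry a `before` image (event shapes here are illustrative):

```python
from collections import Counter

def apply_change(counts, event):
    """Maintain a running order count per status from a CDC change stream."""
    before, after = event.get("before"), event.get("after")
    if before:
        counts[before["status"]] -= 1   # retract the old row's contribution
    if after:
        counts[after["status"]] += 1    # add the new row's contribution
    return counts

counts = Counter()
apply_change(counts, {"before": None, "after": {"status": "pending"}})                    # insert
apply_change(counts, {"before": {"status": "pending"}, "after": {"status": "shipped"}})   # update
```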

Example Architecture: E-Commerce Order System

Let’s walk through a concrete example. You run an e-commerce platform with PostgreSQL as your primary database. Your write schema looks like standard normalized tables: orders, order_items, customers, products.

Your application needs to serve three very different read patterns:

1. Order search (Elasticsearch)
Customers and support agents search orders by customer name, product, date range, and status. This requires full-text search with filters - something PostgreSQL can do, but not as well or as fast as Elasticsearch at scale.

2. Order status lookup (Redis)
The order tracking page shows the current status of a single order. This is a high-frequency, low-latency read - a perfect fit for Redis, where you store a simple key-value mapping of order_id to status and last-updated timestamp.

3. Sales dashboard (ClickHouse or denormalized PostgreSQL)
Internal teams need real-time sales metrics: revenue by region, orders per hour, average order value by product category. This is an OLAP workload that benefits from a columnar store or pre-aggregated tables.

Here is how the architecture flows:

PostgreSQL (write side)
    |
    | CDC (WAL capture)
    v
  Kafka
    |
    ├── Flink Job 1: Join orders + customers + items → Elasticsearch
    |
    ├── Flink Job 2: Filter to status changes → Redis
    |
    └── Flink Job 3: Aggregate by region/hour → ClickHouse

Each Flink job is independent. If the Elasticsearch pipeline goes down, Redis and ClickHouse keep receiving updates. If you add a fourth read model next month - say, a recommendation engine - you add a new Flink job that consumes from the same Kafka topics. No changes to your application or write-side database.
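As a toy model of what Flink Job 2 maintains, here is the status projection as a plain dict keyed by order_id - in production the writes would go to Redis, and the field names are illustrative:

```python
def project_status(store, event):
    """Apply one CDC event to an order_id -> (status, updated_at) lookup table."""
    after = event.get("after")
    key = (event.get("before") or after)["order_id"]
    if after is None:
        store.pop(key, None)            # delete event: drop the key entirely
    else:
        store[key] = (after["status"], after["updated_at"])
    return store

store = {}
project_status(store, {"before": None,
                       "after": {"order_id": 7, "status": "pending", "updated_at": 1}})
project_status(store, {"before": {"order_id": 7},
                       "after": {"order_id": 7, "status": "shipped", "updated_at": 2}})
```

The projection is idempotent per key: the read model always holds the latest state for each order, no matter how many updates preceded it.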

With Streamkap, the CDC-to-Kafka leg of this pipeline is fully managed. You configure the PostgreSQL source, point it at your Kafka cluster (or use Streamkap’s managed Kafka), and changes start flowing. Streamkap also supports writing directly to destinations like Elasticsearch, ClickHouse, and Redis, so for simpler projections that do not need Flink transformations, you can skip the stream processing layer entirely.

Handling Eventual Consistency

The biggest trade-off with CQRS is that the read side is eventually consistent with the write side. After a user places an order, there is a window - typically milliseconds to low seconds with a well-tuned pipeline - where the order exists in PostgreSQL but has not yet appeared in Elasticsearch or Redis.

This is not a bug. It is a design choice you need to account for in your application.

Practical Strategies

Read-your-writes for the acting user. After a user creates an order, redirect them to a page that reads from the write-side database (or from the command response itself) rather than from the read model. Subsequent page loads can use the read model, since the pipeline will have caught up by then.

Optimistic UI updates. On the frontend, show the expected state immediately after the action completes. The user sees their new order in the list because the UI added it optimistically, not because it queried the read store.

Version or timestamp checks. Include a version number or timestamp in both the write response and the read model. If the client knows it just wrote version 5 but the read model returns version 4, it can retry after a short delay.
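A client-side sketch of the version-check strategy - the read-model interface and the retry budget here are illustrative, not a specific library's API:

```python
import time

def read_at_least(read_model, key, min_version, retries=3, delay=0.05):
    """Poll the read model until it has caught up to the version we just wrote."""
    for _ in range(retries):
        doc = read_model.get(key)
        if doc is not None and doc["version"] >= min_version:
            return doc
        time.sleep(delay)  # pipeline lag: back off briefly and retry
    raise TimeoutError(f"read model still behind version {min_version} for {key}")

# Simulated read model that has already caught up to version 5.
read_model = {"order:42": {"version": 5, "status": "confirmed"}}
doc = read_at_least(read_model, "order:42", min_version=5)
```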

Design for lag tolerance. Some read models do not need instant consistency at all. An analytics dashboard that refreshes every 10 seconds can tolerate a few seconds of lag without anyone noticing. Match your consistency requirements to actual user expectations, not theoretical ideals.

Monitoring Pipeline Lag

Whatever strategy you use, you need visibility into the lag between write and read. Track the offset lag on your Kafka consumer groups. Measure the time delta between when a change was committed to PostgreSQL and when it appeared in the read store. Set alerts for when lag exceeds your SLA. Streamkap provides built-in lag monitoring for its pipelines, so you can track propagation delay without building custom instrumentation.
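The core measurement is simple: compare the commit timestamp carried in the CDC event with the time the read store applied it, then alert on your SLA threshold. A sketch, with illustrative field names and thresholds:

```python
def propagation_lag_ms(event_commit_ts_ms, applied_at_ms):
    """End-to-end lag: committed in PostgreSQL -> visible in the read store."""
    return applied_at_ms - event_commit_ts_ms

def check_sla(lags_ms, sla_ms=5_000):
    """Flag a breach when the worst observed lag in the window exceeds the SLA."""
    worst = max(lags_ms)
    return {"worst_ms": worst, "breach": worst > sla_ms}

# Three recent events: two well within budget, one well over.
lags = [propagation_lag_ms(1_000, 1_120),
        propagation_lag_ms(2_000, 2_340),
        propagation_lag_ms(3_000, 9_200)]
report = check_sla(lags)
```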

When CQRS Is Worth the Complexity

CQRS is not a default architecture. It adds moving parts: a CDC pipeline, a message broker, stream processing jobs, multiple data stores. Each component needs monitoring, scaling, and operational attention. For a straightforward CRUD application with moderate traffic and simple query patterns, a single PostgreSQL database with proper indexes will outperform and outlast a distributed CQRS setup.

But certain conditions make the trade-off clearly worthwhile:

Divergent read and write patterns. If your write schema is third normal form but your read queries need six-table joins, you are fighting the data model on every read. Separate read projections eliminate this tension.

Multiple read model formats. You need the same data in Elasticsearch for search, Redis for caching, and ClickHouse for analytics. Maintaining these through application-level dual writes is a consistency nightmare. CDC and stream processing give you a single source of truth with multiple derived views.

High read-to-write ratios. If your system handles 100 reads for every write, scaling the read side independently makes economic sense. You can add Elasticsearch replicas or Redis nodes without touching the write-side database.

Independent scaling requirements. During a flash sale, your read traffic might spike 10x while writes increase 2x. With CQRS, you scale each side according to its own load profile.

Audit and replay needs. Because CDC captures a complete log of changes, you can replay the stream to rebuild read models, backfill new projections, or debug issues by examining the exact sequence of events.
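Replay works because a projection is a deterministic function of the ordered change log: applying the same events from offset zero always rebuilds the same read model. A minimal illustration of that property, with hypothetical event shapes:

```python
def rebuild(events):
    """Rebuild an order_id -> status projection by replaying the full change log."""
    store = {}
    for e in events:
        if e["after"] is None:
            store.pop(e["before"]["order_id"], None)  # delete removes the key
        else:
            store[e["after"]["order_id"]] = e["after"]["status"]
    return store

log = [
    {"before": None, "after": {"order_id": 1, "status": "pending"}},
    {"before": {"order_id": 1}, "after": {"order_id": 1, "status": "shipped"}},
    {"before": None, "after": {"order_id": 2, "status": "pending"}},
]
# Replaying the same log twice yields identical read models.
first, second = rebuild(log), rebuild(log)
```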

Implementation Checklist

If you are considering CQRS with CDC and stream processing, here is a practical sequence:

  1. Start with one read model. Do not build three projections on day one. Pick the highest-value read pattern - usually search or a performance-critical dashboard - and build the pipeline for that.

  2. Get CDC running first. Before thinking about Flink transformations, make sure raw CDC events are flowing reliably from your database to Kafka. Validate that you are capturing all change types (inserts, updates, deletes) and handling schema changes.

  3. Add transformations incrementally. Start with a simple pass-through pipeline that writes CDC events directly to the read store. Then layer in joins, filters, and aggregations as you validate each step.

  4. Monitor lag from the start. Do not wait until production to instrument your pipeline. Build lag monitoring and alerting into your initial deployment.

  5. Handle schema evolution. Your write-side schema will change. Make sure your CDC pipeline and Flink jobs handle new columns, dropped columns, and type changes without breaking. Streamkap supports automatic schema change management, which removes much of this operational burden.

  6. Plan for replay. Retain enough Kafka history (or use tiered storage) to rebuild read models from scratch when needed - whether for bug fixes, new projections, or disaster recovery.

Tying It All Together

CQRS with CDC and stream processing is not about chasing architectural elegance for its own sake. It is about matching your data infrastructure to the reality of how modern applications access data: writes are transactional and normalized, reads are diverse and denormalized, and the gap between them grows wider as systems scale.

CDC gives you a reliable, decoupled bridge between the two sides. Stream processing with Flink gives you the flexibility to shape change events into whatever form each read model needs. And the entire pattern scales horizontally - more read traffic means more read replicas and consumer instances, without adding pressure to your write-side database.

The tooling has matured to the point where you do not need to build and operate every layer yourself. Managed CDC platforms like Streamkap handle the capture and delivery of changes, letting you focus on the transformation and read-model logic that is specific to your application. Start with one projection, prove the value, and expand from there.