
Architecture & Patterns

March 12, 2026

10 min read

Database Replication Patterns: Active-Active, CDC, and Beyond

A practical guide to database replication patterns — snapshot, active-passive, active-active, CDC-based, multi-region, and hybrid. When to use each and common pitfalls.

TL;DR: Choose your replication pattern based on your consistency requirements, latency budget, and failure tolerance. CDC-based replication gives the best balance of low latency and low source impact for most modern architectures.

Every production database has the same question lurking behind it: what happens when this single node is not enough? Maybe you need read scaling. Maybe you need disaster recovery. Maybe you need the same data in a warehouse, a cache, and a search index — all at the same time.

That is where replication patterns come in. But picking the wrong one costs months of rework and, worse, risks silent data loss during the transition. This guide walks through six replication patterns, when each one fits, and where each one breaks down.

Snapshot replication

Snapshot replication is the oldest and simplest approach. On a schedule — hourly, nightly, weekly — you take a full or incremental copy of the source database and load it into the target.

How it works. A process reads the current state of every table (or tables that changed since the last snapshot), serializes the data, and writes it to the destination. Tools range from pg_dump/pg_restore to managed ETL services that run on a cron.
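The incremental variant usually tracks a watermark, most often an `updated_at` column: each run extracts only rows modified since the last run's high-water mark. A minimal sketch of that selection logic (the row shape and `updated_at` field are illustrative assumptions, not a specific tool's API):

```python
def incremental_snapshot(rows, last_watermark):
    """Return rows modified since the previous snapshot, plus the new watermark.

    Assumes every row carries a monotonically increasing `updated_at` value
    (hypothetical schema for illustration).
    """
    changed = [r for r in rows if r["updated_at"] > last_watermark]
    new_watermark = max((r["updated_at"] for r in changed), default=last_watermark)
    return changed, new_watermark


rows = [
    {"id": 1, "updated_at": 100},
    {"id": 2, "updated_at": 205},
    {"id": 3, "updated_at": 310},
]
changed, watermark = incremental_snapshot(rows, last_watermark=200)
# changed contains rows 2 and 3; the watermark advances to 310
```

Note one structural weakness this sketch makes visible: a watermark query never sees hard deletes, which is one of the reasons teams eventually move to log-based approaches.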

Consistency guarantees. You get point-in-time consistency at the moment the snapshot was taken. Between snapshots, the target is stale. If a snapshot fails partway through, you either retry the whole thing or end up with a partially loaded dataset.

Latency. Measured in hours for most real deployments. Even “every 15 minutes” schedules rarely hold up once tables grow past a few hundred million rows.

Use cases. Reporting databases where freshness does not matter. Dev/staging environment seeding. Compliance archives where you need a dated copy.

Pitfalls. Snapshot replication puts heavy read load on the source during extraction. For large tables, that means locking or I/O contention during your extraction window. It also means your downstream consumers are always working with stale data — and the staleness window keeps growing as your data volume increases.

If you are currently running snapshots and feeling the pain of growing extraction times, the move to CDC-based replication is usually the first upgrade worth making.

Active-passive replication

Active-passive is the default replication pattern for most relational databases. PostgreSQL streaming replication, MySQL binary log replication, SQL Server Always On — they all follow this model.

How it works. One node (the primary) handles all writes. It streams its write-ahead log (WAL) or binary log to one or more replicas. The replicas apply those changes and serve read traffic.

Consistency guarantees. With synchronous replication, replicas are guaranteed to have every committed transaction before the primary acknowledges the commit. With asynchronous replication (more common in practice), replicas lag behind by milliseconds to seconds. You trade consistency for write throughput.

Latency. Replication lag is typically sub-second for async mode within the same region. Cross-region, expect 50-200ms depending on distance and network conditions. Synchronous mode adds the round-trip time to every write.

Use cases. Read scaling for web applications. High-availability failover. Offloading reporting queries from the primary.

Pitfalls. Failover is the hard part. Promoting a replica to primary sounds simple, but you need to handle:

  • Replication lag at failover time. If the replica was behind, you lose those transactions or need to reconcile them manually.
  • Application reconnection. Every connection pool, ORM, and service needs to discover the new primary.
  • Split-brain. If the old primary comes back online and some clients still point to it, you get divergent writes.
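Replication lag at failover time can be quantified by comparing WAL positions. PostgreSQL reports these as LSNs in `hi/lo` hexadecimal form (for example from `pg_current_wal_lsn()` on the primary and `pg_last_wal_replay_lsn()` on the replica); a sketch of turning two LSNs into a byte delta, under a simplified reading of the format:

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN like '16/B374D848' to an absolute byte offset."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lag_bytes(primary_lsn: str, replica_lsn: str) -> int:
    """Bytes of WAL the replica has yet to replay."""
    return lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn)


# Comparing the primary's current write position with the replica's
# replay position:
print(lag_bytes("0/3000060", "0/3000000"))  # 96 bytes behind
```

A failover controller that refuses to promote a replica beyond some lag threshold is a cheap guard against silently losing the transactions described above.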

The other limitation: active-passive only replicates to the same database engine. Your PostgreSQL replica is another PostgreSQL instance. If you need data in Snowflake, Elasticsearch, or Redis, you need a different pattern.

Active-active replication

Active-active replication allows writes to multiple nodes simultaneously. Each node replicates its changes to the others. This is the pattern behind CockroachDB, YugabyteDB, and custom multi-master setups with MySQL Group Replication or PostgreSQL BDR.

How it works. Every node accepts reads and writes. Changes propagate between nodes either eagerly (consensus-based, like Raft or Paxos) or lazily (asynchronous with conflict resolution after the fact).

Consistency guarantees. Consensus-based systems give you strong consistency — linearizable reads and writes — but at the cost of latency on every write (you wait for a quorum). Asynchronous active-active gives you eventual consistency, meaning two nodes can temporarily disagree about the state of a row.

Latency. For consensus-based systems, write latency equals the round-trip to the slowest quorum member. Within a single region, that is 1-5ms. Across regions, 100-300ms per write — and that is the floor.
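That floor falls out of simple arithmetic: with a quorum of q voting members, commit latency is roughly the q-th fastest acknowledgement the leader receives. A toy model (numbers are illustrative; real systems add processing and disk time on top):

```python
def quorum_commit_latency(rtt_ms, quorum):
    """Commit latency = round-trip to the quorum-th fastest voting member.

    rtt_ms: round-trip times from the leader to each member, including
    itself (~0 ms). A toy model that ignores processing time.
    """
    return sorted(rtt_ms)[quorum - 1]


# A 5-node cluster spread across regions, leader plus one local replica
# and three remote replicas:
rtts = [0, 2, 70, 140, 280]  # ms
print(quorum_commit_latency(rtts, quorum=3))  # 70 ms: a remote region sits in every write path
```

The placement lesson: put enough voters close to the leader that the quorum can be satisfied without crossing an ocean, or accept the cross-region round-trip on every commit.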

For async active-active, local write latency is fast, but conflicts appear later and need resolution.

Use cases. Applications that need write availability in multiple regions. Systems where a single primary is a bottleneck or a single point of failure you cannot tolerate.

Pitfalls. Conflict resolution is the monster under the bed. When two nodes update the same row at the same time, you need a strategy:

  • Last-writer-wins (LWW): Simple, but silently drops one write. Fine for counters, terrible for financial records.
  • Application-level resolution: You write custom merge logic. Works, but adds complexity to every write path.
  • CRDTs: Conflict-free replicated data types that merge automatically. Limited to specific data structures (counters, sets, registers).
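Of the three strategies, CRDTs are the easiest to see in code. A minimal grow-only counter (G-Counter) sketch: each node increments only its own slot, and merging takes the per-node maximum, so concurrent increments on two nodes never conflict and merges can be repeated safely:

```python
class GCounter:
    """Grow-only counter CRDT: per-node counts, merged by element-wise max."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}

    def increment(self, n=1):
        # A node only ever advances its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def merge(self, other):
        # Max per node makes merge commutative, associative, and idempotent.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self):
        return sum(self.counts.values())


# Two nodes accept writes concurrently, then exchange state:
a, b = GCounter("node-a"), GCounter("node-b")
a.increment(3)
b.increment(2)
a.merge(b)
b.merge(a)
print(a.value(), b.value())  # both converge to 5
```

The catch, as noted above, is that this only works for data shaped like counters, sets, and registers; a bank balance with withdrawal limits does not merge this cleanly.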

Most teams that attempt DIY active-active on top of PostgreSQL or MySQL end up with a conflict resolution strategy that is “we hope conflicts don’t happen” — until they do, and it takes a week to untangle the data.

If you do not strictly need multi-region writes, active-passive with fast failover is almost always the better choice. It is boring in the right way.

CDC-based replication

Change Data Capture (CDC) reads the database’s transaction log and streams every insert, update, and delete to downstream targets in near-real-time. Unlike active-passive replication, CDC sends data to different systems — warehouses, caches, search indexes, event buses.

How it works. A CDC connector (Debezium, a managed CDC service like Streamkap, or a database-native feature) reads the WAL/binlog/oplog from the source database. It converts each change into an event and publishes it to a target — Kafka, a data warehouse, an API endpoint, or directly to a destination database.

The key distinction: CDC reads the log, not the tables. That means it imposes almost no additional load on the source database. No full table scans, no SELECT queries running during peak traffic.

Consistency guarantees. CDC preserves transaction ordering from the source log. If transaction A committed before transaction B, the CDC stream delivers A first. With exactly-once delivery (which platforms like Streamkap support), you get a faithful replica of every change.
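In Debezium's envelope format, for example, each event carries an `op` code ('c' create, 'u' update, 'd' delete, 'r' snapshot read) plus `before`/`after` row images. A sketch of replaying an ordered stream into a key-value target (keying on a single `id` column is a simplification for illustration):

```python
def apply_change_event(target, event):
    """Apply one Debezium-style change event to a dict keyed by primary key."""
    if event["op"] in ("c", "u", "r"):   # create, update, snapshot read
        row = event["after"]
        target[row["id"]] = row
    elif event["op"] == "d":             # delete: only 'before' is populated
        target.pop(event["before"]["id"], None)
    return target


events = [
    {"op": "c", "before": None, "after": {"id": 1, "name": "alice"}},
    {"op": "u", "before": {"id": 1, "name": "alice"}, "after": {"id": 1, "name": "alicia"}},
    {"op": "d", "before": {"id": 1, "name": "alicia"}, "after": None},
]
table = {}
for e in events:
    apply_change_event(table, e)
print(table)  # {} — the row was created, renamed, then deleted
```

Ordering is what makes this safe: replaying the same three events in a different order would leave the target in a state the source never had.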

Latency. Sub-second from commit on the source to delivery at the target. In practice, most CDC pipelines deliver changes in 200-500ms.

Use cases. This is the pattern with the broadest applicability:

  • Replicating PostgreSQL to Snowflake for analytics
  • Keeping a Redis cache in sync with your primary database
  • Feeding Elasticsearch for full-text search
  • Populating event-driven microservices with domain events
  • Building real-time dashboards from operational data

For a deeper look at the tools that support this pattern, see our guide to database replication tools.

Pitfalls. CDC is not zero-configuration:

  • Schema changes. When you ALTER TABLE on the source, the CDC pipeline needs to handle the new schema. Some tools break. Good ones (Streamkap included) handle schema evolution automatically.
  • Initial load. CDC captures changes going forward. For existing data, you need an initial snapshot — then switch to streaming mode. Getting this transition right without duplicating or dropping rows is tricky.
  • Log retention. If your CDC consumer falls behind and the database rotates its WAL/binlog, you lose events. You need to size log retention appropriately or have a way to re-snapshot.
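The initial-load handoff in particular comes down to position bookkeeping: record the log position at which the consistent snapshot was taken, then discard streamed events at or below that position, since the snapshot already reflects them. A sketch (integer positions and the event shape are illustrative simplifications):

```python
def snapshot_to_stream_handoff(snapshot_position, stream_events):
    """Drop streamed changes already reflected in the initial snapshot.

    snapshot_position: the log position (e.g. an LSN as an integer) captured
    when the consistent snapshot was taken. Assumes each streamed event
    carries its own position.
    """
    return [e for e in stream_events if e["position"] > snapshot_position]


events = [
    {"position": 98, "op": "u"},   # already in the snapshot: skip
    {"position": 100, "op": "u"},  # at the boundary: also in the snapshot
    {"position": 101, "op": "c"},  # new: replay
]
print(snapshot_to_stream_handoff(100, events))
# only the position-101 event survives
```

Getting the boundary comparison wrong in either direction is exactly how pipelines end up duplicating or dropping rows during the cutover.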

Despite these, CDC has become the default replication pattern for any architecture that moves data between heterogeneous systems. The source impact is minimal, the latency is low, and the delivery guarantees are strong.

Multi-region replication

Multi-region replication is not a single pattern — it is any of the above patterns deployed across geographic regions. But the distance changes everything.

How it works. You pick a base pattern (active-passive or active-active) and deploy nodes across regions. Traffic is routed to the nearest region. Changes propagate between regions over the WAN.

Consistency guarantees. This is where the CAP theorem stops being academic. With synchronous cross-region replication, every write waits for confirmation from a remote region — adding 100-300ms to every transaction. With asynchronous replication, you get fast local writes but risk data loss if a region goes down before its changes propagate.

Most production deployments land on a middle ground: synchronous replication within a region (for durability) and asynchronous replication across regions (for performance), accepting that a regional failure may lose the last few seconds of writes.

Latency. Local reads and writes are fast. Cross-region consistency checks are slow. The speed of light is the constraint — San Francisco to Frankfurt is ~150ms round-trip, and no amount of engineering changes that.

Use cases. Global applications with users on multiple continents. Regulatory requirements that mandate data residency (EU data stays in EU, but needs to be queryable from US for global reports). Disaster recovery where a full regional outage should not cause downtime.

Pitfalls.

  • Conflict resolution at scale. If you chose active-active across regions, every pitfall from the active-active section applies, but now with higher latency making conflicts more likely.
  • Cost. Cross-region data transfer is expensive on every cloud provider. Replicating terabytes across the Atlantic adds up fast.
  • Operational complexity. Monitoring, alerting, failover testing, latency budgets — everything doubles or triples when you go multi-region.

The practical advice: start with a single-region active-passive setup with CDC streaming to your analytics and downstream systems. Go multi-region only when you have a clear business requirement that justifies the operational cost.

Hybrid replication

Most production architectures end up using multiple patterns together. That is not a design failure — it is the reality of running systems with different consistency and latency requirements.

A common hybrid setup looks like this:

  1. Active-passive replication between your primary PostgreSQL and two read replicas in the same region — for read scaling and fast failover.
  2. CDC-based replication from the primary to Snowflake — for analytics with sub-second latency.
  3. CDC-based replication from the primary to Redis — for cache invalidation.
  4. Snapshot replication nightly to a cold storage bucket — for compliance and long-term archival.

Each layer serves a different consumer with different freshness and consistency needs. The analytics team does not need the same latency as the cache layer. The compliance archive does not need real-time delivery at all.

The coordination challenge. When you run multiple replication patterns from the same source, you need to make sure they do not compete for resources. Two CDC connectors and a nightly snapshot all reading from the same PostgreSQL instance can cause WAL retention issues or replication slot buildup. A managed CDC platform handles this coordination for you — one connection to the source, multiple destinations fanned out downstream.
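The slot-buildup problem is worth monitoring explicitly: each logical replication slot pins WAL until its consumer confirms it, so retained WAL grows with the slowest consumer. A sketch of flagging slots over a retention budget (in PostgreSQL the positions would come from `pg_replication_slots.restart_lsn` and `pg_current_wal_lsn()`; the integer byte-offset form and field names here are simplifications):

```python
def lagging_slots(slots, current_wal_pos, max_retained_bytes):
    """Flag replication slots pinning more WAL than the budget allows.

    slots: list of {"name", "restart_pos"} with positions as integer byte
    offsets (illustrative shape, not the actual catalog columns).
    """
    return [
        s["name"]
        for s in slots
        if current_wal_pos - s["restart_pos"] > max_retained_bytes
    ]


slots = [
    {"name": "snowflake_cdc", "restart_pos": 9_000_000},
    {"name": "redis_cdc", "restart_pos": 4_000_000},  # ~6 MB of WAL retained
]
print(lagging_slots(slots, current_wal_pos=10_000_000, max_retained_bytes=5_000_000))
# ['redis_cdc']
```

Alerting on this before the disk fills, or before WAL rotation outruns a consumer, is much cheaper than re-snapshotting a multi-terabyte source.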

Pitfalls.

  • Monitoring fragmentation. Each replication path needs its own lag monitoring, error alerting, and health checks. Without centralization, you end up with blind spots.
  • Schema drift. A schema change on the source affects every downstream consumer. If your CDC pipeline handles it but your snapshot job does not, you get inconsistency across targets.
  • Debugging cross-system issues. When a downstream consumer has wrong data, you need to trace the problem through the specific replication path that feeds it. With four different paths, that is four different places to investigate.

Choosing the right pattern

Here is a decision framework that works for most teams:

Start with active-passive if you just need read replicas and failover for the same database engine. Every managed database service supports this out of the box.

Add CDC when you need data in a different system — a warehouse, a cache, a search index, an event bus. CDC gives you low latency with minimal source impact. If you are evaluating database replication tools, the CDC-capable ones should be at the top of your list.

Consider active-active only when you have a hard requirement for multi-region writes. The conflict resolution complexity is real, and most applications do not actually need it — they need fast reads in multiple regions, which active-passive with regional read replicas handles fine.

Use snapshots for cold storage, compliance, and environments where staleness is acceptable and operational simplicity matters more than freshness.

Go multi-region when your users or regulations demand it, not because it sounds impressive in an architecture doc.

The pattern you start with does not have to be the pattern you end with. Most teams grow from snapshots to active-passive to CDC as their needs evolve. The important thing is picking the pattern that matches your current requirements — and building on a foundation that does not lock you out of the next one.


Building a replication architecture? Streamkap handles CDC-based replication from 60+ databases to any destination — with exactly-once delivery and automatic schema evolution. Start a free trial or explore CDC capabilities.