
Engineering

February 25, 2026

9 min read

CDC to Redis: Real-Time Cache Invalidation and Sync

Use Change Data Capture to keep Redis caches perfectly in sync with your database. Eliminate stale cache problems, reduce read load, and build real-time cache layers.

TL;DR:

  • CDC-based cache sync eliminates the stale cache problem by streaming database changes directly to Redis, replacing error-prone application-level cache invalidation.
  • Two patterns dominate: cache invalidation (delete the Redis key on change, let the next read repopulate) and cache write-through (write the new value to Redis on every change).
  • This pattern removes the need for TTL-based expiration hacks and dual-write code in your application layer.

There are only two hard things in computer science: cache invalidation and naming things. The joke has been around for decades, but cache invalidation remains a genuine source of production incidents, data inconsistency bugs, and late-night debugging sessions. The moment you put a Redis cache between your application and your database, you accept a tradeoff: faster reads in exchange for the risk that the cache and the database will disagree.

Most teams manage this tradeoff with application-level cache invalidation logic — writing code that updates or deletes Redis keys whenever the database changes. This approach is fragile, incomplete, and notoriously difficult to get right. Change Data Capture (CDC) offers a fundamentally better alternative: let the database itself tell Redis what changed, automatically and in real time, with no application code involved.

The Problem with Application-Level Cache Sync

The traditional approach to cache consistency looks straightforward on a whiteboard. Your application writes to the database, then immediately updates or deletes the corresponding Redis key. In practice, this dual-write pattern breaks in several ways.

Race Conditions

Two concurrent requests updating the same record can produce inconsistent results. Request A writes value 1 to the database, then Request B writes value 2 to the database. If Request B’s Redis update executes before Request A’s, the cache ends up holding value 1 while the database holds value 2. The cache is now stale, and it will stay stale until something else invalidates it.
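The failure mode is easy to reproduce with an in-memory sketch, using plain dicts to stand in for the database and Redis and applying the unlucky interleaving by hand:

```python
# Simulate the dual-write race. Both requests write the database in
# order, but their cache updates land in reverse.
db, cache = {}, {}

db["user:1"] = 1      # Request A: database write (value 1)
db["user:1"] = 2      # Request B: database write (value 2)
cache["user:1"] = 2   # Request B: cache update executes first
cache["user:1"] = 1   # Request A: cache update lands last

# The cache now disagrees with the database, and nothing will fix it.
assert db["user:1"] == 2 and cache["user:1"] == 1
```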

Missed Invalidations

Not every database change flows through your application code. Background jobs, data migrations, administrative SQL queries, and other microservices can all modify data without triggering cache invalidation. A single missed code path means stale data served to users.

Partial Failures

The database write succeeds, but the Redis command fails due to a network timeout. Now the cache holds old data. Do you retry? Wrap both in a transaction? Redis does not participate in database transactions, so there is no atomic way to keep both in sync through application logic alone.
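A minimal sketch of this failure mode, using a stub client that raises to stand in for a Redis network timeout (the stub class and key names are illustrative, not a real API):

```python
# Stub whose delete() always times out, standing in for a flaky
# network path to Redis.
class FlakyRedis:
    def delete(self, key):
        raise TimeoutError("redis timed out")

db = {"user:1": "old"}
cache = {"user:1": "old"}
flaky = FlakyRedis()

db["user:1"] = "new"          # database write commits
try:
    flaky.delete("user:1")    # cache invalidation fails
except TimeoutError:
    pass                      # nothing rolls back the database write

# The commit stands, but the cache still serves the old value.
assert db["user:1"] == "new" and cache["user:1"] == "old"
```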

Code Complexity

Every write path in your application needs cache invalidation logic. As the codebase grows, this logic scatters across services, endpoints, and background workers. Keeping it consistent becomes a maintenance burden that scales with the size of the team.

CDC as the Solution

Change Data Capture sidesteps all of these problems by reading changes directly from the database’s transaction log (the write-ahead log in PostgreSQL, the binlog in MySQL, the oplog in MongoDB). Every committed INSERT, UPDATE, and DELETE is captured as an event and delivered to downstream consumers — including Redis.

Because CDC reads from the transaction log, it captures every change regardless of where it originated. Application code, cron jobs, manual SQL, and third-party integrations all write to the same log. CDC sees all of them.

The key insight is that CDC decouples cache sync from application logic entirely. Your application writes to the database. A separate CDC pipeline reads those changes and applies them to Redis. The application never needs to know that Redis exists.

┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  Application │────>│   Database   │────>│  CDC Engine  │
│   (writes)   │     │ (PostgreSQL) │     │ (log reader) │
└──────────────┘     └──────────────┘     └──────┬───────┘
                                                 │
                                                 v
                                          ┌──────────────┐
                                          │    Redis     │
                                          │   (cache)    │
                                          └──────────────┘

This architecture ensures that every committed change is reflected in Redis, with no gaps, no dual-write races, and no invalidation code scattered across services, provided events for a given key are applied in commit order.

Two Patterns: Invalidation vs. Write-Through

When a CDC event arrives, you have two choices for how to apply it to Redis.

Cache Invalidation (Delete on Change)

The simpler pattern is to delete the Redis key whenever the corresponding database row changes. The next read request triggers a cache miss, fetches fresh data from the database, and repopulates the cache.

# CDC event handler - invalidation pattern
import redis

redis_client = redis.Redis(host="localhost", port=6379)

def handle_cdc_event(event):
    table = event['source']['table']
    key = event['key']  # primary key value

    # Delete the cached entry; the next read repopulates it
    redis_client.delete(f"{table}:{key}")

Advantages: Simple to implement, no risk of writing stale data to Redis, works well when reads are not extremely latency-sensitive.

Tradeoff: The first read after an invalidation incurs a cache miss and a database round trip.

Cache Write-Through (Update on Change)

The write-through pattern pushes the new value into Redis as part of the CDC event processing. The key is always warm, and reads never hit the database for recently changed data.

# CDC event handler - write-through pattern
import redis

redis_client = redis.Redis(host="localhost", port=6379)

def handle_cdc_event(event):
    table = event['source']['table']
    key = event['key']
    operation = event['op']  # 'c' (create), 'u' (update), 'd' (delete)

    if operation == 'd':
        redis_client.delete(f"{table}:{key}")
    else:
        value = event['after']  # the new row state as a flat dict
        redis_client.hset(f"{table}:{key}", mapping=value)

Advantages: Zero cache misses for recently changed data, consistently low read latency.

Tradeoff: Requires mapping database columns to Redis data structures, and every change generates a Redis write even if the key is rarely read.

Choose invalidation when your cache hit rate is moderate and simplicity matters. Choose write-through when low-latency reads are critical and you can afford the extra Redis write traffic.

Redis Data Structure Mapping

Redis offers several data structures, and choosing the right one depends on how your application queries the cached data.

Strings for Simple Values

Use SET and GET for single scalar values or serialized JSON blobs. This is the simplest mapping — one database row becomes one Redis string key.

SET user:1042 '{"name":"Alice","email":"alice@example.com","plan":"pro"}'

Hashes for Objects

Use HSET and HGETALL when your application reads individual fields from a cached object. Each database column maps to a hash field, and you can fetch specific fields without deserializing an entire JSON blob.

HSET user:1042 name "Alice" email "alice@example.com" plan "pro"
HGET user:1042 email  # returns "alice@example.com"

Sorted Sets for Rankings and Leaderboards

Use ZADD when the cached data represents a ranked list, such as a leaderboard or a list of products sorted by price. The CDC event handler maps a numeric column to the score.

ZADD product_prices 29.99 "product:501"
ZADD product_prices 49.99 "product:502"
ZRANGEBYSCORE product_prices 0 40  # returns products under $40

Lists and Sets

Use Redis Lists for ordered sequences (recent activity feeds) and Sets for membership checks (feature flags, user groups). The choice depends on query patterns, not the source data shape.
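For example, following the same conventions as the commands above (key names are illustrative):

```
LPUSH activity:1042 "logged_in" "viewed_dashboard"
LRANGE activity:1042 0 9        # ten most recent events
SADD beta_users 1042
SISMEMBER beta_users 1042       # returns 1
```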

Architecture: CDC to Kafka to Redis

The most common production architecture places Kafka (or a compatible streaming platform) between the CDC engine and Redis. This adds durability, replay capability, and the ability to fan out changes to multiple consumers.

┌────────────┐     ┌────────────┐     ┌────────────┐     ┌────────────┐
│  Database  │────>│ CDC Engine │────>│   Kafka    │────>│ Redis Sink │
│  (source)  │     │ (Debezium) │     │  (topic)   │     │ (consumer) │
└────────────┘     └────────────┘     └────────────┘     └────────────┘

Kafka provides a durable buffer so that if Redis is temporarily unavailable, events queue up and are applied once Redis recovers. It also allows other consumers (data warehouses, search indexes, analytics systems) to subscribe to the same change stream.

For simpler setups, a direct CDC-to-Redis pipeline without Kafka reduces operational overhead. Platforms like Streamkap provide managed CDC pipelines that handle the CDC capture, optional Kafka transport, and Redis sink connector as a single managed service — eliminating the need to operate Debezium, Kafka Connect, and custom consumers yourself.

Key Design: Mapping Database Keys to Redis Keys

A consistent Redis key naming convention is essential. The standard pattern is {table}:{primary_key}, which keeps keys predictable and debuggable.

# Single-column primary key
user:1042
order:88291
product:501

# Composite primary key
order_item:88291:3    # order_id:line_number
user_role:1042:admin  # user_id:role

For the write-through pattern, your CDC pipeline needs a key mapping configuration that specifies how to derive the Redis key from the CDC event. Most CDC-to-Redis connectors support a template syntax:

{
  "redis.key.format": "${source.table}:${key.id}",
  "redis.data.type": "HASH"
}

When using composite keys, concatenate the key fields with a delimiter. Keep the delimiter consistent across all tables to simplify monitoring and debugging.
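A small helper keeps the convention and the delimiter in one place (the function name is illustrative):

```python
def redis_key(table: str, *key_fields) -> str:
    """Build a Redis key as {table}:{field1}:{field2}... with a
    single, consistent ':' delimiter."""
    return ":".join([table, *[str(f) for f in key_fields]])

# Single-column and composite primary keys use the same helper
assert redis_key("user", 1042) == "user:1042"
assert redis_key("order_item", 88291, 3) == "order_item:88291:3"
```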

Practical Example: User Profile Cache from PostgreSQL

Consider a common scenario: your PostgreSQL users table backs a user profile cache in Redis. The application reads profiles on every page load, and stale profiles cause visible UI bugs.

Step 1: Source table

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255),
    email VARCHAR(255),
    plan VARCHAR(50),
    updated_at TIMESTAMP DEFAULT NOW()
);

Step 2: CDC captures a change

When a user upgrades their plan, PostgreSQL writes to the WAL. The CDC engine captures the event:

{
  "op": "u",
  "before": { "id": 1042, "name": "Alice", "email": "alice@example.com", "plan": "free" },
  "after":  { "id": 1042, "name": "Alice", "email": "alice@example.com", "plan": "pro" },
  "source": { "table": "users", "db": "app_production" }
}

Step 3: Redis key is updated

The Redis sink connector processes the event and executes:

HSET user:1042 name "Alice" email "alice@example.com" plan "pro"

The cache now reflects the database state. No application code touched Redis. No dual-write logic. No race condition.

With Streamkap, this entire pipeline — from PostgreSQL WAL reading to Redis HSET — is configured through a web interface. You select PostgreSQL as the source, Redis as the destination, configure your key mapping, and the platform handles the rest.

TTL Strategy: When to Still Use TTL Alongside CDC

CDC provides strong consistency guarantees, but a TTL safety net is still good practice. There are edge cases where a TTL provides an additional layer of protection:

  • Pipeline outages: If the CDC pipeline goes down for an extended period, cached data ages without being refreshed. A TTL ensures stale keys eventually expire rather than persisting indefinitely.
  • Memory management: TTL prevents Redis memory from growing without bound if the dataset changes shape over time (for example, rows deleted from the database whose delete events are missed or delayed).
  • Schema changes: When the database schema changes, cached entries in the old format should eventually expire.

Set TTLs generously — hours or days rather than minutes. The CDC pipeline handles real-time consistency; the TTL is a safety net, not the primary invalidation mechanism.

# Write-through with TTL safety net
def handle_cdc_event(event):
    key = f"user:{event['after']['id']}"
    # Pipeline the HSET and EXPIRE so both are sent in one round trip
    pipe = redis_client.pipeline()
    pipe.hset(key, mapping=event['after'])
    pipe.expire(key, 86400)  # 24-hour safety TTL
    pipe.execute()

Monitoring: Cache Hit Rates, Sync Lag, and Redis Memory

A CDC-to-Redis pipeline requires monitoring at three levels.

Cache Hit Rate

Track Redis INFO stats for keyspace_hits and keyspace_misses. A healthy write-through cache should show hit rates above 95%. If hit rates drop, investigate whether the CDC pipeline is lagging or whether your key mapping is incorrect.

redis-cli INFO stats | grep keyspace
# keyspace_hits:14523890
# keyspace_misses:72341
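Hit rate is hits / (hits + misses). Using the numbers above:

```python
keyspace_hits = 14_523_890
keyspace_misses = 72_341

hit_rate = keyspace_hits / (keyspace_hits + keyspace_misses)
print(f"{hit_rate:.2%}")  # about 99.5%, comfortably above the 95% target
```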

Sync Lag

Measure the time between a database commit and the corresponding Redis update. This is the end-to-end latency of your CDC pipeline. Monitor it as a metric and alert if it exceeds your SLA threshold (typically a few hundred milliseconds for cache use cases).
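One practical way to measure lag is to compare the event's source commit timestamp (Debezium events carry source.ts_ms in epoch milliseconds) against the clock when the Redis write is applied. A sketch, with the SLA threshold as an illustrative value:

```python
import time

SLA_MS = 500  # illustrative alerting threshold

def sync_lag_ms(event, applied_at_ms=None):
    """Milliseconds between the source commit and the Redis apply."""
    if applied_at_ms is None:
        applied_at_ms = int(time.time() * 1000)
    return applied_at_ms - event["source"]["ts_ms"]

lag = sync_lag_ms({"source": {"ts_ms": 1_700_000_000_000}},
                  applied_at_ms=1_700_000_000_250)
assert lag == 250 and lag < SLA_MS
```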

Streamkap provides built-in pipeline lag monitoring, showing the delay between source commit and destination write. This metric is critical for understanding whether your cache is serving near-real-time data or falling behind.

Redis Memory

Track used_memory and maxmemory in Redis. A write-through CDC pipeline can increase Redis memory usage since keys are always populated. Set maxmemory-policy to allkeys-lru or volatile-lru so that Redis evicts the least recently used keys when memory pressure builds, rather than rejecting writes.
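In redis.conf (or set at runtime with CONFIG SET), this looks like:

```
# Cap memory and evict least-recently-used keys under pressure
maxmemory 4gb
maxmemory-policy allkeys-lru
```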

redis-cli INFO memory | grep used_memory_human
# used_memory_human:2.34G

Combine these three dimensions — hit rate, lag, and memory — into a single dashboard. When any metric deviates from its baseline, you have a clear signal to investigate.

Key Takeaways

CDC-based cache sync replaces scattered, error-prone application-level invalidation logic with a single, reliable pipeline that reads from the database transaction log and writes to Redis. It captures every change regardless of origin, eliminates race conditions and dual-write failures, and removes cache management code from your application entirely.

The two dominant patterns — invalidation and write-through — give you flexibility to optimize for simplicity or latency depending on your use case. Combined with a TTL safety net and proper monitoring, CDC to Redis provides a production-grade caching layer that stays consistent without constant engineering attention.