PostgreSQL Logical Replication Internals: WAL, Slots, and Decoding
Deep dive into PostgreSQL logical replication internals covering WAL segments, logical decoding, replication slots, publications, and slot management.
PostgreSQL logical replication is the foundation of every CDC pipeline that reads from Postgres. Whether you are streaming changes to a data warehouse, feeding an event bus, or keeping a read replica in sync, the same internal machinery is at work: WAL decoding, replication slots, and the pgoutput protocol.
This article walks through how that machinery works at a systems level — the kind of detail you need when a replication slot is eating disk space at 3 AM or when you are trying to understand why a column change broke your downstream consumer.
The Write-Ahead Log: Where Everything Starts
Every data modification in PostgreSQL — INSERT, UPDATE, DELETE, and even schema changes — first goes into the Write-Ahead Log before it hits the heap (the actual table data files). The WAL is a sequence of fixed-size segment files (16 MB by default) stored in pg_wal/. Its primary job is crash recovery: if PostgreSQL crashes mid-transaction, it replays the WAL to restore consistency.
WAL records contain physical changes: which bytes changed at which page offset. This is great for crash recovery but useless for replication to external systems, because external consumers don’t understand PostgreSQL’s internal page format.
That is where logical decoding comes in.
Logical Decoding: From Bytes to Rows
Logical decoding is the process of translating physical WAL records back into logical row-level operations. PostgreSQL reads the raw WAL, reconstructs the transaction, and emits structured change events:
- BEGIN — transaction started
- INSERT into table X — new row with these column values
- UPDATE table X — old key columns + new column values
- DELETE from table X — old key columns
- COMMIT — transaction finished
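The stream semantics above can be sketched in a few lines of Python: a consumer buffers row changes between BEGIN and COMMIT and only hands a transaction downstream once the COMMIT arrives, because logical decoding never emits uncommitted changes. This is an illustrative model with invented event shapes, not the pgoutput wire protocol:

```python
# Illustrative model of a logical decoding consumer (NOT the real pgoutput
# protocol): row changes are buffered per transaction, emitted only on COMMIT.
from dataclasses import dataclass

@dataclass
class Change:
    kind: str    # 'INSERT' | 'UPDATE' | 'DELETE'
    table: str
    row: dict

def consume(events):
    """Group a flat event stream into committed transactions."""
    committed = []
    open_txns = {}                      # xid -> list of buffered changes
    for ev in events:
        xid = ev["xid"]
        if ev["type"] == "BEGIN":
            open_txns[xid] = []
        elif ev["type"] == "COMMIT":
            committed.append(open_txns.pop(xid))
        else:                           # row-level change
            open_txns[xid].append(Change(ev["type"], ev["table"], ev["row"]))
    return committed

events = [
    {"type": "BEGIN", "xid": 741},
    {"type": "INSERT", "xid": 741, "table": "orders", "row": {"id": 1}},
    {"type": "COMMIT", "xid": 741},
]
txns = consume(events)  # one committed transaction with one INSERT
```

The same buffering behavior is why a long-running transaction delays decoding: nothing in its buffer can be emitted until the COMMIT record shows up in the WAL.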
The decoding process needs two things:
- A replication slot — to track which WAL position has been consumed
- An output plugin — to format the decoded changes
Output Plugins
PostgreSQL ships with pgoutput (the native logical replication protocol) and test_decoding (a debugging tool). The community maintains others:
| Plugin | Format | Use Case |
|---|---|---|
| pgoutput | Binary protocol | Native logical replication, many CDC tools |
| test_decoding | Text | Debugging, learning |
| wal2json | JSON | CDC tools that want JSON |
| decoderbufs | Protobuf | High-throughput CDC pipelines |
For most production CDC setups, pgoutput is the right choice. It is maintained as part of PostgreSQL core, supports publication-based filtering, and does not require installing extensions.
Replication Slots: The Bookmark
A replication slot is PostgreSQL’s mechanism for tracking a consumer’s position in the WAL stream. Without a slot, PostgreSQL would recycle old WAL segments as soon as they are checkpointed — and your consumer would lose its place.
Creating a Slot
-- Create a logical replication slot using pgoutput
SELECT pg_create_logical_replication_slot('my_cdc_slot', 'pgoutput');
-- Or using wal2json
SELECT pg_create_logical_replication_slot('my_json_slot', 'wal2json');
What the Slot Tracks
Each slot stores:
- slot_name — unique identifier
- plugin — which output plugin to use
- confirmed_flush_lsn — the last WAL position the consumer acknowledged
- restart_lsn — the earliest WAL position the slot still needs (PostgreSQL keeps WAL from this point forward)
SELECT slot_name, plugin, confirmed_flush_lsn, restart_lsn, active
FROM pg_replication_slots;
Output:
slot_name | plugin | confirmed_flush_lsn | restart_lsn | active
--------------+----------+---------------------+-------------+--------
my_cdc_slot | pgoutput | 0/1A3E8D0 | 0/1A3E800 | t
old_slot | pgoutput | 0/0F21400 | 0/0F21300 | f
That old_slot with active = f is a problem waiting to happen.
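LSNs like 0/1A3E8D0 are 64-bit byte positions in the WAL, printed as two hex halves separated by a slash. pg_wal_lsn_diff is just subtraction on that value; here is a sketch of the same arithmetic in Python (the helper names are my own, not PostgreSQL APIs):

```python
def lsn_to_int(lsn: str) -> int:
    """Parse a PostgreSQL LSN ('hi/lo' in hex) into a 64-bit byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def lsn_diff(a: str, b: str) -> int:
    """Byte distance from b to a, mirroring pg_wal_lsn_diff(a, b)."""
    return lsn_to_int(a) - lsn_to_int(b)

# The sample slot above: confirmed_flush_lsn minus restart_lsn
lag = lsn_diff("0/1A3E8D0", "0/1A3E800")  # 0xD0 = 208 bytes
```

Being able to do this arithmetic outside the database is handy when you are correlating slot positions from monitoring dumps or CDC connector logs.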
The Disk Bloat Problem
This is the single most common operational issue with PostgreSQL logical replication, and it catches teams off guard regularly.
The rule is simple: PostgreSQL will not delete any WAL segment that a replication slot still needs.
If a slot’s consumer disconnects — or if you create a slot and forget about it — WAL files pile up on disk. On a database doing 100 MB/minute of WAL writes, an abandoned slot will consume 144 GB of disk in a single day.
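The arithmetic behind that number is worth keeping in your head when sizing disks. A quick sketch, using decimal GB and the example write rate from above:

```python
def retained_wal_gb(wal_mb_per_min: float, hours_abandoned: float) -> float:
    """WAL (decimal GB) an abandoned slot forces PostgreSQL to retain."""
    return wal_mb_per_min * 60 * hours_abandoned / 1000

# 100 MB/minute of WAL, slot abandoned for 24 hours -> 144 GB pinned on disk
print(retained_wal_gb(100, 24))
```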
How to Detect It
-- Check WAL retention per slot (bytes behind current position)
SELECT
slot_name,
active,
pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS bytes_behind,
pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS lag_size
FROM pg_replication_slots
ORDER BY bytes_behind DESC;
slot_name | active | bytes_behind | lag_size
--------------+--------+--------------+-----------
old_slot | f | 15032876544 | 14 GB
my_cdc_slot | t | 24576 | 24 kB
How to Fix It
-- Drop an unused slot (this lets PostgreSQL recycle the retained WAL)
SELECT pg_drop_replication_slot('old_slot');
-- Check total WAL directory size
SELECT pg_size_pretty(sum(size)) AS total_wal_size
FROM pg_ls_waldir();
Prevention
Set max_slot_wal_keep_size (PostgreSQL 13+) to cap how much WAL a single slot can retain:
-- In postgresql.conf or via ALTER SYSTEM
ALTER SYSTEM SET max_slot_wal_keep_size = '10GB';
SELECT pg_reload_conf();
When a slot exceeds this limit, PostgreSQL invalidates it. The consumer will need to re-snapshot, but your disk stays alive. This is almost always the right tradeoff for production systems.
Publications and Subscriptions
Publications define which tables participate in logical replication. Subscriptions connect a downstream PostgreSQL (or CDC consumer) to a publication via a replication slot.
Creating a Publication
-- Publish specific tables
CREATE PUBLICATION my_pub FOR TABLE orders, customers, products;
-- Publish all tables in a schema (PostgreSQL 15+)
CREATE PUBLICATION schema_pub FOR TABLES IN SCHEMA public;
-- Publish everything
CREATE PUBLICATION all_pub FOR ALL TABLES;
-- Publish only INSERT and UPDATE (skip deletes)
CREATE PUBLICATION inserts_updates FOR TABLE orders
WITH (publish = 'insert, update');
Row Filters (PostgreSQL 15+)
-- Only replicate orders above $100
CREATE PUBLICATION high_value FOR TABLE orders
WHERE (total_amount > 100.00);
Checking What a Publication Covers
SELECT pubname, schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'my_pub';
Creating a Subscription
On the subscriber (downstream) PostgreSQL:
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=primary.db port=5432 dbname=myapp user=replicator'
PUBLICATION my_pub;
This automatically creates a replication slot on the publisher. The initial sync copies existing data, then the subscription switches to streaming mode for ongoing changes.
Replica Identity: What Gets Sent on UPDATE and DELETE
When PostgreSQL decodes an UPDATE or DELETE, it needs to identify which row changed. The replica identity controls what columns are included in the old-row portion of the change event.
-- Check current replica identity for a table
SELECT relname, relreplident
FROM pg_class
WHERE relname = 'orders';
| relreplident | Meaning | Sent on UPDATE/DELETE |
|---|---|---|
| d (default) | Use primary key | Only PK columns in old row |
| n (nothing) | Send nothing | No old row data (breaks most CDC) |
| f (full) | Send all columns | Complete old + new row |
| i (index) | Use a specific index | Columns of that index in old row |
For CDC pipelines, FULL replica identity gives you the most flexibility — you can see exactly what changed:
ALTER TABLE orders REPLICA IDENTITY FULL;
The tradeoff is WAL size. With FULL, every UPDATE writes all column values to the WAL, even columns that did not change. For wide tables (50+ columns), this can double or triple WAL volume. If your downstream only needs to identify the row (not diff old vs. new values), sticking with DEFAULT (primary key) is more efficient.
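A rough back-of-the-envelope model of that tradeoff (column sizes and the omission of record headers, TOAST, and compression are illustrative assumptions, not measured figures):

```python
def update_wal_bytes(n_cols: int, avg_col_bytes: int, pk_cols: int,
                     identity: str) -> int:
    """Approximate row-image bytes one UPDATE adds to the WAL.

    DEFAULT sends only PK columns in the old row; FULL sends every column.
    Ignores record headers, TOAST, and compression -- illustrative only.
    """
    new_image = n_cols * avg_col_bytes
    if identity == "full":
        old_image = n_cols * avg_col_bytes     # complete old row
    else:                                      # 'default': primary key only
        old_image = pk_cols * avg_col_bytes
    return new_image + old_image

# 50-column table, ~20 bytes per column, single-column primary key
default_cost = update_wal_bytes(50, 20, 1, "default")  # 1020 bytes
full_cost = update_wal_bytes(50, 20, 1, "full")        # 2000 bytes
```

Under these assumptions FULL roughly doubles the per-UPDATE row-image volume for a wide table, which matches the "double or triple" figure above once real-world overheads are included.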
For a broader look at how PostgreSQL CDC fits into data pipelines, see our guide on PostgreSQL change data capture.
Monitoring Replication Lag
For Native Logical Replication
-- On the publisher: check subscriber lag
SELECT
application_name,
client_addr,
state,
sent_lsn,
write_lsn,
flush_lsn,
replay_lsn,
pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes,
pg_size_pretty(pg_wal_lsn_diff(sent_lsn, replay_lsn)) AS replay_lag_pretty
FROM pg_stat_replication;
For CDC Replication Slots
-- Compare slot position against current WAL
SELECT
slot_name,
active,
pg_current_wal_lsn() AS current_lsn,
confirmed_flush_lsn,
pg_size_pretty(
pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)
) AS slot_lag
FROM pg_replication_slots
WHERE slot_type = 'logical';
Key Metrics to Track
| Metric | Source | What It Tells You |
|---|---|---|
| Slot lag (bytes) | pg_replication_slots | How far behind the consumer is |
| WAL directory size | pg_ls_waldir() | Total disk used by WAL |
| Active slot count | pg_replication_slots | How many consumers are connected |
| Replication slot count | pg_replication_slots | Total slots (watch for orphans) |
| WAL generation rate | pg_stat_wal (PG 14+) | How fast WAL is being produced |
Set up alerts on slot lag. If a slot falls more than a few GB behind, investigate immediately. If you are running CDC pipelines at scale, monitoring lag is just as important as monitoring query performance. See data freshness monitoring for patterns on tracking pipeline latency end-to-end.
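A minimal alerting rule over the slot-lag query above might look like the following sketch (the threshold values are placeholders to tune for your workload, not recommendations):

```python
def slot_alert(active: bool, lag_bytes: int,
               warn_gb: float = 2, crit_gb: float = 10) -> str:
    """Classify one pg_replication_slots row for alerting purposes."""
    gb = lag_bytes / 1e9
    if not active and lag_bytes > 0:
        return "critical"            # orphaned slot retaining WAL
    if gb >= crit_gb:
        return "critical"
    if gb >= warn_gb:
        return "warning"
    return "ok"

# The two slots from the earlier sample output
assert slot_alert(active=False, lag_bytes=15_032_876_544) == "critical"
assert slot_alert(active=True, lag_bytes=24_576) == "ok"
```

The key design choice is treating any inactive slot with nonzero lag as critical regardless of size: an orphaned slot only ever gets worse.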
Configuration Checklist
Before enabling logical replication, verify these postgresql.conf settings:
# Required for logical decoding
wal_level = logical
# Max replication slots (default is 10, increase if needed)
max_replication_slots = 20
# Max WAL senders (one per slot + one per streaming replica)
max_wal_senders = 20
# Cap WAL retention per slot (PostgreSQL 13+)
max_slot_wal_keep_size = 10GB
Changing wal_level, max_replication_slots, or max_wal_senders requires a PostgreSQL restart; max_slot_wal_keep_size takes effect on a reload.
-- Check current wal_level (must be 'logical')
SHOW wal_level;
-- Reload config without restart
SELECT pg_reload_conf();
Common Pitfalls
1. Forgetting to set wal_level = logical
If wal_level is replica (the default), creating a logical replication slot fails with an error along the lines of "logical decoding requires wal_level >= logical". Changing it requires a restart, so plan for a maintenance window.
2. Running out of max_replication_slots
Each CDC consumer needs its own slot. If you hit the limit, new slot creation fails. Set this higher than you think you need.
3. Long-running transactions blocking decoding
Logical decoding cannot emit changes for a transaction until it commits. A transaction open for 6 hours means 6 hours of its changes buffered by the decoder — in memory up to logical_decoding_work_mem, then spilled to disk. Avoid long transactions on databases being replicated.
4. Tables without primary keys
With the default replica identity and no primary key, PostgreSQL rejects UPDATE and DELETE on a published table outright, with an error like "cannot update table because it does not have a replica identity and publishes updates". Find affected tables with:
SELECT c.relname
FROM pg_class c
LEFT JOIN pg_index i ON c.oid = i.indrelid AND i.indisprimary
WHERE c.relkind = 'r'
AND c.relnamespace = 'public'::regnamespace
AND i.indrelid IS NULL;
5. Schema changes during replication
Adding a column is usually safe. Dropping or renaming a column while a consumer expects it will break downstream processing. See CDC schema evolution at zero downtime for a practical playbook.
How Managed CDC Platforms Handle This
If you are running your own PostgreSQL-to-Kafka CDC pipeline, you own all of this: slot management, WAL monitoring, output plugin configuration, and replica identity setup. It works, but it is a meaningful operational surface area.
Managed CDC platforms like Streamkap handle slot lifecycle, WAL monitoring, and automatic recovery under the hood. Streamkap’s CDC engine connects to your PostgreSQL instance, manages replication slots, and streams changes to your destination — you configure the source and destination, and the platform handles the plumbing in between.
The fundamentals in this article still matter either way. Understanding WAL internals and slot behavior helps you debug issues, size your PostgreSQL instance correctly, and make informed decisions about replica identity and schema design.
Ready to stream PostgreSQL changes without managing replication slots yourself? Streamkap connects to your PostgreSQL WAL, manages slot lifecycle automatically, and delivers changes to 20+ destinations in real time. Start a free trial or explore available connectors.