APACHE ICEBERG

Real-time CDC to Open Lakehouse

Stream database changes directly to Apache Iceberg tables. Iceberg is an open table format with ACID transactions, time travel, and support in Spark, Trino, and other major query engines.

Open Table Format, Zero Lock-in

Write once, query from anywhere with full transactional guarantees

Open Table Format

No vendor lock-in. Query Iceberg tables from Spark, Trino, Presto, Dremio, Athena, and more.

ACID Transactions

Transactional guarantees for concurrent reads and writes. Each commit is atomic, so readers never see a partially written table.
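To make the concurrency guarantee concrete, here is a minimal sketch of optimistic concurrency, the general technique Iceberg relies on for commits: a writer prepares a new table version against the version it read, and the commit succeeds only if nobody else committed in between. The names ToyTable, CommitConflict, and append_with_retry are hypothetical and exist only for this illustration; this is not the Iceberg or Streamkap API.

```python
import threading


class CommitConflict(Exception):
    """Raised when the table changed after the writer read it."""


class ToyTable:
    """Hypothetical in-memory stand-in for a table with versioned commits."""

    def __init__(self):
        self._lock = threading.Lock()
        self.version = 0
        self.rows = ()

    def read(self):
        # Readers always get a complete, committed (version, rows) pair.
        with self._lock:
            return self.version, self.rows

    def commit(self, base_version, new_rows):
        # The swap is all-or-nothing: it only happens if the writer's
        # base version is still the current one.
        with self._lock:
            if base_version != self.version:
                raise CommitConflict("table changed since it was read")
            self.rows = tuple(new_rows)
            self.version += 1
            return self.version


def append_with_retry(table, row, max_attempts=5):
    """Re-read and retry when another writer wins the race."""
    for _ in range(max_attempts):
        version, rows = table.read()
        try:
            return table.commit(version, rows + (row,))
        except CommitConflict:
            continue
    raise CommitConflict("gave up after repeated conflicts")
```

A writer holding a stale version gets a CommitConflict instead of silently overwriting newer data, which is the property that keeps concurrent writers from corrupting the table.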

Time Travel

Query historical snapshots. Roll back to any retained snapshot for debugging or compliance.
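As a rough illustration of how time travel resolves a query, the sketch below keeps an ordered log of snapshots and returns the latest one committed at or before a requested timestamp. SnapshotLog and its methods are hypothetical names for this toy model, and the timestamps are made-up millisecond values; a real engine reads this information from Iceberg table metadata.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class SnapshotLog:
    """Hypothetical snapshot history: parallel lists of commit times and states."""

    timestamps: list = field(default_factory=list)
    states: list = field(default_factory=list)

    def commit(self, ts_ms, rows):
        # Snapshots are appended in strictly increasing time order.
        if self.timestamps and ts_ms <= self.timestamps[-1]:
            raise ValueError("snapshots must be committed in increasing time order")
        self.timestamps.append(ts_ms)
        self.states.append(tuple(rows))

    def as_of(self, ts_ms):
        # Find the last snapshot whose commit time is <= ts_ms.
        idx = bisect_right(self.timestamps, ts_ms)
        if idx == 0:
            raise LookupError("no snapshot at or before the requested time")
        return self.states[idx - 1]
```

Because older snapshots are kept rather than overwritten, reading "as of" an earlier time is a lookup, not a restore.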

Schema Evolution

Add, rename, or drop columns without rewriting existing data files. Schema changes flow through automatically.
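Schema changes can avoid rewrites because Iceberg tracks columns by a stable field ID rather than by name. The sketch below is a simplified model of that idea: data files store values keyed by field ID, and the current schema maps IDs to names at read time. The read_rows function and the dictionary layout are assumptions made for illustration, not Iceberg's actual file or metadata format.

```python
def read_rows(schema, data_file):
    """Project stored rows through the current schema.

    schema:    dict mapping field ID -> current column name
    data_file: list of dicts mapping field ID -> stored value
    """
    # Columns added after the file was written read as None;
    # field IDs missing from the schema (dropped columns) are ignored.
    return [
        {name: row.get(field_id) for field_id, name in schema.items()}
        for row in data_file
    ]


# A file written under the original schema {1: "id", 2: "name"}.
old_file = [{1: 7, 2: "ann"}]

# Rename field 2 and add field 3: only the schema changes, not the file.
schema_v2 = {1: "id", 2: "full_name", 3: "email"}

# Drop field 2: the stored value stays in the file but is no longer visible.
schema_v3 = {1: "id", 3: "email"}
```

Reading old_file with schema_v2 yields the renamed column with its original value and a null for the new column, with no data rewritten.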

Hidden Partitioning

Automatic partition pruning. Iceberg derives partition values from column data, so queries filter on regular columns instead of referencing partitions.
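The sketch below illustrates the idea behind hidden partitioning with a day-based transform: the partition value is computed from a timestamp column (days since the Unix epoch), and a time-range filter is translated into a range of partition values used to skip files. The function names, file names, and the (path, day) file listing are hypothetical; real engines do this planning from Iceberg metadata.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def day_transform(ts):
    """Map a timezone-aware timestamp to whole days since 1970-01-01 UTC."""
    return (ts - EPOCH).days


def prune_files(files, start, end):
    """Keep only files whose day partition overlaps [start, end].

    files: list of (path, day_partition) pairs (assumed layout).
    The caller filters on timestamps; the partition value is never
    mentioned in the query itself.
    """
    lo, hi = day_transform(start), day_transform(end)
    return [path for path, day in files if lo <= day <= hi]


files = [
    ("events-a.parquet", day_transform(datetime(2024, 3, 1, 9, 30, tzinfo=timezone.utc))),
    ("events-b.parquet", day_transform(datetime(2024, 3, 2, 14, 0, tzinfo=timezone.utc))),
]
```

A query for 2 March only needs to open events-b.parquet, even though it was written as a plain timestamp filter.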

Any Cloud Storage

Write to Amazon S3, Google Cloud Storage, Azure Blob Storage, MinIO, or other S3-compatible storage.

Supported Catalogs

Connect to your existing metadata catalog

AWS Glue

Native Glue Data Catalog integration

Hive Metastore

Self-hosted Hive Metastore

REST Catalog

Tabular, Nessie, or custom REST

Snowflake

Polaris Catalog integration

What You Can Build

Open data lakehouses without vendor lock-in
Multi-engine analytics (Spark + Trino + Athena)
Historical data analysis with time travel queries
Cost-effective analytics on cloud object storage
ML training data with point-in-time consistency

How It Works

Streamkap captures changes from your source databases using CDC and streams them directly to Apache Iceberg tables on your cloud object storage. Query with any engine.

  1. Capture: CDC captures changes from PostgreSQL, MySQL, MongoDB, and more
  2. Transform: Apply SQL transforms, filtering, and masking in-flight
  3. Write: Data written to Iceberg tables with ACID guarantees
  4. Query: Use Spark, Trino, Athena, or any Iceberg-compatible engine

Source DB (PostgreSQL, MySQL, etc.) → Streamkap (CDC + Transform) → Iceberg (S3 / GCS / Azure)
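The capture, transform, and write steps can be sketched as a small loop over change events. This is a toy model under stated assumptions: the event shape (op, key, after) and the op codes "c", "u", and "d" are hypothetical and loosely modeled on common CDC conventions, the mask_email transform is an invented example of in-flight masking, and a plain dictionary stands in for the target table. It does not describe Streamkap's actual event format or implementation.

```python
import hashlib


def mask_email(row):
    """Example in-flight transform: replace the email with a short hash."""
    masked = dict(row)
    if masked.get("email") is not None:
        digest = hashlib.sha256(masked["email"].encode("utf-8")).hexdigest()
        masked["email"] = digest[:12]
    return masked


def apply_changes(table, events, transform=lambda row: row):
    """Apply change events to a table keyed by primary key.

    table:  dict mapping primary key -> row (stand-in for the target table)
    events: iterable of dicts with "op", "key", and (for c/u) "after"
    """
    for event in events:
        op, key = event["op"], event["key"]
        if op == "d":
            # Delete: remove the row if it exists.
            table.pop(key, None)
        elif op in ("c", "u"):
            # Create or update: upsert the transformed row image.
            table[key] = transform(event["after"])
        else:
            raise ValueError(f"unknown op {op!r}")
    return table


events = [
    {"op": "c", "key": 1, "after": {"id": 1, "email": "a@example.com"}},
    {"op": "u", "key": 1, "after": {"id": 1, "email": "b@example.com"}},
    {"op": "c", "key": 2, "after": {"id": 2, "email": None}},
    {"op": "d", "key": 2},
]
result = apply_changes({}, events, transform=mask_email)
```

After the loop, result holds only row 1, with the latest email stored in masked form, mirroring how inserts, updates, and deletes at the source converge on the current state of the destination table.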

Query with Any Engine

Iceberg's open format is supported by major query engines, including:

Apache Spark
Trino
Presto
Amazon Athena
Dremio
Snowflake
Databricks
StarRocks

Start streaming to Apache Iceberg today

Build an open lakehouse with real-time CDC. Free tier available.