
What’s New in Streamkap: 🧊 Apache Iceberg Connector

Product Marketing Manager at Streamkap

August 15, 2025

Streamkap now supports writing data to Apache Iceberg, the open table format for data lakes. This new connector enables real-time data streams to land directly in your data lake storage as Iceberg tables.

Why it matters

  • Fresh analytics — keep Iceberg tables in sync with source DB changes in near real time.
  • Low-latency pipelines — stream changes continuously instead of batch loading.
  • 💸 Cost-efficient & open — store data on object storage with open formats (Parquet + Iceberg).
  • 🧠 ACID + evolution — leverage Iceberg’s ACID transactions, schema and partition evolution, and time travel (a short evolution sketch follows this list).
  • 🌐 Query anywhere — use Spark, Trino/Starburst, Flink, and more — all reading the same Iceberg tables via your catalog.
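
To make the evolution bullet concrete, here is a minimal sketch of in-place schema and partition changes from Spark SQL. The names (my_catalog, streamkap_db.orders, created_at) are placeholders of ours, and the sketch assumes a Spark session with the Iceberg runtime and SQL extensions enabled:

    # Sketch only: Iceberg schema and partition evolution from Spark SQL.
    # "my_catalog" and "streamkap_db.orders" are placeholder names; requires
    # the Iceberg Spark runtime and SQL extensions on the session.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("iceberg-evolution").getOrCreate()

    # Schema evolution: adding a column is a metadata-only change; no
    # existing data files are rewritten.
    spark.sql("ALTER TABLE my_catalog.streamkap_db.orders "
              "ADD COLUMNS (discount double)")

    # Partition evolution: new writes use the new layout, while old files
    # keep their original partitioning and stay queryable.
    spark.sql("ALTER TABLE my_catalog.streamkap_db.orders "
              "ADD PARTITION FIELD days(created_at)")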

✔️ How to enable it

  1. In Streamkap, create (or edit) a Destination and select Apache Iceberg.
  2. Provide your storage (S3 bucket) and catalog (e.g., AWS Glue, Hive, or REST).
  3. Choose the database/namespace and table mapping.
  4. Save and start the pipeline — changes begin streaming into your Iceberg tables (a quick verification sketch follows the links below).

  • Docs: Apache Iceberg connector setup
  • Add connector: Add the Apache Iceberg connector to your pipeline
  • Tutorial: How to Stream AWS MySQL Data to Iceberg on AWS with Streamkap
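
Once the pipeline is running, you can confirm tables are landing from any Iceberg client. Here is a quick sketch using PyIceberg (pip install "pyiceberg[glue]"); the catalog, namespace, and table names are placeholders for your own setup:

    # Sketch only: verify streamed tables with PyIceberg. "my_glue_catalog",
    # "streamkap_db", and "orders" are placeholder names.
    from pyiceberg.catalog import load_catalog

    # Connect to the same AWS Glue catalog the Streamkap destination writes
    # to, using your default AWS credentials.
    catalog = load_catalog("my_glue_catalog", **{"type": "glue"})

    # List the tables Streamkap has created in the target namespace.
    print(catalog.list_tables("streamkap_db"))

    # Load one table and read back a few freshly committed rows.
    table = catalog.load_table("streamkap_db.orders")
    print(table.scan(limit=5).to_arrow())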

✔️ How it works

  • Streamkap captures change events (CDC) from your sources and writes them as immutable files to Iceberg-managed tables.
  • The connector updates Iceberg table metadata atomically, providing consistent snapshots for readers (a snapshot-inspection sketch follows this list).
  • Background maintenance (file sizing and compaction, snapshot expiration) is handled automatically, keeping query performance strong as data grows.
  • Analytics engines read the latest committed snapshot through the catalog, so BI/ML workloads see fresh, reliable data without batch windows.
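
You can see those committed snapshots for yourself: Iceberg exposes metadata tables alongside every table, and Spark supports time travel over them. A short sketch, assuming Spark 3.3+ with the Iceberg runtime configured and our placeholder names:

    # Sketch only: inspect Iceberg commits and time travel from PySpark.
    # "my_catalog" and "streamkap_db.orders" are placeholder names.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("iceberg-snapshots").getOrCreate()

    # Normal queries read the latest committed snapshot via the catalog.
    spark.sql("SELECT count(*) FROM my_catalog.streamkap_db.orders").show()

    # The built-in "snapshots" metadata table lists each atomic commit.
    spark.sql("SELECT committed_at, snapshot_id, operation "
              "FROM my_catalog.streamkap_db.orders.snapshots").show()

    # Time travel: query the table as it existed at an earlier point.
    spark.sql("SELECT count(*) FROM my_catalog.streamkap_db.orders "
              "TIMESTAMP AS OF '2025-08-01 00:00:00'").show()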

👉 Ideal for

  • Real-time lakehouse: unify streaming ingestion with open table storage.
  • Warehouse → Iceberg migrations: run dual-write pipelines to validate without downtime.
  • Operational analytics & ML: query current data with familiar SQL engines.

Getting started

  • New to Iceberg? Start with small, high-value tables and enable the connector on a test namespace. See our Iceberg FAQ or Why Apache Iceberg? A Guide to Real-Time Data Lakes in 2025.
  • Migrating? Use dual write (existing destination + Iceberg) to compare results and cut over confidently.
  • Production tips: keep catalogs healthy, monitor small-file counts, and enable periodic table optimization (a maintenance sketch follows this list).
  • Start a free trial to stream into Iceberg in minutes.
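
For the table-optimization tip above, Iceberg also ships Spark procedures you can run on a schedule if you want maintenance beyond what Streamkap handles automatically. A sketch with our placeholder names:

    # Sketch only: manual Iceberg maintenance via Spark procedures.
    # "my_catalog" and "streamkap_db.orders" are placeholder names; requires
    # the Iceberg Spark runtime and SQL extensions.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("iceberg-maintenance").getOrCreate()

    # Compact small files into larger ones to keep scan performance strong.
    spark.sql("CALL my_catalog.system.rewrite_data_files("
              "table => 'streamkap_db.orders')").show()

    # Expire old snapshots to bound metadata size and storage costs.
    spark.sql("CALL my_catalog.system.expire_snapshots("
              "table => 'streamkap_db.orders', "
              "older_than => TIMESTAMP '2025-08-01 00:00:00')").show()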

If you have questions or want help sizing/partitioning, schedule a call — we’re happy to help you get the most out of Iceberg.