News

What’s New in Streamkap: 🧊 Apache Iceberg Connector

Streamkap now supports writing data to Apache Iceberg, the open table format for data lakes. This new connector enables real-time data streams to land directly in your data lake storage as Iceberg tables.

Why it matters

  • Fresh analytics — keep Iceberg tables in sync with source DB changes in near real time.
  • Low-latency pipelines — stream changes continuously instead of batch loading.
  • 💸 Cost-efficient & open — store data on object storage with open formats (Parquet + Iceberg).
  • 🧠 ACID + evolution — leverage Iceberg’s ACID transactions, schema and partition evolution, and time travel.
  • 🌐 Query anywhere — use Spark, Trino/Starburst, Flink, and more — all reading the same Iceberg tables via your catalog.
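
For example, here is a minimal sketch of reading the connector's output from Spark, assuming an AWS Glue catalog and the Iceberg Spark runtime jar on the classpath; the catalog name, warehouse bucket, and table are hypothetical stand-ins for your own setup:

```python
from pyspark.sql import SparkSession

# Register an Iceberg catalog named "lake" backed by AWS Glue (assumed setup;
# substitute your own catalog implementation and warehouse path).
spark = (
    SparkSession.builder.appName("iceberg-read")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.catalog-impl",
            "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.lake.warehouse", "s3://my-bucket/warehouse")
    .getOrCreate()
)

# Any engine pointed at the same catalog reads the same committed snapshots.
spark.sql("SELECT * FROM lake.analytics.orders LIMIT 10").show()
```

Trino, Starburst, or Flink pointed at the same catalog would see identical data.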

✔️ How to enable it

  1. In Streamkap, create (or edit) a Destination and select Apache Iceberg.
  2. Provide your storage (S3 bucket) and catalog (e.g., AWS Glue, Hive, or REST).
  3. Choose the database/namespace and table mapping.
  4. Save and start the pipeline; changes begin streaming into your Iceberg tables (a quick verification sketch follows the links below).
  • Docs: Apache Iceberg connector setup
  • Add connector: Add the Apache Iceberg connector to your pipeline
  • Tutorial: How to Stream AWS MySQL Data to Iceberg on AWS with Streamkap
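
Once the pipeline is running, you can confirm tables are landing with a few lines of PyIceberg. This is a minimal sketch assuming an AWS Glue catalog and `pip install "pyiceberg[glue]"`; the `streamkap` namespace and `orders` table are hypothetical placeholders for your own mapping:

```python
from pyiceberg.catalog import load_catalog

# Connect to the same Glue catalog the destination writes to
# (uses your ambient AWS credentials and region).
catalog = load_catalog("glue", **{"type": "glue"})

# List the tables created in the namespace chosen in step 3.
print(catalog.list_tables("streamkap"))

# Read a few rows to confirm changes are streaming in.
table = catalog.load_table("streamkap.orders")
print(table.scan(limit=10).to_pandas())
```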

✔️ How it works

  • Streamkap captures change events (CDC) from your sources and writes them as immutable files to Iceberg-managed tables.
  • The connector updates Iceberg table metadata atomically, providing consistent snapshots for readers.
  • Background maintenance (file sizing and compaction, snapshot expiration) is handled automatically, keeping query performance strong as data grows.
  • Analytics engines read the latest committed snapshot through the catalog, so BI/ML workloads see fresh, reliable data without batch windows.
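
To make the snapshot model concrete, here is a hedged PyIceberg sketch of what readers see: each connector commit produces a new immutable snapshot, and earlier snapshots stay queryable via time travel (the catalog and table names are again hypothetical):

```python
from pyiceberg.catalog import load_catalog

catalog = load_catalog("glue", **{"type": "glue"})
table = catalog.load_table("streamkap.orders")

# Each atomic metadata commit from the connector adds an immutable snapshot.
for snap in table.snapshots():
    print(snap.snapshot_id, snap.timestamp_ms,
          snap.summary.operation if snap.summary else None)

# Readers get the latest committed snapshot by default...
current = table.current_snapshot()
print("current snapshot:", current.snapshot_id if current else None)

# ...or can time-travel to the table's first recorded snapshot.
first_rows = table.scan(snapshot_id=table.history()[0].snapshot_id).to_arrow()
```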

👉 Ideal for

  • Real-time lakehouse: unify streaming ingestion with open table storage.
  • Warehouse → Iceberg migrations: run dual-write pipelines to validate without downtime.
  • Operational analytics & ML: query current data with familiar SQL engines.

Getting started

  • New to Iceberg? Start with small, high-value tables and enable the connector on a test namespace. See our Iceberg FAQ or Why Apache Iceberg? A Guide to Real-Time Data Lakes in 2025.
  • Migrating? Use dual write (existing destination + Iceberg) to compare results and cut over confidently.
  • Production tips: keep catalogs healthy, monitor small-file counts, and enable periodic table optimization (see the sketch after this list).
  • Start a free trial to stream into Iceberg in minutes.
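
As one concrete production check, the sketch below uses PyIceberg's metadata tables (available in pyiceberg 0.7+) to flag small files; the 32 MB threshold and table name are illustrative assumptions, not Streamkap defaults:

```python
from pyiceberg.catalog import load_catalog

catalog = load_catalog("glue", **{"type": "glue"})
table = catalog.load_table("streamkap.orders")

# The "files" metadata table lists every live data file with its size.
files = table.inspect.files()
sizes = files.column("file_size_in_bytes").to_pylist()

# Flag files under an illustrative 32 MB threshold as compaction candidates.
small = [s for s in sizes if s < 32 * 1024 * 1024]
print(f"{len(small)} of {len(sizes)} data files are under 32 MB")
```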

If you have questions or want help sizing/partitioning, schedule a call — we’re happy to help you get the most out of Iceberg.

AUTHOR BIO
Product Marketing Manager at Streamkap

PUBLISHED

August 15, 2025

TL;DR

• Streamkap now supports writing data to Apache Iceberg for real-time lakehouse architectures on S3.
• Keep Iceberg tables in sync with source database changes in near real time, with ACID transactions and schema evolution.
• Query your Iceberg tables from Spark, Trino, Flink, or other engines, all reading the same consistent data via your catalog.