Streamkap vs AWS DMS: Real-Time CDC Platform Comparison

AWS Database Migration Service (DMS) and Streamkap both offer Change Data Capture capabilities, but they were built for fundamentally different purposes. Understanding these differences is critical to choosing the right tool.

AWS DMS was designed primarily for database migrations—moving data from one database to another with minimal downtime. It’s evolved to support ongoing replication, but that’s not its core design focus.

Streamkap was built from the ground up for real-time CDC pipelines—continuously streaming database changes to data warehouses, lakes, and streaming platforms with sub-second latency.

Quick Comparison: Streamkap vs AWS DMS

Aspect | Streamkap | AWS DMS
Primary Use Case | Real-time CDC pipelines | Database migrations
Data Latency | Sub-second to seconds | Seconds to minutes
CDC Method | Log-based (Debezium) | Log-based + polling
Stream Processing | Built-in (Flink SQL/Python) | None
Kafka Integration | Native (included) | As a target only (Kafka/Amazon MSK), or via Kinesis
Destinations | 35+ (Snowflake, Databricks, etc.) | Primarily AWS services
Multi-Cloud | Yes | AWS-centric
Schema Evolution | Automatic | Limited
Pricing | Per GB tier (from $600/mo) | Per instance-hour
Best For | Production streaming pipelines | Migrations, AWS-to-AWS

Understanding AWS DMS

DMS Architecture

AWS DMS runs on replication instances—EC2 machines that host the replication engine:

[Source Database] → [DMS Replication Instance] → [Target Database/Service]

Key Components:

  • Replication Instance: EC2 instance running the DMS engine
  • Endpoints: Source and target database configurations
  • Replication Task: Defines what data to replicate and how
  • Table Mappings: Rules for selecting and transforming tables
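To make the table-mapping component more concrete, here is a minimal sketch that builds a DMS table-mapping document in Python. The rule fields follow the JSON layout DMS uses for selection rules; the `selection_rule` helper and the schema and table names (`public`, `audit_log`) are hypothetical and only serve the illustration.

```python
import json

def selection_rule(rule_id, schema, table, action="include"):
    """Build one selection rule; '%' is a wildcard in schema/table names."""
    return {
        "rule-type": "selection",
        "rule-id": str(rule_id),
        "rule-name": str(rule_id),
        "object-locator": {"schema-name": schema, "table-name": table},
        "rule-action": action,
    }

# Hypothetical example: replicate every table in 'public' except 'audit_log'.
table_mappings = {
    "rules": [
        selection_rule(1, "public", "%"),
        selection_rule(2, "public", "audit_log", action="exclude"),
    ]
}

# The JSON string is what a replication task would receive as its table mappings.
table_mappings_json = json.dumps(table_mappings, indent=2)
print(table_mappings_json)
```

Keeping the rules in code rather than hand-edited JSON makes it easier to review which tables a task includes or excludes.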

DMS Capabilities

Migration Types:

  • Full Load: One-time migration of existing data
  • CDC Only: Capture ongoing changes only (assumes the existing data is already in the target)
  • Full Load + CDC: Migration followed by continuous replication
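As a rough sketch, the three migration types can be modeled as an enum. The string values are the ones the DMS replication-task API accepts for its migration type setting; the two helper functions are hypothetical and only restate the descriptions in the list.

```python
from enum import Enum

class MigrationType(Enum):
    FULL_LOAD = "full-load"                  # one-time copy of existing data
    CDC_ONLY = "cdc"                         # ongoing changes only
    FULL_LOAD_AND_CDC = "full-load-and-cdc"  # copy, then keep replicating

def needs_preloaded_target(migration_type):
    """CDC-only tasks assume the target already holds the existing rows."""
    return migration_type is MigrationType.CDC_ONLY

def keeps_running(migration_type):
    """Only the CDC variants continue after the initial copy finishes."""
    return migration_type is not MigrationType.FULL_LOAD

for mt in MigrationType:
    print(mt.value, needs_preloaded_target(mt), keeps_running(mt))
```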

Supported Sources:

  • Oracle, SQL Server, MySQL, PostgreSQL, MongoDB
  • Amazon RDS, Aurora, DocumentDB
  • S3, Azure SQL, Google Cloud SQL (limited)

Supported Targets:

  • Amazon RDS, Aurora, Redshift
  • S3 (Parquet, CSV)
  • Kinesis Data Streams
  • Kafka (Amazon MSK)
  • DocumentDB, DynamoDB, OpenSearch Service (formerly Elasticsearch Service)

DMS Limitations

DMS has several constraints that matter for real-time CDC:

  1. Latency: While DMS supports CDC, latency is typically seconds to minutes, not sub-second

  2. Instance-Based Architecture: Replication tasks run on fixed-size instances, so scaling means resizing or adding instances

  3. Limited Transformations: Basic column mapping and filtering; no complex transformations

  4. AWS-Centric: Best for AWS-to-AWS scenarios; multi-cloud is cumbersome

  5. Schema Evolution: Limited automatic handling; often requires manual intervention

  6. No Stream Processing: Can’t process data in-flight; just replicates

  7. Destination Constraints: Modern data warehouses (Snowflake, Databricks) are reached indirectly, typically through S3 staging plus a separate load step

Understanding Streamkap

Streamkap Architecture

Streamkap uses a modern streaming architecture:

[Source Databases] → [Debezium CDC] → [Apache Kafka] → [Apache Flink] (optional transforms) → [Destinations]

Key Components:

  • Debezium: Industry-standard log-based CDC
  • Apache Kafka: Durable, ordered event streaming
  • Apache Flink: Real-time stream processing
  • Managed Infrastructure: Zero operational burden

Streamkap Capabilities

CDC Features:

  • Log-based CDC for all major databases
  • Sub-second latency
  • Transactional consistency
  • Complete capture (inserts, updates, hard deletes)
  • Automatic schema evolution

Destinations:

  • Cloud warehouses: Snowflake, Databricks, BigQuery, Redshift
  • OLAP databases: ClickHouse, Firebolt, StarRocks
  • Data lakes: S3, Iceberg, Delta Lake
  • Streaming: Kafka, Kinesis, Pub/Sub

Transformations:

  • SQL transforms via Flink
  • Python transforms
  • PII masking
  • Aggregations and enrichment

Deep Dive: Latency Comparison

AWS DMS Latency

DMS latency varies significantly based on configuration:

Scenario | Typical Latency
Full Load | Hours (depends on data volume)
CDC (optimized) | 5-30 seconds
CDC (typical) | 30-120 seconds
CDC to S3 | 60-300 seconds
CDC to Kinesis | 5-30 seconds

Factors affecting DMS latency:

  • Instance size and type
  • Source database load
  • Network configuration
  • Batch settings
  • Target write performance

Streamkap Latency

Streamkap delivers consistent sub-second to single-digit second latency:

Stage | Typical Latency
Source → Kafka | 100-500ms
Kafka → Flink | 50-200ms
Flink → Destination | 500ms-2s
End-to-End | 1-3 seconds
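The end-to-end figure can be sanity-checked by adding up the per-stage ranges from the table. This is simple arithmetic on the numbers quoted above, not a measurement; the variable names are illustrative.

```python
# Per-stage latency ranges in seconds, copied from the table above.
stages = {
    "source_to_kafka": (0.100, 0.500),
    "kafka_to_flink": (0.050, 0.200),
    "flink_to_destination": (0.500, 2.000),
}

# Summing the lower and upper bounds gives the end-to-end budget.
best_case = sum(low for low, _ in stages.values())
worst_case = sum(high for _, high in stages.values())

print(f"{best_case:.2f}s to {worst_case:.2f}s")  # 0.65s to 2.70s
```

The summed range of roughly 0.65 to 2.7 seconds is in line with the 1-3 second end-to-end figure, and shows that the final write to the destination dominates the budget.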

This latency remains consistent regardless of:

  • Data volume
  • Time of day
  • Number of tables

For use cases like fraud detection, real-time personalization, or operational dashboards, this latency difference is critical.

CDC Method Comparison

How DMS Captures Changes

DMS uses different methods depending on the source:

  • Oracle: LogMiner or Binary Reader (supplemental logging required)
  • SQL Server: MS-CDC or CT
  • PostgreSQL: Logical replication (pglogical or test_decoding)
  • MySQL: Binary log parsing

DMS also supports polling-based incremental capture for sources without native CDC.

Challenges:

  • Configuration complexity varies by source
  • Some methods have performance impact
  • Not all features available for all sources

How Streamkap Captures Changes

Streamkap uses Debezium, the industry standard for log-based CDC:

All Sources: Native transaction log reading

  • PostgreSQL: WAL via logical replication
  • MySQL: Binary log
  • SQL Server: CDC change tables (populated from the transaction log)
  • Oracle: LogMiner
  • MongoDB: Oplog/Change Streams

Advantages:

  • Consistent experience across databases
  • Minimal impact on source database performance (changes are read from the log rather than queried from tables)
  • Complete capture including hard deletes
  • Transactional ordering guaranteed
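The difference between log-based and polling-based capture is easiest to see in a toy model. The sketch below is a simulation in plain Python, not Debezium or DMS code: an in-memory table with a logical `updated_at` timestamp stands in for the source, and an append-only list stands in for the transaction log. All names are made up for the illustration.

```python
table = {}        # key -> (value, updated_at)
change_log = []   # append-only (timestamp, op, key, value), like a WAL/binlog

def write(op, key, value=None):
    """Apply a write to the table and record it in the log, as a database would."""
    ts = len(change_log) + 1  # logical timestamp
    if op == "delete":
        del table[key]
    else:
        table[key] = (value, ts)
    change_log.append((ts, op, key, value))

def poll(since):
    """Polling capture: the equivalent of SELECT ... WHERE updated_at > since."""
    return sorted((k, v) for k, (v, ts) in table.items() if ts > since)

def read_log(since):
    """Log-based capture: replay every logged event after the last position."""
    return [(op, k, v) for ts, op, k, v in change_log if ts > since]

write("insert", 1, "pending")
write("insert", 3, "pending")
last_position = len(change_log)   # both consumers are caught up here

write("update", 1, "shipped")
write("update", 1, "delivered")
write("delete", 3)

polled = poll(last_position)      # [(1, 'delivered')]
logged = read_log(last_position)  # all three events, in order
print(polled)
print(logged)
```

The poller sees only the final state of row 1 and nothing at all for the deleted row 3, while the log reader sees both updates and the hard delete in commit order.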

Destination Support

AWS DMS Destinations

DMS works best with AWS services:

Destination | DMS Support | Notes
Amazon RDS | Excellent | Native integration
Amazon Redshift | Good | May need staging
Amazon S3 | Good | CSV or Parquet
Amazon Kinesis | Good | Enables streaming
Amazon MSK | Good | Kafka endpoint
Snowflake | Indirect | Via S3 + Snowpipe
Databricks | Indirect | Via S3
BigQuery | Indirect | Via S3 + transfer
ClickHouse | Not supported | -

For modern cloud warehouses like Snowflake and Databricks, DMS requires a multi-step architecture:

[Source] → [DMS] → [S3] → [Snowpipe/COPY] → [Snowflake]

This adds latency, complexity, and failure points.

Streamkap Destinations

Streamkap offers native connectors for modern data platforms:

Destination | Support | Latency
Snowflake | Native (Snowpipe Streaming) | 1-3 seconds
Databricks | Native | 1-3 seconds
BigQuery | Native (Storage Write API) | 1-3 seconds
ClickHouse | Native | Sub-second
Redshift | Native | 1-3 seconds
S3/Iceberg | Native | 1-3 seconds
Kafka | Native (included) | Sub-second
30+ more | Native | -

No intermediate steps, no staging, no additional pipelines.

Stream Processing

AWS DMS Transformations

DMS offers limited transformation capabilities:

Supported:

  • Column selection
  • Column renaming
  • Basic filtering (WHERE-like rules)
  • Simple expressions
  • Table selection rules

Not Supported:

  • Complex transformations
  • Aggregations
  • Joins
  • Custom functions
  • PII masking
  • Data enrichment

For anything beyond basic mapping, you need additional services (Lambda, Glue, custom applications).

Streamkap Transformations

Streamkap includes Apache Flink for real-time stream processing:

SQL Transforms:

-- Mask PII and total each id's order amounts per one-minute window
SELECT
  id,
  REGEXP_REPLACE(email, '(.).*@', '$1***@') AS masked_email,
  TUMBLE_END(event_time, INTERVAL '1' MINUTE) AS window_end,
  SUM(order_total) AS minute_total
FROM orders
-- order_total is aggregated, so it must not also be a grouping key
GROUP BY id, email, TUMBLE(event_time, INTERVAL '1' MINUTE)
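To see what the query is aiming for, the sketch below illustrates its two ideas in plain Python: the email mask and a one-minute tumbling-window sum. It shows the logic only and is not Flink code; the function names and sample events are invented for the example.

```python
import re
from collections import defaultdict

def mask_email(email):
    """Keep the first character and mask everything else up to the '@'."""
    return re.sub(r"(.).*@", r"\1***@", email)

def tumbling_sums(events, window_seconds=60):
    """Sum order totals per (id, window_end) over fixed, non-overlapping windows."""
    totals = defaultdict(float)
    for order_id, event_time, order_total in events:
        # A window covers [start, start + window_seconds); its end is exclusive.
        window_end = (event_time // window_seconds + 1) * window_seconds
        totals[(order_id, window_end)] += order_total
    return dict(totals)

# Hypothetical sample: (id, event_time in epoch seconds, order_total)
events = [(1, 5, 10.0), (1, 50, 2.5), (1, 65, 4.0), (2, 10, 7.0)]
sums = tumbling_sums(events)

print(mask_email("alice@example.com"))  # a***@example.com
print(sums)
```

Events at seconds 5 and 50 fall into the same window (ending at 60), while the event at second 65 starts a new one, which is the behavior a tumbling window is meant to provide.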

Python Transforms:

def transform(record):
    # Custom enrichment; calculate_risk and lookup_segment are user-defined helpers (not shown)
    record['risk_score'] = calculate_risk(record)
    record['customer_segment'] = lookup_segment(record['customer_id'])
    return record

Use Cases:

  • PII masking before data leaves your VPC
  • Real-time aggregations
  • Data enrichment
  • Format conversions
  • Complex routing logic

Operational Comparison

Managing AWS DMS

DMS requires ongoing management:

Instance Management:

  • Right-sizing replication instances
  • Monitoring instance health
  • Handling instance failures

Task Management:

  • Monitoring replication lag
  • Handling task failures
  • Managing table mappings
  • CDC checkpoint management

Common Issues:

  • Instance out of storage
  • Replication lag during high load
  • Task failures requiring restart
  • Schema change handling

Estimated Effort: 2-6 hours/week for production workloads

Managing Streamkap

Streamkap is fully managed:

You Handle:

  • Defining sources and destinations
  • Configuring which tables to capture
  • Optional: Writing transforms

Streamkap Handles:

  • All infrastructure
  • Scaling
  • Monitoring
  • Failover and recovery
  • Upgrades

Estimated Effort: Minimal (configuration changes only)

Pricing Comparison

AWS DMS Pricing

DMS uses instance-based pricing:

Replication Instances (on-demand, us-east-1):

Instance | vCPU | Memory | Price/Hour
dms.t3.micro | 2 | 1 GB | $0.018
dms.t3.medium | 2 | 4 GB | $0.073
dms.r5.large | 2 | 16 GB | $0.210
dms.r5.xlarge | 4 | 32 GB | $0.420
dms.r5.2xlarge | 8 | 64 GB | $0.840

Additional Costs:

  • Data transfer (varies)
  • Storage for replication
  • Multi-AZ (doubles instance cost)
  • Premium support

Example: Production CDC with r5.large + Multi-AZ

  • Instance: 2 × $0.210/hr (Multi-AZ) = $0.42/hr × 730 hrs ≈ $307/month
  • Storage: ~$50/month
  • Data transfer: Variable
  • Total: $350-500+/month per replication instance
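The instance figure in this example can be reproduced with a few lines of arithmetic. The sketch assumes the on-demand rate from the table above, a 730-hour month, and the article's rough $50 storage estimate; the function name is illustrative.

```python
HOURS_PER_MONTH = 730

def dms_instance_monthly(hourly_rate, multi_az=False, hours=HOURS_PER_MONTH):
    """Monthly instance cost; Multi-AZ doubles the hourly rate."""
    multiplier = 2 if multi_az else 1
    return hourly_rate * multiplier * hours

# dms.r5.large at $0.210/hr with Multi-AZ, i.e. $0.42/hr effective.
r5_large_multi_az = dms_instance_monthly(0.210, multi_az=True)

# Add the article's ~$50/month storage estimate; data transfer is excluded.
example_total = r5_large_multi_az + 50

print(round(r5_large_multi_az), round(example_total))  # 307 357
```

The result sits at the low end of the quoted $350-500+ range because variable data transfer is left out.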

For multiple sources or high throughput, costs multiply.

Streamkap Pricing

Plan | Price/Month | Capacity | Features
Starter | $600 | 10GB/month | Full CDC
Scale | $1,800 | 150GB/month | + Transforms, SOC 2
Enterprise | Custom | Unlimited | + HIPAA, PCI DSS

All-inclusive: No instance sizing, no data transfer charges, no storage fees.

Cost Comparison Example

Scenario: CDC from 3 PostgreSQL databases to Snowflake, 50GB/month

AWS DMS Approach:

  • 3× r5.large instances (Multi-AZ): about $900/month (3 × ~$307, rounded)
  • S3 staging: $50/month
  • Snowpipe: Additional compute costs (not included in the total)
  • Ops time: 4 hrs/week × 4 weeks @ $100/hr = $1,600/month
  • Total: ~$2,550/month

Streamkap Approach:

  • Scale plan: $1,800/month
  • Total: $1,800/month

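In this scenario Streamkap is simpler and, once ops time is counted, cheaper. Without ops time, the DMS line items come to about $950/month.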
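The comparison can be laid out as a short calculation. This sketch uses the rounded line items from the scenario and the plan capacities from the pricing table; the `cheapest_plan` helper is hypothetical, and Snowpipe compute is left out because the article gives no figure for it.

```python
# (plan name, price in $/month, included GB/month) from the pricing table.
PLANS = [("Starter", 600, 10), ("Scale", 1800, 150)]

def cheapest_plan(gb_per_month):
    """Pick the first listed plan whose capacity covers the monthly volume."""
    for name, price, capacity in PLANS:
        if gb_per_month <= capacity:
            return name, price
    return "Enterprise", None  # custom pricing, no list price

# DMS line items in $/month, using the article's rounded figures.
dms_items = {
    "instances": 900,          # 3x r5.large, Multi-AZ
    "s3_staging": 50,
    "ops_time": 4 * 4 * 100,   # 4 hrs/week x 4 weeks x $100/hr
}
dms_total = sum(dms_items.values())
dms_without_ops = dms_total - dms_items["ops_time"]

plan, streamkap_total = cheapest_plan(50)  # 50GB/month exceeds Starter's 10GB

print(dms_total, dms_without_ops, plan, streamkap_total)  # 2550 950 Scale 1800
```

The outcome depends heavily on how ops time is valued: with it, DMS comes to about $2,550 against $1,800; without it, the DMS line items alone are about $950.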

When to Choose AWS DMS

DMS is the right choice when:

  1. One-time database migrations: DMS excels at migrating databases with minimal downtime—its original purpose.

  2. AWS-to-AWS replication: Moving data between RDS instances or to Redshift within AWS is well-supported.

  3. You’re already deep in AWS: If your entire stack is AWS and you need simple replication, DMS integrates naturally.

  4. Homogeneous database migrations: Oracle-to-Oracle or PostgreSQL-to-PostgreSQL migrations are straightforward.

  5. You need heterogeneous migrations: DMS supports migrations between different database engines (e.g., Oracle to PostgreSQL).

  6. Budget is extremely tight: For simple use cases, DMS can be cheaper if you don’t factor in ops time.

When to Choose Streamkap

Streamkap is the better choice when:

  1. You need true real-time latency: Sub-second CDC is required for fraud detection, personalization, or operational use cases.

  2. Your destination is a modern data warehouse: Native connectors for Snowflake, Databricks, and BigQuery beat S3 staging.

  3. You need stream processing: In-flight transformations, PII masking, or aggregations require Flink.

  4. Multi-cloud is important: Streamkap works across AWS, GCP, Azure, and on-premises.

  5. You want Kafka in your architecture: CDC data is available as Kafka topics for other consumers.

  6. Schema evolution is common: Automatic handling of column additions and changes.

  7. You want to minimize ops burden: Fully managed with zero infrastructure to maintain.

  8. Your CDC workload is ongoing: DMS was built for migrations; Streamkap was built for continuous streaming.

Common Migration Patterns

Pattern 1: DMS for Migration, Streamkap for Ongoing

Use DMS for the initial database migration, then switch to Streamkap for real-time CDC:

  1. Phase 1: DMS handles full load migration
  2. Phase 2: Streamkap takes over for ongoing CDC
  3. Benefit: DMS’s migration strength + Streamkap’s streaming strength

Pattern 2: Streamkap for Everything

Skip DMS entirely and use Streamkap for both initial snapshot and ongoing CDC:

  1. Streamkap handles initial snapshot
  2. Seamless transition to continuous CDC
  3. Benefit: Single platform, no handoff complexity

Pattern 3: DMS for Legacy, Streamkap for Analytics

Keep DMS for AWS-to-AWS database replication, use Streamkap for analytics destinations:

  1. DMS handles RDS-to-RDS replication
  2. Streamkap streams to Snowflake, Databricks, etc.
  3. Benefit: Right tool for each job

Conclusion

AWS DMS and Streamkap serve different purposes:

AWS DMS is a solid choice for database migrations and basic AWS-to-AWS replication. It’s integrated into the AWS ecosystem and works well for its intended use case.

Streamkap is purpose-built for production real-time CDC pipelines. It delivers sub-second latency, native modern data warehouse support, and stream processing capabilities that DMS lacks.

For ongoing, real-time CDC workloads—especially to destinations like Snowflake, Databricks, or ClickHouse—Streamkap provides a more capable, often more cost-effective solution.


Ready to see real-time CDC beyond DMS? Start a free 30-day trial and experience sub-second latency to your data warehouse.

AUTHOR BIO
Ricky has 20+ years of experience in data, DevOps, databases, and startups.

PUBLISHED

January 27, 2025

TL;DR

AWS DMS is designed for database migrations and basic replication within AWS. Streamkap is purpose-built for real-time CDC with sub-second latency, stream processing, and multi-cloud support. Choose DMS for one-time migrations; choose Streamkap for production real-time data pipelines.