
Engineering

February 25, 2026

11 min read

Streaming Pipeline Cost Optimization: Getting More for Less

A practical guide to reducing the cost of real-time streaming pipelines. Covers infrastructure sizing, partition tuning, compression, tiered storage, managed vs self-hosted cost tradeoffs, and monitoring spend.

TL;DR:

  • Right-size your Kafka brokers by measuring actual CPU, memory, and disk usage rather than provisioning based on peak estimates.
  • Partition count, compression codec, and retention policy are the three most impactful levers for controlling infrastructure cost.
  • Tiered storage offloads cold data to object storage at a fraction of the cost of broker-attached SSD volumes.
  • Managed platforms like Streamkap often cost less than self-hosted setups once you account for engineering time, on-call burden, and infrastructure overhead.

Running a real-time streaming pipeline is not free. Kafka brokers, connectors, stream processors, and destination sinks all consume compute, storage, and network bandwidth. For small pipelines, these costs are negligible. But as you scale to dozens of topics, hundreds of partitions, and terabytes of daily throughput, the monthly bill starts demanding attention.

The good news is that most streaming pipelines have significant cost optimization opportunities hiding in plain sight. This guide covers the practical techniques that make the biggest difference, from infrastructure sizing to compression to the managed-vs-self-hosted decision.

Infrastructure Sizing: Stop Over-Provisioning

The most common cost mistake in streaming is over-provisioning Kafka brokers. Teams estimate peak throughput, add a generous safety margin, pick a large instance type, and never revisit the decision. Six months later, those brokers are running at 15% CPU utilization while burning through budget.

Measure Before You Size

Before choosing instance types, collect actual metrics from your pipeline. If you are already running, look at the last 30 days of data for:

  • CPU utilization. Kafka is surprisingly CPU-light for most workloads. Unless you are doing heavy compression or TLS termination on the broker, you rarely need more than 4-8 cores per broker.
  • Network throughput. This is often the real bottleneck. Each message is received by the broker (ingress), replicated to follower brokers (internal traffic), and served to consumers (egress). The total network load is roughly ingress * (replication_factor + consumer_count): one unit of ingress, replication_factor - 1 units of replication traffic, and one unit of egress per consuming application. Size your instances for network bandwidth first.
  • Disk throughput and IOPS. Kafka is a sequential write workload, which is forgiving on disk. But if you are running on EBS volumes in AWS, check your IOPS consumption. You might be paying for io2 volumes when gp3 would suffice, or you might be hitting IOPS limits on gp2 volumes and paying the latency penalty.
  • Memory. Kafka uses heap memory for the JVM (6-8 GB is typical) and the rest for the OS page cache. More page cache means more data served from memory rather than disk. 32 GB of total RAM per broker is a common sweet spot, but measure your cache hit rate to verify.
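As a back-of-the-envelope sketch of the network math (the numbers here are illustrative, not measured from any real cluster):

```python
def total_network_load(ingress_mb_s, replication_factor, consumer_count):
    """Approximate cluster-wide network load for a Kafka workload.

    Counts one unit of ingress from producers, replication_factor - 1 units
    of internal replication traffic, and one unit of egress per consuming
    application that reads the full topic.
    """
    replication = ingress_mb_s * (replication_factor - 1)
    egress = ingress_mb_s * consumer_count
    return ingress_mb_s + replication + egress

# 50 MB/s of ingress, replication factor 3, two consuming applications:
# 50 (in) + 100 (replication) + 100 (out) = 250 MB/s total.
print(total_network_load(50, 3, 2))
```

A broker sized only for producer ingress would be undersized by a factor of five in this example, which is why network bandwidth deserves first consideration.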

Right-Size Your Instance Types

Cloud providers offer a bewildering array of instance types. For Kafka brokers, you almost always want a network-optimized or storage-optimized instance, not a general-purpose one.

On AWS, for example:

| Workload | Good Fit | Why |
| --- | --- | --- |
| Moderate throughput (< 200 MB/s per broker) | m6i.xlarge or m6i.2xlarge | Balanced compute and network |
| High throughput (200-500 MB/s per broker) | r6i.2xlarge or r6i.4xlarge | More memory for page cache |
| Very high throughput (> 500 MB/s per broker) | i3en.xlarge or i3en.2xlarge | NVMe local storage, high network |

The pattern is similar on GCP and Azure. The key is to pick the smallest instance that comfortably handles your measured workload, with enough headroom for traffic spikes. A 30-40% buffer above your p95 utilization is reasonable. Anything more is wasted spend.
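That selection rule is easy to mechanize. The sketch below assumes hypothetical per-broker bandwidth budgets (the figures in the dictionary are illustrative, not official AWS numbers):

```python
def pick_instance(p95_mb_s, candidates, headroom=0.35):
    """Return the smallest-bandwidth instance that covers measured p95
    throughput plus a headroom buffer, or None if nothing fits.

    candidates maps instance name -> sustained network budget in MB/s.
    The 0.35 default matches a 30-40% buffer above p95 utilization.
    """
    required = p95_mb_s * (1 + headroom)
    for name, bandwidth in sorted(candidates.items(), key=lambda kv: kv[1]):
        if bandwidth >= required:
            return name
    return None

# Hypothetical sustained-network budgets per broker, in MB/s.
instances = {"m6i.xlarge": 300, "m6i.2xlarge": 600, "r6i.4xlarge": 1200}
print(pick_instance(180, instances))  # 180 * 1.35 = 243 -> fits m6i.xlarge
```

The same logic applies to any provider: substitute your measured p95 and the published bandwidth limits for the instance families you are considering.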

Scale Horizontally, Not Vertically

If you need more capacity, add brokers rather than upsizing existing ones. Kafka is designed to scale horizontally. Adding a broker and rebalancing partitions across the cluster is cleaner and often cheaper than jumping to a larger instance type. Four m6i.xlarge brokers frequently cost less than two m6i.4xlarge brokers while providing better fault tolerance.

Partition Count Tuning

Partitions are Kafka’s unit of parallelism. More partitions let you run more consumer instances in parallel, which increases throughput. But each partition has a cost, and over-partitioning is one of the most common sources of waste.

The Cost of Each Partition

Each partition on a broker consumes:

  • Memory. Index files and log segment metadata are cached in memory. With thousands of partitions per broker, this adds up.
  • File handles. Each partition has multiple open file handles for its log segments. Linux defaults to 1024 open files per process, which is nowhere near enough for a busy broker. You need to increase this, and even after increasing it, each handle has a small cost.
  • Replication bandwidth. Every partition is replicated to replication_factor - 1 other brokers. More partitions mean more replication traffic.
  • Leader election time. When a broker fails, the controller elects new leaders for all the partitions that broker was leading. With 10,000 partitions, this can take minutes, during which those partitions are unavailable.

Guidelines for Partition Count

A good starting point is to target roughly 1 MB/s of throughput per partition. If your topic receives 10 MB/s of data, 10-12 partitions is reasonable. If your topic receives 100 KB/s, you probably need only 1-3 partitions.

The other factor is consumer parallelism. If you have 8 consumer instances in a consumer group, you need at least 8 partitions for the topic, or some consumers will sit idle. But you do not need 100 partitions “just in case.” Adding partitions later is possible (though it does break key-based ordering guarantees for existing keys), so start conservatively.
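Both constraints can be combined into a simple sizing rule. This is a sketch of the guideline above, not a universal formula; the 1 MB/s per-partition target is a starting point to tune against your own consumers:

```python
import math

def recommend_partitions(throughput_mb_s, consumer_count, per_partition_mb_s=1.0):
    """Suggest a partition count: enough partitions to carry the topic's
    throughput at roughly 1 MB/s each, and at least one partition per
    consumer instance in the largest consumer group."""
    by_throughput = math.ceil(throughput_mb_s / per_partition_mb_s)
    return max(by_throughput, consumer_count, 1)

print(recommend_partitions(10, 8))    # throughput-bound: 10 partitions
print(recommend_partitions(0.1, 3))   # consumer-bound: 3 partitions
```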

Audit your existing topics. It is common to find topics with 50 or 100 partitions that receive a trickle of data. These are prime candidates for consolidation. Reducing partition count on existing topics requires creating a new topic with fewer partitions and migrating consumers, which is why getting the count right from the start matters.

Compression: Cheap and Effective

Enabling compression on your Kafka producers is one of the highest-impact, lowest-effort optimizations you can make. Compressed messages use less network bandwidth, less disk space, and less replication traffic. The only cost is CPU cycles on the producer and consumer, and for modern codecs, that cost is minimal.

Choosing a Codec

| Codec | Compression Ratio | CPU Cost | Best For |
| --- | --- | --- | --- |
| GZIP | High (70-80% reduction) | High | Cold storage, archival topics |
| Snappy | Moderate (50-60% reduction) | Low | General-purpose, latency-sensitive |
| LZ4 | Moderate (50-60% reduction) | Very low | High-throughput, latency-sensitive |
| ZSTD | High (65-75% reduction) | Moderate | Best balance of ratio and speed |

For most production workloads, ZSTD or LZ4 are the right choices. ZSTD gives you better compression ratios with acceptable CPU overhead, which translates directly into lower storage and network costs. LZ4 is better when you are CPU-constrained or need the absolute lowest producer latency.

Set compression at the producer level:

compression.type=zstd

The broker can be configured to preserve the producer’s compression or recompress, but in most cases you want the broker to accept the compressed batches as-is to avoid spending broker CPU on recompression.
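On the broker side, the default already does this. Kafka's broker and topic compression.type setting accepts the value producer, which keeps batches in whatever codec the producer used:

compression.type=producer

Only change this to a specific codec if you deliberately want the broker to recompress everything into one format.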

Measure the Impact

After enabling compression, monitor:

  • Broker disk usage. You should see a meaningful reduction in daily growth rate.
  • Network throughput. Both ingress and replication traffic should drop.
  • Producer and consumer CPU. A small increase is expected. If it is more than 5-10%, consider switching to a lighter codec.

For JSON and Avro payloads, compression ratios of 60-80% are common. That means your 100 GB/day topic might use only 20-40 GB/day of disk after compression. Over a year, the storage savings alone can be significant.
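A rough model of the savings, assuming gp3 pricing of $0.08 per GB-month (replication multiplies the on-disk figure; the inputs are illustrative):

```python
def annual_storage_savings(gb_per_day, compression_ratio,
                           price_per_gb_month=0.08, retention_days=7):
    """Rough annual storage savings from compression, for one copy of the data.

    compression_ratio is the fraction of size removed (0.7 = 70% smaller).
    price_per_gb_month assumes EBS gp3; multiply by replication factor for
    the cluster-wide figure.
    """
    saved_gb_on_disk = gb_per_day * retention_days * compression_ratio
    return saved_gb_on_disk * price_per_gb_month * 12

# 100 GB/day at 70% reduction with 7-day retention:
# ~490 GB less on disk per replica, ~$470/year per replica at gp3 prices.
print(round(annual_storage_savings(100, 0.7), 2))
```

The per-replica number looks modest, but it compounds across replication factor, topic count, and the matching reduction in network and replication spend.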

Retention Policy and Tiered Storage

Kafka’s default retention is 7 days (retention.ms=604800000). For many topics, this is either too long or too short. Getting retention right is a direct lever on storage cost.

Setting Retention Based on Consumer Needs

Ask yourself: how far back does any consumer actually need to read? If your fastest consumer keeps up in real time and your slowest consumer is a batch job that runs every 6 hours, you need at most 12-24 hours of retention (with some buffer for failures and reprocessing).

Reducing retention from 7 days to 1 day on a high-volume topic cuts your storage requirement by roughly 85%. For a topic producing 50 GB/day, that is the difference between 350 GB and 50 GB on disk.
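The arithmetic behind that claim, as a quick sketch (replication factor is left at 1 here; a real cluster multiplies every figure by its replication factor):

```python
def disk_required(gb_per_day, retention_days, replication_factor=1):
    """Steady-state disk needed for a topic: daily volume times retention,
    times the number of replicas storing it."""
    return gb_per_day * retention_days * replication_factor

before = disk_required(50, 7)   # 7-day retention: 350 GB
after = disk_required(50, 1)    # 1-day retention: 50 GB
print(before, after, f"{1 - after / before:.0%} reduction")
```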

For topics where some consumers need recent data in real time but you also want to keep a long-term archive, do not extend Kafka retention to 30 or 90 days. That is what object storage is for.

Tiered Storage

Tiered storage is a feature (available in Confluent Platform, Apache Kafka 3.6+, and Redpanda) that automatically moves older log segments from broker-local disk to object storage (S3, GCS, Azure Blob).

The economics are straightforward:

| Storage Type | Approximate Cost (AWS, per GB/month) |
| --- | --- |
| EBS gp3 | $0.08 |
| EBS io2 | $0.125 |
| S3 Standard | $0.023 |
| S3 Infrequent Access | $0.0125 |

Moving cold data from EBS to S3 cuts your storage cost by a factor of 3-6. The tradeoff is that reading old data from S3 is slower than reading from local disk, but consumers that need old data typically tolerate higher latency.
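Plugging the table's prices into a concrete example (10 TB of cold segments is an assumed workload, not a benchmark):

```python
def monthly_storage_cost(gb, price_per_gb_month):
    """Monthly bill for storing a given volume at a flat per-GB price."""
    return gb * price_per_gb_month

cold_gb = 10_000  # 10 TB of cold log segments
ebs = monthly_storage_cost(cold_gb, 0.08)    # EBS gp3
s3 = monthly_storage_cost(cold_gb, 0.023)    # S3 Standard
print(f"EBS: ${ebs:,.0f}/mo  S3: ${s3:,.0f}/mo  ({ebs / s3:.1f}x cheaper on S3)")
```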

If you are running a managed platform like Streamkap, tiered storage is handled for you. You get the cost benefit without needing to configure broker storage policies, object storage buckets, or lifecycle rules.

Managed vs. Self-Hosted: The Real Cost Comparison

The monthly cloud bill for your Kafka brokers is the most visible cost, but it is not the largest. The largest cost is almost always the engineering time required to keep the cluster healthy.

Hidden Costs of Self-Hosted Kafka

Here is what the cloud bill does not show:

  • Upgrades. Kafka releases security patches and bug fixes regularly. Rolling upgrades across a multi-broker cluster require planning, testing, and execution. Budget 4-8 hours of engineering time per upgrade, and you should be upgrading at least quarterly.
  • Capacity planning. Predicting when you need to add brokers, expand disk, or rebalance partitions requires ongoing monitoring and analysis. Getting this wrong means either wasted resources or an emergency scramble.
  • On-call. Kafka clusters need 24/7 monitoring. A broker going down at 3 AM triggers a page. The engineer on call investigates, restarts the broker, waits for partition reassignment, and verifies data integrity. That is 2-4 hours of disrupted sleep.
  • Connector management. If you are running Kafka Connect for CDC, you need to manage connector configurations, handle task failures, monitor consumer lag, and troubleshoot deserialization errors. Each connector is another thing that can break.
  • Security and compliance. TLS certificates, SASL authentication, ACLs, encryption at rest, audit logging. Each of these requires setup and ongoing maintenance.

A Rough Cost Model

Let us compare a self-hosted Kafka cluster against a managed platform for a moderate workload: 50 MB/s sustained throughput, 20 topics, 3-day retention.

Self-hosted (AWS):

| Item | Monthly Cost |
| --- | --- |
| 3x r6i.2xlarge brokers (on-demand) | $2,700 |
| 3x 1 TB gp3 EBS volumes | $240 |
| 3x m6i.large for Kafka Connect | $700 |
| Data transfer (cross-AZ replication) | $500 |
| Engineering time (0.5 FTE at $180k/year) | $7,500 |
| Total | ~$11,640 |

That 0.5 FTE is conservative. It accounts for a senior engineer spending about half their time on Kafka operations: upgrades, monitoring, capacity planning, on-call, and connector management. For many teams, the actual time commitment is higher.
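It is worth building this model as code so you can swap in your own line items; the figures below are the article's rough estimates, not vendor quotes:

```python
# Self-hosted line items from the estimate above, in dollars per month.
self_hosted = {
    "3x r6i.2xlarge brokers (on-demand)": 2700,
    "3x 1 TB gp3 EBS volumes": 240,
    "3x m6i.large for Kafka Connect": 700,
    "Cross-AZ data transfer": 500,
    "Engineering time (0.5 FTE at $180k/year)": 7500,
}
total = sum(self_hosted.values())
infra_only = total - self_hosted["Engineering time (0.5 FTE at $180k/year)"]
print(f"Self-hosted TCO: ${total:,}/month (infrastructure alone: ${infra_only:,})")
```

Note that engineering time is the largest single line item, which is exactly why comparisons based on the cloud bill alone are misleading.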

Managed platform (e.g., Streamkap):

A managed CDC and streaming platform typically charges based on data volume or connector count. For this workload, monthly costs are generally in the $2,000-$5,000 range, depending on the specific platform and tier.

The managed option often comes in at 30-60% less than self-hosted once you factor in engineering time. And the comparison gets more favorable for the managed option as your pipeline grows, because the engineering time for self-hosted scales with cluster complexity while managed pricing scales more linearly with data volume.

Monitoring and Controlling Spend

Cost optimization is not a one-time project. It requires ongoing visibility into where your money goes and the ability to catch runaway costs before they hit the monthly bill.

Key Metrics for Cost Visibility

Set up dashboards that track:

  • Broker disk usage by topic. Identify which topics consume the most storage. Often a small number of high-volume topics account for 80% of disk usage.
  • Partition count by topic. Flag topics with high partition counts but low throughput.
  • Consumer lag by consumer group. Persistent lag means you might need more consumer instances (additional cost) or your processing logic needs optimization (engineering time).
  • Network throughput by broker. Uneven traffic across brokers indicates a partition imbalance that wastes capacity on underutilized brokers.
  • Compression ratio by topic. Topics with low compression ratios might benefit from a different codec or payload format.

Cost Alerts

Set alerts for:

  • Disk usage exceeding 70% on any broker. At this point, you need to either add storage, reduce retention, or add brokers. Acting early avoids emergency scaling at premium on-demand prices.
  • Topic creation. Every new topic consumes resources. Require team leads to approve new topics and specify a retention policy and partition count based on expected throughput.
  • Unused consumer groups. Consumer groups that are registered but not actively consuming waste partition assignment metadata and can slow down rebalances. Clean these up regularly.
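For the disk alert in particular, a runway projection is more actionable than a raw threshold. A minimal sketch, assuming you can pull capacity, usage, and net daily growth from your metrics system:

```python
def days_until_full(capacity_gb, used_gb, daily_growth_gb):
    """Project how many days remain before a broker's disk fills, so you can
    act well before the 70% alert becomes an emergency scaling exercise."""
    if daily_growth_gb <= 0:
        return float("inf")
    return (capacity_gb - used_gb) / daily_growth_gb

# A 1 TB volume at 65% utilization growing 15 GB/day: ~23 days of runway.
print(round(days_until_full(1000, 650, 15), 1))
```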

Regular Cost Reviews

Schedule a monthly or quarterly review of your streaming infrastructure costs. Walk through:

  1. Top 10 topics by storage. Are retention settings still appropriate? Can any topics be compressed more aggressively?
  2. Broker utilization. Are any brokers consistently below 30% CPU or network? Consider consolidating or downsizing.
  3. Connector health. Are any connectors frequently restarting? Unstable connectors cause duplicate processing and wasted compute.
  4. Engineering time spent on operations. Track this honestly. If your team is spending more than a few hours per week on Kafka operations, the case for a managed platform gets stronger.

Putting It All Together

Here is a prioritized checklist for optimizing your streaming pipeline costs, ordered by impact and effort:

  1. Enable compression (ZSTD or LZ4) on all producers. High impact, low effort. Do this first.
  2. Audit retention policies. Reduce retention on topics where consumers do not need 7 days of history. High impact, low effort.
  3. Audit partition counts. Identify over-partitioned topics and consolidate where possible. Medium impact, medium effort.
  4. Right-size broker instances. Collect utilization metrics and downsize if appropriate. High impact, medium effort.
  5. Enable tiered storage. Move cold data to object storage. High impact for long-retention topics, medium effort.
  6. Evaluate managed platforms. Calculate your true total cost of ownership including engineering time. If self-hosted TCO exceeds what a managed platform like Streamkap charges, the switch pays for itself.
  7. Set up cost monitoring dashboards and alerts. Prevent cost creep by catching problems early. Medium impact, low ongoing effort.

Streaming pipelines do not have to be expensive. The teams that spend the least per GB of throughput are the ones that measure continuously, size based on evidence rather than guesswork, and make deliberate tradeoffs between cost, latency, and durability. Start with the low-hanging fruit, measure the results, and iterate.