Data Contracts for Streaming: Defining Producer-Consumer Agreements
Learn how to implement data contracts in streaming architectures - formal agreements between data producers and consumers that prevent breaking changes and ensure data quality.
In API development, the idea of a contract is second nature. You publish an OpenAPI specification, your consumers build against it, and nobody ships a breaking change without a versioned migration path. In data engineering, that discipline is conspicuously absent. A source database changes its schema. A column that was never null starts containing nulls. Event volume drops by 80% over a weekend and nobody notices until Monday’s dashboards are blank.
Data contracts exist to close that gap. They are formal, enforceable agreements between data producers and data consumers that define not just what the data looks like, but how it behaves, who is responsible for it, and what happens when something changes. In streaming architectures, where data flows continuously and latency budgets are measured in seconds, contracts are not a governance luxury - they are an operational necessity.
What a Data Contract Contains
A data contract is more than a schema definition. It is a complete specification that covers every dimension of the producer-consumer relationship. A well-formed contract includes six components.
Schema definition specifies the exact structure of the data: field names, data types, whether each field is required or optional, and any constraints (unique, non-negative, valid enum values). This is the most familiar part of a contract, and the part most teams already have in some form.
Quality rules define the expected quality characteristics of the data. These are measurable assertions: the email field must be non-null in at least 99% of records, the order_total must be a positive number, the created_at timestamp must not be in the future. Quality rules transform vague expectations (“the data should be good”) into concrete, testable conditions.
SLAs (Service Level Agreements) specify the operational guarantees the producer commits to. Typical SLAs include latency (events arrive within 30 seconds of occurring), availability (the stream is available 99.9% of the time), and throughput (the stream produces between 1,000 and 50,000 events per hour during business hours).
Ownership identifies the team and individuals responsible for the data. This includes the producing team, an on-call contact, an escalation path, and the communication channel (Slack, email, PagerDuty) where issues are reported.
Semantic descriptions explain what the data means. A field named status could mean anything - a semantic description clarifies that it represents the fulfillment status of an order, with allowed values of pending, processing, shipped, delivered, and cancelled. Semantics prevent the misinterpretation that causes subtle, hard-to-detect errors in downstream logic.
Change management rules define how the contract itself evolves. They specify the process for requesting changes, the required notice period, backward compatibility requirements, and the versioning scheme. This is the component that prevents breaking changes from being deployed without coordination.
Data Contracts vs Schemas vs Documentation
These three concepts are related but fundamentally different in scope and enforceability.
A schema defines structure. It tells you that the orders table has a customer_id column of type INTEGER and an order_total column of type DECIMAL(10,2). Schemas are necessary but insufficient - they say nothing about quality, freshness, ownership, or change processes.
Documentation describes what the data is and how to use it. It might explain that customer_id references the customers table, that order_total includes tax but excludes shipping, and that the table is updated via CDC from the production PostgreSQL database. Documentation is valuable but unenforceable - there is no mechanism to prevent someone from violating a statement in a wiki page.
A data contract combines the structural precision of a schema with the contextual richness of documentation and adds enforceability. Quality rules are not suggestions - they are checked on every record. SLAs are not aspirations - they are monitored and alerted on. Change management rules are not guidelines - they are enforced in CI/CD. The contract is the authoritative specification, and violations are detected and acted upon automatically.
In batch pipelines, you have natural checkpoints where quality can be verified after the fact. In streaming, data flows continuously with no convenient pause point. Contracts must be enforced inline, on every event, as it passes through the pipeline - a fundamentally harder problem that requires a more rigorous specification than a schema file and a Confluence page.
Contract Enforcement
A contract that is not enforced is just a document. The value of data contracts comes from the enforcement layer - the infrastructure that validates every event against the contract’s rules and takes action when violations occur.
Schema Registry
The schema registry is the enforcement mechanism for the structural component of the contract. Every event is validated against the registered schema. If a producer emits an event with an unknown field, omits a required field, or sends a value of the wrong type, the registry rejects the event before it reaches any consumer. In Kafka-based architectures, events are serialized using Avro, Protobuf, or JSON Schema, and the registry validates compatibility on every write.
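To make the structural check concrete, here is a minimal sketch in plain Python of the kind of validation a registry performs on each event. The `SCHEMA` dict and `validate_structure` helper are illustrative names, not a real registry API:

```python
# Illustrative sketch of the structural check a schema registry performs.
# SCHEMA and validate_structure are hypothetical, not a registry's real API.

SCHEMA = {
    "order_id":    {"type": str,   "required": True},
    "customer_id": {"type": int,   "required": True},
    "order_total": {"type": float, "required": True},
    "status":      {"type": str,   "required": False},
}

def validate_structure(event: dict, schema: dict) -> list[str]:
    """Return a list of structural violations (empty list = valid event)."""
    errors = []
    for name, spec in schema.items():
        if name not in event:
            if spec["required"]:
                errors.append(f"missing required field: {name}")
        elif not isinstance(event[name], spec["type"]):
            errors.append(f"wrong type for {name}: {type(event[name]).__name__}")
    # Unknown fields are rejected too, mirroring strict registry validation.
    for name in event:
        if name not in schema:
            errors.append(f"unknown field: {name}")
    return errors
```

A conforming event returns an empty list; an event with a missing required field, a mistyped value, or an extra field is rejected before any consumer sees it.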
Validation Rules
Quality rules are enforced by a validation layer that evaluates each event (or each micro-batch) against the contract’s quality assertions. This can be implemented as a stream processing step:
```yaml
quality_rules:
  - field: email
    rule: not_null
    threshold: 0.99   # 99% of records must have a non-null email
  - field: order_total
    rule: positive
    threshold: 1.0    # 100% of records must have a positive order_total
  - field: status
    rule: enum
    allowed: [pending, processing, shipped, delivered, cancelled]
    threshold: 1.0
```
When a rule is violated beyond its threshold, the system can take one of several actions depending on the contract’s severity configuration: log a warning, send an alert, quarantine the offending records, or halt the pipeline entirely.
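The evaluation step itself is simple to sketch. The following is an illustrative micro-batch evaluator, with rule predicates mirroring the quality rules above; the `RULES` table and `evaluate` function are assumed names, not a specific framework's API:

```python
# Hypothetical sketch of threshold-based quality-rule evaluation over a
# micro-batch; the rules mirror the YAML above.

def pass_rate(records, field, predicate):
    """Fraction of records whose field value satisfies the predicate."""
    if not records:
        return 1.0
    return sum(1 for r in records if predicate(r.get(field))) / len(records)

RULES = [
    # (field, predicate, contract threshold)
    ("email",       lambda v: v is not None,            0.99),
    ("order_total", lambda v: v is not None and v > 0,  1.0),
]

def evaluate(records):
    """Return the rules whose pass rate falls below the contract threshold."""
    violations = []
    for field, predicate, threshold in RULES:
        rate = pass_rate(records, field, predicate)
        if rate < threshold:
            violations.append((field, rate, threshold))
    return violations
```

A batch where 1% of emails are null still passes the 0.99 threshold, but a single negative `order_total` violates the 1.0 rule and would trigger whichever action the contract's severity configuration prescribes.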
Monitoring and Alerting
SLA enforcement requires continuous monitoring. The pipeline must track latency (time between event creation and event availability to consumers), throughput (events per second/minute/hour), and availability (percentage of time the stream is producing data). When any SLA metric crosses a threshold, the contract’s escalation path is triggered.
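As a sketch, an SLA check over one evaluation window might look like the following. The thresholds mirror the example SLAs above; the function and parameter names are assumptions for illustration:

```python
# Illustrative SLA monitor for one evaluation window; thresholds mirror the
# example SLAs in the text, names are assumptions for the sketch.
from statistics import quantiles

def check_sla(latencies_s, events_last_hour,
              p99_limit_s=30.0, min_per_hour=1000, max_per_hour=50000):
    """Return a list of SLA breaches (empty list = all SLAs met)."""
    breaches = []
    p99 = quantiles(latencies_s, n=100)[98]  # 99th-percentile latency
    if p99 > p99_limit_s:
        breaches.append(f"latency p99 {p99:.1f}s exceeds {p99_limit_s}s")
    if not (min_per_hour <= events_last_hour <= max_per_hour):
        breaches.append(f"throughput {events_last_hour}/h outside bounds")
    return breaches
```

In practice these metrics would be computed over a sliding window and each breach routed to the escalation path named in the contract's ownership block.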
Streamkap provides built-in monitoring for freshness and schema compliance, which forms the enforcement backbone that data contracts require. When a source stops producing events or when schema changes are detected, the platform alerts the responsible team automatically.
Ownership Model
The most important organizational decision in data contracts is ownership. The principle is straightforward: producers own their contracts. The team that produces data is responsible for its quality, its freshness, its schema stability, and its SLA compliance.
This is a departure from how many organizations operate today, where data quality is treated as the data engineering team’s problem. That model does not scale. The data engineering team does not control the source systems. They cannot prevent a developer from adding a nullable column or deploying a bug that corrupts order totals. They can only detect the damage after it has already reached downstream consumers.
Producer ownership means that the team shipping the orders service is also responsible for the orders data contract. They define the schema, commit to quality rules, and agree to SLAs. When they need to make a change, they follow the contract’s change management process. Consumers define their requirements (“we need order_total to be non-null and fresh within 60 seconds”), but producers are accountable for meeting those requirements.
Contracts are not unilateral declarations - they are negotiated. A consumer might request a stricter null rate or a tighter latency SLA. The producer evaluates feasibility and either agrees, proposes an alternative, or explains why the requirement cannot be met. The resulting contract reflects a realistic agreement that both sides commit to upholding.
Implementing Contracts in Practice
A data contract is typically expressed as a YAML or JSON file that lives alongside the producing service’s code, versioned in the same repository.
```yaml
# contracts/orders.v2.yaml
contract:
  name: orders
  version: 2
  owner:
    team: commerce-platform
    contact: commerce-oncall@company.com
    slack: "#commerce-data"
  schema:
    type: record
    fields:
      - name: order_id
        type: string
        required: true
        description: "UUID, globally unique order identifier"
      - name: customer_id
        type: integer
        required: true
        description: "References customers.id"
      - name: email
        type: string
        required: false
        description: "Customer email, if provided at checkout"
      - name: order_total
        type: decimal
        precision: 10
        scale: 2
        required: true
        description: "Total amount in USD, includes tax, excludes shipping"
      - name: status
        type: string
        required: true
        enum: [pending, processing, shipped, delivered, cancelled]
        description: "Current fulfillment status"
      - name: created_at
        type: timestamp
        required: true
        description: "UTC timestamp of order creation"
      - name: updated_at
        type: timestamp
        required: true
        description: "UTC timestamp of last status change"
  quality:
    rules:
      - field: order_total
        assertion: positive
        threshold: 1.0
      - field: email
        assertion: not_null
        threshold: 0.99
      - field: created_at
        assertion: not_future
        threshold: 1.0
  sla:
    latency_seconds: 30
    availability: 0.999
    throughput:
      min_events_per_hour: 500
      max_events_per_hour: 100000
  change_management:
    compatibility: backward
    notice_period_days: 14
    deprecation_period_days: 90
```
CI/CD Validation
The contract file is validated in CI/CD just like application code. A pull request that modifies the contract triggers automated checks:
- Backward compatibility: Does the new schema version break existing consumers? (Removing a required field, narrowing a type, or renaming a field without an alias would fail this check.)
- Quality rule consistency: Are the quality thresholds achievable given historical data?
- SLA feasibility: Are the latency and throughput commitments realistic?
If any check fails, the PR is blocked until the issue is resolved. This prevents accidental breaking changes from reaching production.
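The backward-compatibility check is the most mechanical of the three and can be sketched directly. The following compares the field lists of two contract versions; the field-dict shape and function name are assumptions for illustration, not a particular registry's API:

```python
# Sketch of a CI backward-compatibility check between two contract versions.
# The field-dict shape is an assumption, not a real registry's API.

def compat_errors(old_fields: dict, new_fields: dict) -> list[str]:
    """Each argument maps field name -> {"type": ..., "required": ...}."""
    WIDENINGS = {("integer", "long"), ("float", "double")}  # safe type changes
    errors = []
    for name, old in old_fields.items():
        if name not in new_fields:
            errors.append(f"removed field: {name}")  # breaks existing consumers
            continue
        new = new_fields[name]
        if old["type"] != new["type"] and (old["type"], new["type"]) not in WIDENINGS:
            errors.append(f"incompatible type change on {name}")
    for name, new in new_fields.items():
        if name not in old_fields and new.get("required"):
            errors.append(f"new required field without default: {name}")
    return errors
```

Adding an optional field or widening `integer` to `long` yields no errors; removing a field or narrowing a type fails the check and blocks the PR.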
Contract-Driven Schema Evolution
Data contracts formalize how schemas change over time. Without a contract, a developer runs an ALTER TABLE and hopes for the best. With a contract, the producer opens a pull request that modifies the contract file, CI validates backward compatibility, consumers are notified, and after the notice period the change is deployed. Downstream systems pick up the new schema version through the registry.
Backward compatibility is the default requirement - consumers built against version N can still process data under version N+1 without modification:
- Adding optional fields is always safe
- Removing fields requires a deprecation period
- Renaming fields requires an alias during the transition
- Widening types (integer to long, float to double) is safe
- Narrowing types is a breaking change and requires a major version bump
Quality SLAs in Contracts
Quality SLAs go beyond schema correctness. They define the operational behavior of the data stream across three dimensions.
Freshness is the maximum acceptable delay between an event occurring in the source system and that event being available to consumers. In a streaming context, freshness SLAs are typically measured in seconds or single-digit minutes. A contract might specify that 99th-percentile latency must not exceed 30 seconds.
Completeness defines the acceptable rate of missing or null values for each field. A contract might allow up to 1% null values in the email field (reflecting records where the customer did not provide an email) but require 100% completeness for order_id and order_total.
Volume expectations set bounds on the expected throughput. If the orders stream typically produces 10,000 events per hour during business hours and the volume drops to 100, something is wrong - even if every individual event is structurally valid and high quality. Volume anomaly detection catches problems that per-record validation cannot.
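A volume check of this kind reduces to comparing an hourly count against the contract's throughput bounds. The sketch below uses the 1,000-to-50,000 range from the SLA example earlier; the function name and defaults are assumptions:

```python
# Illustrative volume-anomaly check against the contract's throughput bounds;
# the function name and default bounds are assumptions for the sketch.

def volume_anomaly(hourly_count, min_expected=1000, max_expected=50000):
    """Return a description of the anomaly, or None if volume is in bounds."""
    if hourly_count < min_expected:
        return f"volume drop: {hourly_count}/h (expected >= {min_expected})"
    if hourly_count > max_expected:
        return f"volume spike: {hourly_count}/h (expected <= {max_expected})"
    return None
```

Run once per window, this catches the 80%-drop-over-a-weekend failure mode even when every individual event that does arrive validates cleanly.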
Practical Example: Orders Service Contract
Consider a commerce team that produces an orders stream consumed by three downstream teams: analytics (for dashboards and reporting), machine learning (for demand forecasting), and finance (for revenue reconciliation).
Without a contract, each consumer team has made silent assumptions. Analytics assumes order_total is always positive. ML assumes created_at is within the last 30 days. Finance assumes the stream delivers within 60 seconds. When the commerce team deploys a bug that produces negative order_total values, all three consumers are affected differently - and none discover the problem at the same time.
With a contract, the quality rules catch the negative order_total immediately. The validation layer quarantines offending records and alerts the commerce team’s on-call engineer. Downstream consumers never see bad data. The fix is deployed, quarantined records are reprocessed, and the incident closes within minutes rather than days.
Organizational Adoption
Data contracts are as much an organizational change as a technical one. Trying to implement contracts across every data source simultaneously is a recipe for failure. Start small and expand incrementally.
Start with your most critical data source. Identify the one stream that causes the most pain when it breaks - the one that triggers the most on-call incidents. Write a contract for that stream first.
Make the contract enforceable from day one. A contract that is only checked manually is not a contract - it is a wish. Wire the validation into your pipeline so violations are caught in real time.
Measure the impact. Track data quality incidents before and after the contract is in place. These metrics justify expanding contracts to additional data sources.
Expand incrementally. Once the first contract is working, add contracts for the next two or three most critical streams. Let teams see the value before mandating adoption. Within six to twelve months, teams will start requesting contracts for their own data - not because they were told to, but because they have seen the results.
Invest in tooling. Contract management benefits from dedicated infrastructure - a registry for versioning contracts, a validation engine for real-time enforcement, and a monitoring layer for SLA compliance. Platforms like Streamkap that include built-in schema management, validation, and freshness monitoring provide the enforcement layer that makes contracts practical rather than theoretical.
Data contracts are not a new concept, but their adoption in streaming architectures is accelerating as organizations recognize that real-time data demands real-time quality guarantees. The teams that invest in contracts now will spend less time fighting data fires and more time building the systems that actually move the business forward.