Apache Flink Agents: Event-Driven AI Agents with Streaming Guarantees
Flink Agents bring exactly-once consistency to AI agent orchestration. Learn how event-driven streaming agents differ from batch-oriented frameworks like LangChain and CrewAI.
Why AI Agents Need Streaming
Most AI agent frameworks today are built around a simple pattern: receive a request, run some logic, call an LLM, return a response. This works well for chatbots and one-off tasks. But it falls apart when agents need to react to events in real time, maintain state across millions of concurrent sessions, or guarantee that no event is processed twice.
Consider a fraud detection agent that monitors payment transactions. In a batch-oriented framework, you would poll a database, pull recent transactions, run them through an LLM with tool access, and write results back. This introduces latency, creates gaps between polling intervals, and offers no built-in protection against duplicate processing if the agent crashes mid-batch.
The problem is not the LLM or the tools. The problem is the execution model. Batch-oriented agent frameworks treat data as something to fetch. Streaming frameworks treat data as something that arrives continuously. For agents that need to act on events as they happen, the streaming model is fundamentally better suited.
This is the gap that Apache Flink Agents fills.
What Are Flink Agents?
Flink Agents is an open-source framework for building AI agents that run as Apache Flink streaming jobs. Each agent is a Flink application that consumes events from streaming sources (Kafka topics, CDC streams, HTTP endpoints), processes them through LLM-powered reasoning, calls external tools, and produces output events or side effects.
The framework supports both Python and Java, making it accessible to data engineers and ML engineers alike. Because agents run as Flink jobs, they inherit all of Flink’s operational guarantees: exactly-once processing, automatic checkpointing, state management, and fault recovery.
At its core, a Flink Agent is a stateful stream processor whose processing logic includes LLM calls and tool invocations. This is a significant departure from traditional agent frameworks, where agents typically run as stateless functions invoked on demand.
Architecture Overview
A Flink Agent consists of several components working together:
- Event source - A Flink source connector that reads from Kafka, a CDC stream, an HTTP endpoint, or any supported input. Events trigger agent execution.
- Agent runtime - The core processing logic that receives events, maintains conversation history, calls LLMs, and executes tools. This runs inside Flink’s task managers.
- State backend - Flink’s state management layer (typically RocksDB) stores agent state: conversation history, intermediate results, tool call records, and any domain-specific data the agent needs to remember.
- Tool registry - A set of callable tools the agent can invoke. These can be REST APIs, database queries, other Flink jobs, or custom functions.
- Output sink - Where the agent writes its results. This could be a Kafka topic, a database, an API call, or a downstream Flink pipeline.
The key insight is that the entire agent lifecycle, from event ingestion through LLM reasoning to tool execution and output, happens within Flink’s processing framework. There is no external orchestrator. Flink itself is the orchestrator.
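The lifecycle above can be sketched in plain Python. This is an illustrative mock, not the Flink Agents API: the names (ToolRegistry, AgentRuntime, call_llm) are hypothetical stand-ins for the components just described, and the in-memory list plays the role of the output sink.

```python
# Illustrative mock of the agent lifecycle: source -> runtime -> tools -> sink.
# All names here (ToolRegistry, AgentRuntime, call_llm) are hypothetical
# stand-ins for the components described above, not the Flink Agents API.

def call_llm(prompt: str) -> str:
    """Stub LLM: classify any event mentioning 'refund' as a refund request."""
    return "refund_request" if "refund" in prompt else "general"

class ToolRegistry:
    def __init__(self):
        self._tools = {}
    def register(self, name, fn):
        self._tools[name] = fn
    def call(self, name, *args):
        return self._tools[name](*args)

class AgentRuntime:
    """Receives events, reasons with the LLM, invokes tools, emits output."""
    def __init__(self, tools: ToolRegistry, sink: list):
        self.tools = tools
        self.sink = sink          # output sink (here: an in-memory list)
        self.state = {}           # per-key state; Flink would manage this

    def on_event(self, key: str, event: str):
        history = self.state.setdefault(key, [])
        history.append(event)                      # conversation history
        label = call_llm(event)                    # LLM reasoning step
        result = self.tools.call("lookup", key)    # tool invocation
        self.sink.append((key, label, result))     # emit downstream

tools = ToolRegistry()
tools.register("lookup", lambda key: {"account": key, "status": "active"})
sink = []
runtime = AgentRuntime(tools, sink)

# Event source (here: a static list; in Flink, a Kafka or CDC stream).
for key, event in [("u1", "please refund my order"), ("u2", "hello")]:
    runtime.on_event(key, event)

print(sink)
```

In a real deployment, Flink's source connectors replace the static list, and the per-key state dictionary becomes Flink-managed keyed state that is checkpointed automatically.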
Workflow Agents vs ReAct Agents
Flink Agents supports two distinct agent types, each suited to different problem shapes.
Workflow Agents
Workflow Agents follow a predefined sequence of steps. You define the order of operations at design time: read event, extract fields, call LLM for classification, invoke tool A, then tool B, write result. The agent executes these steps deterministically for every event.
This pattern works well for structured processes where the logic is known in advance. Examples include document processing pipelines, data enrichment workflows, and compliance checks. The deterministic nature of Workflow Agents makes them easier to test, debug, and audit.
Event -> Extract -> Classify (LLM) -> Enrich (Tool) -> Validate (Tool) -> Output
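The pipeline above can be sketched as a fixed chain of functions. The step functions and the stubbed LLM classifier below are illustrative, not the Flink Agents API; the point is that every event takes the same deterministic path.

```python
# Sketch of a Workflow Agent: a fixed, deterministic step sequence per event.
# The step functions and the stubbed classify_llm are illustrative only.

def extract(event: dict) -> dict:
    return {"id": event["id"], "text": event["text"].strip().lower()}

def classify_llm(fields: dict) -> str:
    # Stub for the LLM classification call.
    return "invoice" if "invoice" in fields["text"] else "other"

def enrich(fields: dict, label: str) -> dict:
    # Stub for tool A, e.g. a metadata lookup service.
    return {**fields, "label": label, "source": "enrichment-service"}

def validate(record: dict) -> dict:
    # Stub for tool B: a simple schema check.
    assert {"id", "text", "label"} <= record.keys()
    return record

def workflow_agent(event: dict) -> dict:
    """Runs the same steps, in the same order, for every event."""
    fields = extract(event)
    label = classify_llm(fields)
    record = enrich(fields, label)
    return validate(record)

print(workflow_agent({"id": 1, "text": "  Invoice #42 attached "}))
```

Because the sequence is fixed at design time, each step can be unit-tested in isolation, which is exactly the testability advantage noted above.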
ReAct Agents
ReAct Agents (Reasoning + Acting) are autonomous. Given an event and a set of available tools, the agent reasons about what to do next, executes a tool, observes the result, and decides the next step. The LLM drives the decision-making loop, and the agent continues until it reaches a conclusion or hits a configured step limit.
This pattern suits open-ended tasks where the right sequence of actions depends on intermediate results. A customer service agent that needs to look up order history, check inventory, and decide whether to issue a refund based on what it finds is a natural fit for the ReAct pattern.
Event -> Reason (LLM) -> Act (Tool) -> Observe -> Reason -> Act -> ... -> Output
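The loop above can be sketched with a scripted stand-in for the LLM. The tool names and the decision script below are illustrative, not the Flink Agents API; they show the shape of the loop: reason, act, observe, repeat, bounded by a step limit.

```python
# Sketch of the ReAct loop: the (stubbed) LLM picks the next action, the agent
# executes the tool, feeds the observation back, and repeats until the LLM
# finishes or a configured step limit is hit. Tool names are illustrative.

TOOLS = {
    "order_history": lambda ctx: ["order-17: delivered late"],
    "check_policy": lambda ctx: "late delivery -> refund allowed",
}

def llm_decide(observations):
    """Scripted stand-in for the LLM's reasoning step."""
    if not observations:
        return ("act", "order_history")
    if len(observations) == 1:
        return ("act", "check_policy")
    return ("finish", "issue refund")

def react_agent(event: str, max_steps: int = 5) -> str:
    observations = []
    for _ in range(max_steps):                  # configured step limit
        decision, arg = llm_decide(observations)
        if decision == "finish":
            return arg                          # conclusion reached
        observations.append(TOOLS[arg](event))  # act, then observe
    return "escalate: step limit reached"

print(react_agent("customer complains about late order"))  # prints "issue refund"
```

The step limit matters in production: without it, a looping LLM could invoke tools indefinitely on a single event.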
Both agent types run as Flink jobs and benefit from the same streaming guarantees. The difference is in how processing logic is structured, not in how events are handled.
Exactly-Once Guarantees for Agents
This is where Flink Agents diverge most sharply from other agent frameworks. When a Flink Agent processes an event, the entire operation, including state updates, tool call records, and output messages, is covered by Flink’s exactly-once semantics.
What this means in practice:
- No duplicate processing - If a Flink Agent crashes and restarts, it recovers from its last checkpoint. Events that were already processed and checkpointed are not reprocessed, and recorded tool calls are not replayed. As with any Flink job, side effects issued between checkpoints should still be idempotent or transactional.
- Consistent state - The agent’s conversation history, intermediate results, and domain state are always consistent. There is no window where state can diverge from what was actually processed.
- Atomic output - When using Flink’s two-phase commit sinks (Kafka, for example), output messages are written exactly once, even across failures.
For AI agents that make real-world decisions, such as approving transactions, triggering alerts, or modifying customer records, these guarantees are not optional. A fraud detection agent that processes the same transaction twice could block a legitimate card. A customer service agent that sends duplicate refunds creates financial exposure. Exactly-once processing eliminates these failure modes.
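Turning these guarantees on is a matter of Flink configuration rather than application code. A minimal PyFlink sketch (the interval and timeout values are placeholder choices, not recommendations):

```python
from pyflink.datastream import StreamExecutionEnvironment, CheckpointingMode

env = StreamExecutionEnvironment.get_execution_environment()

# Snapshot all agent state every 60 seconds with exactly-once semantics.
env.enable_checkpointing(60_000, CheckpointingMode.EXACTLY_ONCE)

# Placeholder tuning: leave room between checkpoints and bound their duration.
env.get_checkpoint_config().set_min_pause_between_checkpoints(30_000)
env.get_checkpoint_config().set_checkpoint_timeout(120_000)
```

For atomic output as described above, the job's sink must also participate, for example a Kafka sink configured with an exactly-once delivery guarantee and a transactional ID prefix.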
State Management and Checkpointing
Flink Agents store their state in Flink’s managed state backends. This includes:
- Conversation history - The sequence of messages between the agent, the LLM, and tools. This allows agents to maintain context across multiple events from the same entity (user, device, account).
- Tool call records - Which tools were called, with what parameters, and what they returned. Used for auditing and for recovery after failures.
- Domain state - Any application-specific data the agent accumulates. A monitoring agent might track the last 100 readings from a sensor. A customer service agent might store open ticket status.
Flink periodically snapshots this state to durable storage (S3, HDFS, GCS) through its checkpointing mechanism. If the agent fails, it restarts from the most recent checkpoint with all state intact. The agent does not need custom recovery logic. Flink handles it automatically.
This is a major advantage over agent frameworks that store state in external databases or in-memory caches. With Flink, state is co-located with processing, which eliminates network round-trips for state access and guarantees consistency between state and processing progress.
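The per-key state pattern can be illustrated in plain Python. This is a simulation for clarity only: in an actual Flink job, the dictionary-of-deques below would be Flink-managed keyed state inside a keyed operator, checkpointed to durable storage automatically.

```python
# Plain-Python analogue of Flink keyed state, for illustration only.
# In a real Flink job, `self.state` would be managed state that Flink
# scopes to the current key and checkpoints to durable storage.
from collections import defaultdict, deque

class MonitoringAgentState:
    """Keeps the last N readings per sensor, as in the monitoring example above."""
    def __init__(self, window: int = 100):
        self.state = defaultdict(lambda: deque(maxlen=window))  # per-key state

    def on_reading(self, sensor_id: str, value: float) -> float:
        readings = self.state[sensor_id]       # scoped to the current key
        readings.append(value)
        return sum(readings) / len(readings)   # rolling mean over <= N readings

agent = MonitoringAgentState(window=3)
for v in [1.0, 2.0, 3.0, 10.0]:
    avg = agent.on_reading("sensor-a", v)
print(round(avg, 2))  # rolling mean of the last 3 readings
```

The deque's bounded size mirrors the "last 100 readings" example: state stays compact per key no matter how long the stream runs.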
Comparison with LangChain, CrewAI, and AutoGen
| Capability | Flink Agents | LangChain | CrewAI | AutoGen |
|---|---|---|---|---|
| Execution model | Continuous streaming | Request-response | Request-response | Request-response |
| Processing guarantees | Exactly-once | None built-in | None built-in | None built-in |
| State management | Built-in (Flink state) | External (Redis, DB) | External | In-memory |
| Fault recovery | Automatic (checkpoints) | Manual | Manual | Manual |
| Scaling | Flink parallelism | Manual horizontal | Manual | Manual |
| Event sources | Kafka, CDC, HTTP, etc. | Custom polling | Custom polling | Custom polling |
| Language support | Python, Java | Python | Python | Python |
| Best for | Streaming workloads | Chatbots, RAG | Multi-agent teams | Conversational agents |
LangChain, CrewAI, and AutoGen are excellent for interactive applications where a user sends a message and waits for a response. They are not designed for workloads where millions of events arrive per second and each one needs to be processed reliably. Flink Agents fills this gap.
Use Cases
Real-Time Fraud Detection Agents
A Flink Agent consumes payment transaction events from Kafka, maintains per-card state (velocity, location history, spending patterns), and uses an LLM to evaluate suspicious patterns that rule-based systems miss. The agent can call external tools to check blocklists, verify merchant details, or trigger card holds. Exactly-once processing ensures every transaction is evaluated precisely once.
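The per-card velocity check can be sketched as follows. The threshold, window, and the stubbed LLM review are illustrative assumptions, and the per-card dictionary stands in for Flink-managed keyed state.

```python
# Sketch of per-card velocity tracking for the fraud use case. The threshold,
# window, and stubbed LLM review are illustrative assumptions only.
from collections import defaultdict

class FraudAgent:
    def __init__(self, velocity_limit: int = 3, window_s: int = 60):
        self.velocity_limit = velocity_limit
        self.window_s = window_s
        self.events = defaultdict(list)  # per-card state; Flink-managed in practice

    def llm_review(self, card, txns):
        # Stub: a real agent would hand the pattern to an LLM with tool access
        # (blocklist checks, merchant verification) before deciding.
        return "hold" if len(txns) > self.velocity_limit else "allow"

    def on_txn(self, card: str, ts: int, amount: float) -> str:
        window = [t for t in self.events[card] if ts - t[0] <= self.window_s]
        window.append((ts, amount))
        self.events[card] = window       # keep only the recent window
        return self.llm_review(card, window)

agent = FraudAgent()
decisions = [agent.on_txn("card-1", ts, 50.0) for ts in [0, 10, 20, 30]]
print(decisions)  # the fourth transaction in the window triggers a hold
```

Exactly-once processing is what makes the velocity count trustworthy: a replayed transaction would otherwise inflate the window and hold a legitimate card.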
Streaming Customer Service Agents
Events from support channels (chat messages, emails, status changes) flow into a Flink Agent that maintains per-ticket conversation history. The agent classifies issues, looks up relevant account information through tool calls, and either resolves the issue automatically or routes it with full context to a human agent. State checkpointing means no conversation context is lost during deployments or failures.
IoT Monitoring and Alerting Agents
Sensor readings from thousands of devices stream into Flink. An agent per device type maintains rolling statistics, detects anomalies using LLM-based reasoning (for patterns too complex for static thresholds), and triggers alerts or automated responses through tool integrations. Flink’s windowing capabilities allow the agent to reason over time-based aggregations natively.
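The escalation pattern, cheap statistics first, LLM reasoning only on deviation, can be sketched with a rolling z-score. The threshold and warm-up length are illustrative assumptions; in Flink, the history would live in windowed keyed state.

```python
# Sketch of the IoT pattern: rolling statistics per device, escalating to
# LLM reasoning only when a reading deviates strongly. The z-score threshold
# and warm-up length are illustrative assumptions.
from collections import deque
from statistics import mean, pstdev

def is_anomaly(history, value, z_threshold=3.0):
    if len(history) < 5:
        return False                     # not enough data to judge yet
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold

history = deque(maxlen=100)              # rolling window of recent readings
alerts = []
for reading in [20.0, 20.5, 19.8, 20.2, 20.1, 20.0, 42.0]:
    if is_anomaly(history, reading):
        alerts.append(reading)           # would trigger LLM reasoning / tools
    history.append(reading)

print(alerts)  # only the out-of-pattern reading escalates
```

Keeping the statistical gate cheap matters at scale: with thousands of devices, only the rare anomalous reading should incur an LLM call.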
Getting Started with Flink Agents
Running Flink Agents in production requires operating Flink infrastructure: deploying clusters, configuring state backends, managing checkpoints, tuning parallelism, and handling upgrades. For teams already running Flink, adding agent workloads is a natural extension. For teams new to Flink, the operational overhead can be significant.
This is where Streamkap comes in. Streamkap’s managed streaming platform already runs Flink for real-time CDC and stream processing workloads. Support for Flink Agents is coming soon, which means teams will be able to deploy event-driven AI agents without provisioning or managing Flink infrastructure.
With Streamkap, you get:
- Managed Flink runtime - No cluster management, no JVM tuning, no checkpoint configuration. Deploy your agent code and Streamkap handles the rest.
- Native CDC integration - Connect agents directly to database change streams from PostgreSQL, MySQL, MongoDB, and DynamoDB. Your agents react to data changes as they happen.
- Built-in monitoring - Track agent performance, LLM call latency, tool execution rates, and processing lag through Streamkap’s observability layer.
- Scaling on demand - Streamkap adjusts Flink parallelism based on event volume. Your agents scale up during traffic spikes and scale down during quiet periods.
The combination of Flink Agents’ processing model and Streamkap’s managed infrastructure makes production-grade streaming AI agents accessible to teams of any size. Instead of spending months building and maintaining Flink clusters, you can focus on the agent logic that drives business value.
To learn more about running Flink workloads on Streamkap, visit our Flink processing documentation or start a free trial.