Marc Klefter

Domain-Aware AI Agents Need a System of Record — And Why Event Sourcing Is the Answer

Based on Axoniq’s webinar with Lead Product Architect Marc Klefter, this article explains why AI agents that can’t remember what happened in your business domain can’t be trusted — and how event sourcing creates a native system of record for domain-aware AI agents.

Published Jun 2, 2026

An AI agent is software that uses an LLM (Large Language Model) to plan and call tools. When it places an order, reschedules a flight, or approves a refund, it is making consequential decisions inside a real business domain. The problem? These agents can’t reliably tell you what they did yesterday, let alone what actually changed in the business as a result. Their “memory” is a session log or a similarity search over conversation history, which is useful but neither canonical nor domain-aware.

A system of record for an AI agent is an unambiguous, single source of truth that captures what happened in a specific business domain, when it happened, and ideally why.

It is not the same as the generic “agent memory” shipped with most frameworks today, typically a vector database of past chats. It is domain-aware: grounded in the same business events your transactional systems already care about.

In a recent Axoniq webinar, Marc Klefter, Lead Product Architect at Axoniq, made the case that event sourcing is the natural pattern for this. Event sourcing already produces an immutable, sequenced record of every meaningful state change. With the right projections in place, that record becomes the episodic memory your agents need to be auditable, replayable, and explainable.

The Three Types of Agentic Memory

Building domain-aware AI agents means building AI infrastructure that can remember what those agents did and what changed in the business as a result. To see where current implementations fall short, it helps to name the layers. AI agent memory is typically discussed in three forms.

Session memory (short-term)

The technical log of a single agent session: prompts, model responses, tool calls. This layer matters because it is how an agent recovers from a crash or resumes mid-task. But it is narrow, short-lived, and described in technical rather than business terms.

Episodic memory (long-term, derived)

Human-readable highlights from session events. A “session timeout” gets re-described as “cart abandoned.” A sequence of tool calls becomes “order placed.” Most implementations derive episodes from session logs, which means they inherit whatever the session log captured and lose whatever it didn’t.

Semantic memory (long-term, inferred facts)

Durable facts inferred from patterns: “this customer’s shoe size is 42.” Useful for personalization and long-horizon reasoning, but a layer of inference removed from what actually happened.

Most agent frameworks, Google ADK for example, give you a clean way to push session events into a vector store and run keyword or similarity search over them. That is good enough for many use cases. But as Marc put it in the webinar:

“Most memory implementations take you quite far — but there’s a sweet spot in the middle that almost everyone misses.” - Marc Klefter, Lead Product Architect, Axoniq

That sweet spot is episodic memory treated as a first-class artifact: not summarized from technical logs, but explicitly modeled in domain terms, which is exactly what event sourcing already produces. More data does not mean more understanding. Systems that give AI the context to reason will matter more than ever, especially with new compliance obligations such as the EU AI Act going into effect by August 2026.

The Missing Piece: Domain Events as First-Class Episodic Memory

In domain-driven design, a domain event is a meaningful business fact: OrderInitiated, OrderItemAdded, OrderPlaced. It is not just a log of what the agent did; it is a statement of what changed in the business.

In the webinar, Marc used a useful visual: yellow stickies for session events (“tool called”), orange stickies for domain events (“Order Placed”). A tool call is the cause; the domain event is the downstream consequence in the system of record, and the orange stickies are what your business actually cares about. They carry semantics, they can be replayed, and they are modeled deliberately by humans alongside domain stakeholders, not summarized by an LLM after the fact.

This sequence of domain events is the system of record. It is canonical, unambiguous, and replayable.

Event sourcing is the pattern that makes this work. It stores every state change as an immutable, sequenced event in an event store. From that store you can:

Source past events to decide whether the next event is valid (e.g., can this order actually be placed?)
Replay events to reconstruct system state at any point in time
Project events into any shape downstream consumers (including agents) need
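
The three capabilities above can be sketched in a few lines. This is a minimal illustration that assumes an in-memory, append-only list in place of a real event store; `InMemoryEventStore`, `replay`, `place_order`, and `order_summary` are hypothetical names, not Axon Framework or Axon Server APIs. Only the event names come from the article.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class OrderInitiated:
    order_id: str


@dataclass(frozen=True)
class OrderItemAdded:
    order_id: str
    sku: str


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str


Event = Union[OrderInitiated, OrderItemAdded, OrderPlaced]


class InMemoryEventStore:
    """Append-only list standing in for a real event store (hypothetical)."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> int:
        self._events.append(event)
        return len(self._events)  # 1-based sequence number of the new event

    def read(self, up_to: Optional[int] = None) -> List[Event]:
        # up_to=None returns the full history; up_to=N returns the first N events.
        return list(self._events[:up_to])


def replay(events: List[Event], order_id: str) -> Dict:
    """Replay: fold one order's events into its state at that point in the log."""
    state = {"initiated": False, "items": [], "placed": False}
    for event in events:
        if event.order_id != order_id:
            continue
        if isinstance(event, OrderInitiated):
            state["initiated"] = True
        elif isinstance(event, OrderItemAdded):
            state["items"].append(event.sku)
        elif isinstance(event, OrderPlaced):
            state["placed"] = True
    return state


def place_order(store: InMemoryEventStore, order_id: str) -> int:
    """Source: use past events to decide whether OrderPlaced is valid."""
    state = replay(store.read(), order_id)
    if not state["initiated"] or state["placed"] or not state["items"]:
        raise ValueError(f"order {order_id} cannot be placed")
    return store.append(OrderPlaced(order_id))


def order_summary(events: List[Event], order_id: str) -> Dict:
    """Project: reshape the same events into a view a consumer can read."""
    state = replay(events, order_id)
    status = "placed" if state["placed"] else "open"
    return {"order_id": order_id, "item_count": len(state["items"]), "status": status}


store = InMemoryEventStore()
store.append(OrderInitiated("o-1"))
store.append(OrderItemAdded("o-1", "bike-lock"))
placed_seq = place_order(store, "o-1")

state_before_placing = replay(store.read(up_to=2), "o-1")
summary = order_summary(store.read(), "o-1")
```

Here `state_before_placing` reconstructs the order as it stood after the second event, and `summary` is the kind of compact view an agent would read instead of the raw events.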

Most teams that adopt event sourcing do so for transactional reasons. But the same pattern, applied to an agentic workload, gives you native episodic memory as a by-product. You are not deriving episodes from sessions and hoping you captured the right details. The episodes already exist in the event stream.

Architecting the Event-Sourced Memory Layer

An event-sourced memory layer is built from three components: an event store, projections, and an MCP interface that exposes those projections to the agent. Here is how this looks in practice on the Axoniq stack:

Axon Framework processes commands, validates them, and emits the corresponding domain events. Production teams running this in a commercial context use Axoniq Framework, which adds enterprise features and support.
Axon Server is a purpose-built event store. It holds the canonical, sequenced event log and powers decision-making and downstream projections.
Axoniq Insights is the analytical query layer. It ingests events from Axon Server and exposes both ad-hoc SQL and a conversational interface for natural-language queries.

Raw events are not what you hand to a model; projections are. The three patterns below cover most cases:

Ad-hoc analytical queries (Axoniq Insights): The agent asks in natural language (“how many bikes have been registered?”), Insights generates the SQL, runs it across the event history, and returns structured results. Best for open-ended, aggregate, or trend-style questions where you do not know the query shape in advance.
Application projections: Denormalized, application-specific views maintained within your application, built on Axoniq Framework.
Context graphs: Graph-structured projections that capture relationships between domain entities. Best when the domain is highly connected and the agent needs to traverse those relationships.
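
To make the analytical pattern concrete, here is a rough sketch of the kind of query behind “how many bikes have been registered?”. It assumes events have already been loaded into a SQL table and uses Python's built-in sqlite3 module as a stand-in for the analytical layer. The table layout, the `BikeRegistered` event name, and the hand-written SQL are assumptions for illustration; in the setup described above, Insights would generate the SQL.

```python
import sqlite3

# Hypothetical event history: (sequence number, event type, bike id).
events = [
    (1, "BikeRegistered", "bike-1"),
    (2, "BikeRegistered", "bike-2"),
    (3, "BikeReturned", "bike-1"),
    (4, "BikeRegistered", "bike-3"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bike_events (seq INTEGER PRIMARY KEY, type TEXT, bike_id TEXT)"
)
conn.executemany("INSERT INTO bike_events VALUES (?, ?, ?)", events)


def registered_bike_count(connection: sqlite3.Connection) -> int:
    """Count registration events across the whole event history."""
    row = connection.execute(
        "SELECT COUNT(*) FROM bike_events WHERE type = ?", ("BikeRegistered",)
    ).fetchone()
    return row[0]


count = registered_bike_count(conn)  # 3 for the sample data above
```

The result is a structured value the agent can cite directly, rather than a list of raw events it has to interpret.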

Each projection is typically surfaced to the agent as a Model Context Protocol (MCP) tool. The agent invokes them like any other tool call. The agent needs domain understanding and routing instructions to know which MCP tool fits which query. The memory layer, in practice, is tools + routing instructions + domain understanding, packaged together.
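
The “tools + routing instructions” packaging might look roughly like the sketch below. It deliberately does not use an MCP SDK: `MemoryTool`, `routing_instructions`, and `call_tool` are hypothetical names, and the dictionary lookups stand in for real projections. In a real deployment each entry would be registered with an MCP server rather than held in a plain dictionary.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class MemoryTool:
    name: str
    use_when: str  # routing hint that ends up in the agent's instructions
    handler: Callable[..., Any]


# Stand-in for an application projection (hypothetical data).
BIKE_STATUS = {"bike-1": "available", "bike-2": "rented"}


def bike_status(bike_id: str) -> str:
    return BIKE_STATUS.get(bike_id, "unknown")


def registered_bikes() -> int:
    return len(BIKE_STATUS)


TOOLS: Dict[str, MemoryTool] = {
    tool.name: tool
    for tool in [
        MemoryTool("bike_status", "the question is about one specific bike", bike_status),
        MemoryTool("registered_bikes", "the question asks for a count across bikes", registered_bikes),
    ]
}


def routing_instructions() -> str:
    """Render the routing hints as text to include in the agent's prompt."""
    return "\n".join(f"- Use `{t.name}` when {t.use_when}." for t in TOOLS.values())


def call_tool(name: str, **arguments: Any) -> Any:
    """Dispatch a tool call by name, as the agent runtime would."""
    if name not in TOOLS:
        raise KeyError(f"unknown memory tool: {name}")
    return TOOLS[name].handler(**arguments)


status = call_tool("bike_status", bike_id="bike-2")
instructions = routing_instructions()
```

Keeping the routing hint next to each tool definition means the instructions the agent sees cannot drift from the tools that actually exist.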

Staleness: a first-class concern

A production-grade memory layer is not just about what data the agent retrieves; it is also about how confident the agent should be in its freshness. Projections are eventually consistent, but some lag more than others. Axoniq’s memory model exposes four guarantees the agent can reason about:

Least latency: Return whatever is available now; staleness is not a concern.
Max age: Return the result if it is within an acceptable lag window; flag it as stale if not.
Read your writes: Wait until a specific just-emitted event has been processed by the projection before returning, ensuring the agent observes the consequence of its own action.
Snapshot at: Point-in-time consistency for “what was the state at timestamp T.”
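
One way to picture the four guarantees is as four read methods on a single projection. The sketch below is an assumption-laden illustration, not Axoniq's API: `BikeStatusProjection`, `ReadResult`, and the status values are hypothetical, lag is measured as the age of the oldest event the projection has not yet applied, and `read_your_writes` assumes some other thread or process is calling `catch_up`.

```python
import time
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StatusChanged:
    seq: int
    timestamp: float  # seconds, from the event store's clock
    bike_id: str
    status: str


@dataclass(frozen=True)
class ReadResult:
    value: Optional[str]
    as_of_seq: int
    stale: bool = False


class BikeStatusProjection:
    def __init__(self, log: List[StatusChanged]) -> None:
        self._log = log  # canonical event log, ordered by seq
        self._state: Dict[str, str] = {}
        self.last_seq = 0

    def _pending(self) -> List[StatusChanged]:
        return [e for e in self._log if e.seq > self.last_seq]

    def catch_up(self, limit: Optional[int] = None) -> None:
        """Apply unprocessed events; `limit` simulates a lagging projection."""
        for event in self._pending()[:limit]:
            self._state[event.bike_id] = event.status
            self.last_seq = event.seq

    def least_latency(self, bike_id: str) -> ReadResult:
        return ReadResult(self._state.get(bike_id), self.last_seq)

    def max_age(self, bike_id: str, max_age_s: float, now: float) -> ReadResult:
        pending = self._pending()
        lag = now - pending[0].timestamp if pending else 0.0
        return ReadResult(self._state.get(bike_id), self.last_seq, stale=lag > max_age_s)

    def read_your_writes(self, bike_id: str, written_seq: int,
                         timeout_s: float = 1.0, poll_s: float = 0.01) -> ReadResult:
        deadline = time.monotonic() + timeout_s
        while self.last_seq < written_seq:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"projection has not reached seq {written_seq}")
            time.sleep(poll_s)
        return ReadResult(self._state.get(bike_id), self.last_seq)

    def snapshot_at(self, bike_id: str, timestamp: float) -> ReadResult:
        """Rebuild the value from the log as of `timestamp`."""
        value, seq = None, 0
        for event in self._log:
            if event.timestamp > timestamp:
                break
            seq = event.seq
            if event.bike_id == bike_id:
                value = event.status
        return ReadResult(value, seq)


log = [
    StatusChanged(1, 100.0, "bike-1", "available"),
    StatusChanged(2, 105.0, "bike-1", "rented"),
    StatusChanged(3, 110.0, "bike-1", "returned"),
]
projection = BikeStatusProjection(log)
projection.catch_up(limit=2)  # the projection is one event behind

fast = projection.least_latency("bike-1")
fresh_enough = projection.max_age("bike-1", max_age_s=5, now=112.0)
too_old = projection.max_age("bike-1", max_age_s=5, now=120.0)
earlier = projection.snapshot_at("bike-1", timestamp=104.0)
```

Every result carries the sequence number it reflects and a staleness flag, which gives the agent's routing logic something explicit to weigh.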

A bike-status lookup is effectively real-time. An Axoniq Insights query may be a few seconds behind, but covers a much wider analytical surface. The agent’s routing logic gets to weigh that trade-off explicitly, instead of guessing.

Why Domain-Aware AI Agents Matter: The Trust Gap in Enterprise AI

Domain-aware AI agents are moving into production inside finance, healthcare, logistics, government, and more. In these industries, decision traceability is not a nice-to-have; it is the price of being allowed to operate. Frameworks like SR 11-7 (US model risk management for banks), DORA (EU digital operational resilience for financial services), the EU AI Act, and HIPAA's audit requirements all assume you can show, after the fact, exactly what a system did and why. An AI agent without a system of record fails that bar on day one: it is a black box.

With a system of record in place, every decision is auditable and replayable, and the event stream becomes an AI audit trail by construction. You can reconstruct the exact event sequence that led to a refund, a trade, or a clinical recommendation, and you can replay it. That is the difference between "the agent did something" and "here is the causal chain that produced this outcome."

Combine session events and domain events in the same stream and you get intrinsic causality as a bonus: you always know which tool call produced which business outcome, in the right order, without correlation work. Replay, audit, and retention all happen in one pass.
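
A small sketch shows why a shared stream makes causality cheap to recover. It assumes a simplified entry format with three kinds (`prompt`, `tool_call`, `domain`) and attributes each domain event to the nearest preceding tool call. Both the format and that attribution rule are illustrative assumptions, as are the tool names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class StreamEntry:
    seq: int
    kind: str  # "prompt", "tool_call", or "domain"
    name: str


def causal_pairs(stream: List[StreamEntry]) -> List[Tuple[Optional[str], str]]:
    """Pair each domain event with the tool call that precedes it in the stream."""
    pairs: List[Tuple[Optional[str], str]] = []
    last_tool_call: Optional[StreamEntry] = None
    for entry in sorted(stream, key=lambda e: e.seq):
        if entry.kind == "tool_call":
            last_tool_call = entry
        elif entry.kind == "domain":
            cause = last_tool_call.name if last_tool_call else None
            pairs.append((cause, entry.name))
    return pairs


stream = [
    StreamEntry(1, "prompt", "user asks to order a bike lock"),
    StreamEntry(2, "tool_call", "create_order"),
    StreamEntry(3, "domain", "OrderInitiated"),
    StreamEntry(4, "tool_call", "add_item"),
    StreamEntry(5, "domain", "OrderItemAdded"),
    StreamEntry(6, "tool_call", "submit_order"),
    StreamEntry(7, "domain", "OrderPlaced"),
]

pairs = causal_pairs(stream)
```

Because ordering comes from the stream itself, no correlation IDs or timestamp joins are needed to line up cause and effect.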

Axoniq has been building this infrastructure for 15 years. Event sourcing wasn't originally designed for AI agents, but for systems that need to be honest about what happened. That turns out to be the same thing the agentic era needs, and it is the foundation of explainable AI infrastructure: agents whose decisions can be inspected, traced, and defended.

Recap and Next Steps

Domain-aware AI agents need a system of record: a canonical, replayable account of what happened in the business, not just what was said in a conversation. Event sourcing delivers that natively, and Axoniq makes it queryable by agents with the routing and freshness controls a production deployment requires.

Watch Marc's full webinar to see the live demo on the Axoniq stack, including the bike-rental agent calling Axoniq Insights and the application projections side by side.

Frequently Asked Questions about Domain-Aware AI Agents

Q: What is a system of record for AI agents?

A system of record for AI agents is an unambiguous, canonical source of truth that records what happened, when, and optionally why within a specific business domain. It gives agents reliable ground truth rather than approximate memory derived from session logs.

Q: How does event sourcing support agentic AI?

Event sourcing stores every state change as an immutable, sequenced domain event, creating a complete, replayable history. For an AI agent, that history becomes native episodic memory: the agent can query past events, understand current state, and reason about valid next actions, all from a single source of truth.

Q: What is the difference between session memory and episodic memory?

Session memory captures the technical detail of a single agent interaction — prompts, tool calls, responses. Episodic memory captures domain-meaningful events such as “cart abandoned” or “order placed.” Episodic memory is more durable, more business-meaningful, and more useful for long-term agent reasoning.

Q: What is MCP and how does it connect AI agents to memory?

MCP (Model Context Protocol) is a standard for giving AI agents structured, secure access to external systems. In an event-sourced memory layer, projections of domain events are exposed as MCP tools — letting agents query current state, run analytics, or look up domain history without direct database access.

Glossary of Terms

AI agent

A software system that uses an LLM to plan, call tools, and act on behalf of a user. In an enterprise context, agents make decisions inside business workflows — placing orders, scheduling appointments, approving exceptions — and therefore need reliable access to domain state.

Domain-aware

A property of an agent or memory layer whose representation of “what happened” matches the language and semantics of the business domain, not just the technical log of API calls.

Domain event

An immutable business fact, named in the language of the domain (e.g., OrderPlaced, BikeReturned). The atomic unit of an event-sourced system of record.

Episodic memory

The layer of agent memory that captures meaningful happenings over time. In an event-sourced architecture, the sequence of domain events is the episodic memory.

Event sourcing

An architectural pattern in which every state change is stored as an immutable, sequenced event in an event store. The current state is derived by replaying or projecting those events.

Model Context Protocol (MCP)

An open standard for exposing tools, data sources, and services to AI agents in a structured, secure way. Axoniq surfaces each memory projection as an MCP tool.

Projection

A read-optimized view derived from the event stream, shaped as a denormalized lookup, an analytical table, or a graph. Agents query projections, not raw events.

System of record

The canonical, unambiguous source of truth for what happened in a business domain. For agents, it is what enables auditable, replayable, explainable behavior.

Join the Thousands of Developers

Already Building with Axon in Open Source

Talk to us about LTS →
