Data Streaming.

Processing data continuously as individual events arrive, rather than collecting them into batches and processing them periodically, which enables real-time analytics and decisions. For agencies, data streaming is what makes it possible for campaign systems to respond to audience behavior in the moment.

Also known as: real-time data processing, event streaming, stream processing

What it is

A working definition of data streaming.

Data streaming processes events as they are generated rather than waiting for a batch to accumulate. A clickstream event fires when a user views a product page, and a streaming pipeline can update that user’s profile, trigger a retargeting signal, or modify a personalization model’s input within milliseconds. The same update in a batch system might take hours or a day, depending on the batch schedule.

Streaming platforms like Apache Kafka, Amazon Kinesis, and Google Cloud Pub/Sub handle the infrastructure for ingesting and routing high-volume event streams. Applications subscribe to these streams and process events according to their own logic. The result is a system architecture where multiple downstream processes can react to the same event simultaneously and independently.
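A minimal sketch of that fan-out idea, assuming a toy in-memory topic rather than any real platform's API. The Topic class, the subscriber names, and the event fields are all hypothetical; the point is only that each subscriber keeps its own read position, so one event can feed several independent processes.

```python
from collections import defaultdict


class Topic:
    """Toy in-memory stand-in for a stream topic (not a real Kafka/Kinesis API):
    an append-only log with an independent read offset per subscriber."""

    def __init__(self):
        self.log = []
        self.offsets = defaultdict(int)

    def publish(self, event):
        self.log.append(event)

    def poll(self, subscriber):
        """Return the events this subscriber has not seen yet and advance its offset."""
        start = self.offsets[subscriber]
        self.offsets[subscriber] = len(self.log)
        return self.log[start:]


def update_profiles(events, profiles):
    # Consumer 1: append every product interaction to the user's profile.
    for e in events:
        profiles[e["user"]].append(e["product"])


def retargeting_signals(events):
    # Consumer 2: only product views become retargeting signals.
    return [(e["user"], e["product"]) for e in events if e["type"] == "product_view"]


topic = Topic()
topic.publish({"type": "product_view", "user": "u1", "product": "sneakers"})
topic.publish({"type": "add_to_cart", "user": "u1", "product": "sneakers"})

# Both consumers read the same two events, each from its own offset.
profiles = defaultdict(list)
update_profiles(topic.poll("profile-service"), profiles)
signals = retargeting_signals(topic.poll("retargeting-service"))
```

In this sketch the two consumers run one after the other in a single process; on a real platform they would typically be separate services reading in parallel.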

Streaming adds complexity. Distributed streaming systems are harder to build, debug, and maintain than batch pipelines. Events can arrive out of order, and late-arriving data needs to be handled. A streaming system also has to tolerate faults while it keeps running, whereas a failed batch job can usually just be rerun. Streaming is the right choice when the use case genuinely requires real-time latency; it is overengineering when a nightly batch would serve equally well.
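One common way to handle out-of-order and late events is to group them by the time they happened (event time) and to track a "watermark" that trails the newest event seen. The sketch below is a simplified illustration of that technique, not any specific framework's implementation; the 60-second window, the 30-second lateness allowance, and the function names are assumptions chosen for the example.

```python
from collections import defaultdict

WINDOW = 60            # assumed window size, in seconds of event time
ALLOWED_LATENESS = 30  # assumed: how far the watermark trails the newest event


def window_start(ts):
    """Start of the fixed (tumbling) window that contains event time ts."""
    return ts - ts % WINDOW


def count_views(events):
    """Count events per event-time window.

    events: (event_time_seconds, user) tuples in ARRIVAL order, which may
    differ from event-time order. Returns (counts_per_window, late_events).
    """
    counts = defaultdict(int)
    late = []
    max_seen = 0
    for ts, user in events:
        max_seen = max(max_seen, ts)
        watermark = max_seen - ALLOWED_LATENESS
        # A window is treated as closed once the watermark passes its end.
        if window_start(ts) + WINDOW <= watermark:
            late.append((ts, user))        # too late: route to separate handling
        else:
            counts[window_start(ts)] += 1  # on time, or out of order but tolerated
    return dict(counts), late


# The event at t=50 arrives after t=70 (out of order, still counted);
# the event at t=20 arrives after t=130 (its window has already closed).
arrivals = [(10, "u1"), (70, "u2"), (50, "u1"), (130, "u3"), (20, "u1")]
counts, late = count_views(arrivals)
```

What to do with the late events (drop them, log them, or issue a correction to an earlier result) is a design decision that a batch pipeline never has to make, which is part of the added complexity.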

Why ad agencies care

Why data streaming might matter more in agency work than in most industries.

Real-time personalization, dynamic pricing, in-session abandonment recovery, and live campaign optimization all require streaming data. Agencies building these capabilities for clients are building streaming systems, whether they know it or not. Understanding the architecture helps agencies scope these projects correctly, set accurate expectations, and evaluate vendor claims about “real-time” capabilities.

“Real-time” is a marketing term that needs interrogation. Many tools advertise real-time capabilities that operate on five-minute or fifteen-minute mini-batches rather than true event-by-event streaming. The distinction matters for use cases where the value of an action degrades quickly with latency, such as cart abandonment recovery or in-session offer personalization.
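The latency difference can be put in rough numbers. The sketch below is a simplified model, assuming events are only picked up at fixed batch boundaries and ignoring processing time entirely; the function name and the example timings are hypothetical.

```python
def mini_batch_delay(arrival_s, interval_s):
    """Seconds an event waits before the next batch boundary picks it up.

    Simplified model: batches close at multiples of interval_s, and an
    event that lands exactly on a boundary waits for the following one.
    """
    next_boundary = (arrival_s // interval_s + 1) * interval_s
    return next_boundary - arrival_s


FIVE_MIN = 5 * 60
FIFTEEN_MIN = 15 * 60

# A cart abandonment event that occurs 100 seconds after a batch boundary:
wait_5 = mini_batch_delay(100, FIVE_MIN)      # waits 200 seconds
wait_15 = mini_batch_delay(100, FIFTEEN_MIN)  # waits 800 seconds

# Averaged over events arriving once per second across a full interval,
# the wait comes out to roughly half the batch interval.
avg_15 = sum(mini_batch_delay(t, FIFTEEN_MIN) for t in range(FIFTEEN_MIN)) / FIFTEEN_MIN
```

Under these assumptions, a fifteen-minute mini-batch tool responds on average about seven and a half minutes after the event, which may be after the shopper has already left the session.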

Streaming changes the economics of AI campaign activation. A personalization model that can incorporate a user’s last five minutes of browsing behavior performs differently from one that incorporates yesterday’s session. Streaming pipelines are what close the gap between behavioral signal and model input, and the gap matters more as personalization becomes more granular.

Operational complexity is a real project cost. Streaming infrastructure is a recurring engineering investment, not a one-time setup cost. Agencies proposing real-time personalization systems to clients need to scope the ongoing infrastructure and maintenance requirements honestly in their project budgets and timelines.

In practice

What data streaming looks like inside a working ad agency.

An agency builds an in-session personalization system for an e-commerce client. The system uses a streaming pipeline to capture page view events, enrich them with the user’s historical purchase and browse data from a feature store, and generate personalized product recommendations that refresh as the user navigates the site. The streaming architecture adds six weeks of infrastructure work to the project versus a batch equivalent, but it enables the in-session recommendation use case that the batch system could not support. The client’s session-to-purchase conversion rate on recommended items validates the additional investment.
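The capture, enrich, and recommend steps in that scenario can be sketched as follows. This is an illustration under stated assumptions, not the agency's actual system: the feature store and catalog are plain dictionaries with made-up data, and the recommendation rule (current category first, then the user's historical favorite, excluding items already bought or viewed) is a placeholder for a real model.

```python
# Hypothetical stand-ins for a feature store and a product catalog.
FEATURE_STORE = {
    "u1": {"purchased": {"running shoes"}, "favorite_category": "footwear"},
}
CATALOG = {
    "footwear": ["running shoes", "trail shoes", "sandals"],
    "apparel": ["rain jacket", "socks"],
}


def enrich(event):
    """Attach the user's historical features to a page view event."""
    default = {"purchased": set(), "favorite_category": None}
    return {**event, **FEATURE_STORE.get(event["user"], default)}


def recommend(enriched, session_views, k=2):
    """Placeholder rule: current category first, then the historical favorite,
    skipping anything already purchased or viewed in this session."""
    categories = [enriched["category"]]
    favorite = enriched["favorite_category"]
    if favorite and favorite not in categories:
        categories.append(favorite)
    seen = enriched["purchased"] | set(session_views)
    picks = [p for c in categories for p in CATALOG.get(c, []) if p not in seen]
    return picks[:k]


def run_session(events):
    """Process page views one at a time, refreshing recommendations after each."""
    session_views = []
    for event in events:
        session_views.append(event["product"])
        yield recommend(enrich(event), session_views)


page_views = [
    {"user": "u1", "product": "rain jacket", "category": "apparel"},
    {"user": "u1", "product": "trail shoes", "category": "footwear"},
]
recommendations = list(run_session(page_views))
```

Each page view produces a fresh set of recommendations that reflects both the stored history and what the user has done so far in the session, which is the part a nightly batch could not supply.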

Build AI campaign infrastructure that responds in the moment through The Creative Cadence Workshop.

The automations and agents module of the workshop teaches you how to build AI workflows that connect data to action fast enough to actually change what happens next for the customer.