AI Glossary · Letter D

Data Leakage.

The accidental inclusion of information in a model’s training data that would not actually be available at prediction time, inflating evaluation performance while guaranteeing underperformance in deployment. For agencies, data leakage explains some of the most expensive AI tool failures: models that seemed to work perfectly until they were deployed.

Also known as target leakage, data contamination, train-test contamination

What it is

A working definition of data leakage.

Data leakage occurs when training data contains signals that encode the answer the model is trying to predict. A churn model trained on data that includes “number of cancellation-related support tickets” as an input feature has leaked: that feature directly encodes the outcome. In deployment, when no cancellation ticket exists yet, the model loses its primary signal and performs far worse than its validation numbers suggested.

Temporal leakage is the subtler version: using future information to evaluate past predictions. If a time-series model is validated by training on all available data and testing on a randomly shuffled sample rather than a chronologically held-out period, the test set contains data from after the training period. The model has effectively been evaluated on its ability to predict the past with knowledge of the future.

Leakage is dangerous specifically because it makes models appear to perform well during evaluation. Detection requires understanding the data generation process, knowing precisely what would be available at the actual prediction moment, and being suspicious of validation accuracy that seems too high for the difficulty of the task.

Why ad agencies care

Why data leakage might matter more in agency work than in most industries.

Agencies often receive AI tools from vendors claiming strong validation performance. Understanding data leakage helps agencies ask the right questions about how that performance was measured and identify when evaluation metrics that look impressive should prompt scrutiny rather than confidence.

Vendor-reported accuracy needs context. A lead scoring tool claiming 90% accuracy needs to be validated on a clean temporal holdout using only features genuinely available at lead-score time. Evaluation on shuffled historical data without temporal splits almost always overstates real-world performance, sometimes dramatically.

Leakage hides in feature engineering. Features engineered from raw data can inadvertently encode the label. An “average days since last purchase” feature computed across the entire dataset including the test period gives the model future information about test examples. Feature engineering and evaluation need to respect the temporal structure of the data and the availability of each feature at prediction time.

The first sign is unexpected post-deployment underperformance. A model that performs well in validation and then systematically underperforms in production is showing the signature of leakage. The agency that understands this diagnoses correctly rather than assuming the model was generically poorly built.

In practice

What data leakage looks like inside a working ad agency.

An agency validates a click-through rate prediction model and reports 87% classification accuracy to the client. After deployment, the model’s lift over random targeting is only 4%, against a projected 22%. Investigation shows the validation split shuffled examples randomly across time, meaning the test set contained examples with temporal neighbors in the training set. Rebuilding the validation with a clean temporal split produces 63% accuracy, which sets a realistic projection of deployment performance. The agency presents the revised analysis, names the leakage as the root cause, and outlines the retraining approach.

Build the model literacy to evaluate AI tools on the metrics that actually matter through The Creative Cadence Workshop.

The governance and disclosure module of the workshop covers the internal standards your agency needs to evaluate AI tools honestly and communicate their real-world performance to clients.