AI Glossary · Letter E

Error Analysis.

A systematic examination of the specific cases where a model makes incorrect predictions, conducted to identify patterns in failures, diagnose their root causes, and prioritize the interventions that will improve model quality. Error analysis is the primary diagnostic method for understanding why a deployed model underperforms expectations and what specifically needs to change.

Also known as: failure analysis, model error review, prediction error analysis.

What it is

A working definition of error analysis.

Error analysis begins where aggregate accuracy metrics end. A model with 85% accuracy has a 15% error rate, but that number reveals nothing about which errors are being made, why, or whether they are fixable. Error analysis examines the failing examples directly: looking for patterns in the inputs the model gets wrong, categorizing error types, measuring whether errors are concentrated in specific input categories or distributed randomly, and tracing systematic errors back to their likely causes in the training data or model architecture.

A structured error analysis typically involves sampling a set of incorrect predictions and categorizing them into error types manually. Common categories include labeling errors in the training data, inputs from a distribution the training data did not cover, inputs that are inherently ambiguous and difficult for any model to classify correctly, and inputs that expose a systematic blind spot in the model’s learned representation. Each category points to a different intervention: fixing labeling errors requires relabeling, coverage gaps require collecting more examples, and systematic blind spots may require feature engineering or model architecture changes.
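The categorization step described above can be sketched as a simple tally. This is a minimal illustration, not a prescribed workflow: the category keys, the example IDs, and the `summarize_errors` helper are all hypothetical, and the intervention listed for ambiguous inputs is an assumption, since the text above names interventions only for the other three categories.

```python
from collections import Counter

# Hypothetical category keys mirroring the four error types described above.
INTERVENTIONS = {
    "label_error": "relabel the affected training examples",
    "coverage_gap": "collect more examples of the missing input type",
    "blind_spot": "consider feature engineering or architecture changes",
    # Assumption: the glossary text names no intervention for this category.
    "ambiguous_input": "review labeling guidelines or route to a human",
}


def summarize_errors(reviewed):
    """Tally manually assigned categories, most frequent first.

    `reviewed` is a list of (example_id, category) pairs produced by a
    human reviewer. Returns (category, count, share, intervention) tuples.
    """
    counts = Counter(category for _, category in reviewed)
    total = sum(counts.values())
    return [
        (category, n, n / total, INTERVENTIONS.get(category, "uncategorized"))
        for category, n in counts.most_common()
    ]


# Made-up review results, for illustration only.
reviewed = [
    ("ex-01", "coverage_gap"),
    ("ex-02", "coverage_gap"),
    ("ex-03", "label_error"),
    ("ex-04", "coverage_gap"),
    ("ex-05", "blind_spot"),
]

for category, n, share, action in summarize_errors(reviewed):
    print(f"{category}: {n} errors ({share:.0%}) -> {action}")
```

Sorting by frequency puts the largest error category first, which is usually where a review of possible fixes would start.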

Error analysis is the link between model evaluation and model improvement. Without it, model improvement efforts are directed by intuition about what might be wrong rather than evidence about what is actually wrong. With it, the highest-value improvement actions, the ones that will produce the largest accuracy gain for the effort invested, become identifiable before any retraining is done.

Why ad agencies care

Why error analysis might matter more in agency work than in most industries.

Agency AI deployments serve clients who have specific, non-negotiable use cases where model errors have real costs: a lead scoring model that misqualifies a specific industry segment, a content classifier that consistently mishandles a particular content type, or a brand safety tool that systematically flags or approves the wrong content. An agency that responds to these failures by retraining on more data, without first analyzing the error pattern, is likely to spend the retraining effort on examples that do not address the actual problem.

It converts client complaints into actionable diagnoses. When a client says “your model keeps getting these wrong,” error analysis provides the structured response: here are the specific cases, here is the pattern, here is why it is happening, and here is what we are going to do about it. This is a fundamentally different conversation from “we will improve the model,” and it is what distinguishes agencies that own their AI quality from agencies that operate it as a black box.

Training data quality is often the culprit. In many real-world model failures, error analysis reveals that the root cause is not the model architecture or the training procedure but the training data: wrong labels, missing categories, or a distribution that does not match the deployment context. Knowing this in advance prevents agencies from investing in more complex models when the real fix is better data.

It protects against silent systematic failures. Some model errors are concentrated in specific input categories that happen to be rare in the validation set but common in production for specific clients. Error analysis that segments failures by input characteristics, rather than reporting only aggregate accuracy, surfaces these concentrated failures before they accumulate into a pattern the client notices on their own.
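A small sketch of why segmenting matters, under made-up numbers: the segment names, record counts, and the `error_rate_by_segment` helper below are hypothetical and chosen only to show how a reasonable aggregate error rate can hide a concentrated failure in a rare segment.

```python
from collections import defaultdict


def error_rate_by_segment(records):
    """Return {segment: (example_count, error_rate)}.

    `records` is an iterable of (segment, is_correct) pairs.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for segment, is_correct in records:
        totals[segment] += 1
        if not is_correct:
            errors[segment] += 1
    return {s: (totals[s], errors[s] / totals[s]) for s in totals}


# Hypothetical validation set: 95 examples from a common segment,
# 5 from a segment that is rare here but may be common for one client.
records = (
    [("common", True)] * 90
    + [("common", False)] * 5
    + [("rare", True)] * 1
    + [("rare", False)] * 4
)

overall = sum(1 for _, ok in records if not ok) / len(records)
by_segment = error_rate_by_segment(records)

print(f"overall error rate: {overall:.1%}")
for segment, (n, rate) in by_segment.items():
    print(f"{segment}: n={n}, error rate={rate:.1%}")
```

In this toy data the aggregate error rate is 9%, while the rare segment fails on 4 of its 5 examples. That is the kind of gap an aggregate-only report would not show.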

In practice

What error analysis looks like inside a working ad agency.

An agency deploys a content relevance classifier for a B2B technology client that screens inbound content submissions for a thought leadership platform. After three months, the client reports that the classifier is approving a higher-than-expected rate of vendor-promotional content that the editorial guidelines prohibit. The agency pulls the last 200 approved examples the client flagged as problematic and conducts a structured error analysis. The review reveals that 78% of the errors share a specific pattern: the content uses the editorial markers of thought leadership, such as research citations and neutral third-person framing, while embedding promotional claims in the final paragraph. The training data contained no examples of this pattern. The agency adds 150 labeled examples of this error type to the training set, retrains, and reduces the false approval rate on this category by 81% without degrading accuracy on other content types.
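The figures in this example can be checked with a few lines of arithmetic. The sketch below assumes the 81% figure is a relative reduction. The baseline false approval rate is not given in the example, so the `hypothetical_baseline` value is a placeholder used only to show how the calculation works.

```python
# Figures taken from the example above.
flagged_examples = 200
pattern_share = 0.78

pattern_errors = round(flagged_examples * pattern_share)
other_errors = flagged_examples - pattern_errors
print(f"errors sharing the pattern: {pattern_errors}")
print(f"errors with other causes: {other_errors}")


def rate_after_relative_reduction(baseline_rate, reduction):
    """Apply a relative reduction (0.81 means 81% lower) to a rate."""
    return baseline_rate * (1 - reduction)


# Assumption: the example does not state the baseline rate; 10% is a placeholder.
hypothetical_baseline = 0.10
after = rate_after_relative_reduction(hypothetical_baseline, 0.81)
print(f"false approval rate: {hypothetical_baseline:.1%} -> {after:.1%}")
```

So 156 of the 200 flagged approvals share the pattern and 44 have other causes. Those remaining 44 would need their own review.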

Build the diagnostic discipline that turns model failures into model improvements through The Creative Cadence Workshop.

The generative AI foundations module of the workshop covers how to evaluate AI tools honestly, including the error analysis practices that distinguish agencies that understand their models from agencies that just operate them.