AI Glossary · Letter D

Data Normalization.

The preprocessing step of rescaling numeric feature values to a common range or distribution before model training, so that features with different natural scales influence the model according to the information they carry rather than the size of their numbers. For agencies, normalization is one of the invisible preprocessing choices that determine whether a model learns the right patterns or is overwhelmed by the loudest numbers in the dataset.

Also known as feature scaling, data standardization, min-max scaling

What it is

A working definition of data normalization.

Machine learning models that compute distances between data points or update weights based on gradients are sensitive to the scale of their input features. A model trained on customer attributes that include age (ranging from 18 to 90) and annual spend (ranging from 100 to 500,000) will have the spend variable dominate its learning simply because its numerical range is far larger, regardless of whether spend is actually more predictive than age.
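The scale effect described above can be made concrete with a small sketch. The three customers and their values are hypothetical, chosen only to fall inside the age and spend ranges mentioned in the text, and `euclidean` is an illustrative helper rather than any library's API.

```python
import math

# Hypothetical customers as (age, annual_spend); values are illustrative only.
a = (25, 40_000)
b = (70, 40_500)   # very different age, similar spend
c = (26, 45_000)   # nearly the same age, moderately different spend

def euclidean(p, q):
    """Straight-line distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

dist_ab = euclidean(a, b)
dist_ac = euclidean(a, c)

# On raw values, customer a looks closer to b than to c, even though
# a and b are 45 years apart in age: the spend gap swamps the age gap.
print(f"a-b: {dist_ab:.1f}   a-c: {dist_ac:.1f}")
```

A distance-based model working on these raw numbers would treat the 70-year-old as the 25-year-old's nearest match, which is the kind of dominance by scale the paragraph describes.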

Normalization addresses this by rescaling features to a comparable range. Min-max scaling maps each feature's values to a 0-to-1 range using the feature's observed minimum and maximum. Standardization subtracts the mean and divides by the standard deviation, producing values centered at zero with unit variance. Log transformation compresses right-skewed distributions by converting the scale to a logarithmic one.
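The three transformations can be sketched in a few lines of standard-library Python. The `spend` values and the function names are hypothetical, and the sketch assumes the feature is not constant (otherwise the divisions fail) and, for the log transform, that all values are strictly positive.

```python
import math
import statistics

spend = [100, 900, 2_500, 12_000, 500_000]  # hypothetical annual spend values

def min_max(values):
    """Rescale to the 0-to-1 range using the observed minimum and maximum."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]  # assumes hi != lo

def standardize(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumes std != 0
    return [(v - mean) / std for v in values]

def log_transform(values):
    """Base-10 logarithm; only defined for strictly positive values."""
    return [math.log10(v) for v in values]

scaled = min_max(spend)
z_scores = standardize(spend)
logged = log_transform(spend)
```

In this illustrative data the single 500,000 value pushes every other min-max result close to zero, while the log transform spreads the values more evenly, which is why skewed features such as spend are often logged before any further scaling.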

Not all algorithms require normalization. Tree-based models like decision trees and random forests are largely insensitive to feature scale: they split on thresholds within one feature at a time rather than computing distances across features, so any order-preserving rescaling leaves the splits unchanged. Distance-based models like k-nearest neighbors, margin-based models like support vector machines, and gradient-trained models like neural networks are sensitive to scale and usually need normalization to perform reliably.
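Why threshold splits are unaffected by rescaling can be shown with a toy one-feature split search. This is a simplified sketch, not a real decision tree implementation: `minority_count` and `best_split_partition` are hypothetical helpers, the data is invented, and the split is scored by simple misclassification count.

```python
import math

def minority_count(indices, labels):
    """Rows on one side of a split that disagree with that side's majority label."""
    ones = sum(labels[i] for i in indices)
    return min(ones, len(indices) - ones)

def best_split_partition(values, labels):
    """Indices sent left by the single-threshold split with the fewest errors."""
    cuts = sorted(set(values))
    best = None
    for lo, hi in zip(cuts, cuts[1:]):
        threshold = (lo + hi) / 2  # candidate threshold between neighbors
        left = [i for i, v in enumerate(values) if v <= threshold]
        right = [i for i, v in enumerate(values) if v > threshold]
        errors = minority_count(left, labels) + minority_count(right, labels)
        if best is None or errors < best[0]:
            best = (errors, frozenset(left))
    return best[1]

spend = [200, 1_500, 3_000, 40_000, 90_000, 250_000]  # hypothetical values
high_value = [0, 0, 0, 1, 1, 1]                       # hypothetical labels

raw_split = best_split_partition(spend, high_value)
scaled_split = best_split_partition([v / 1_000 for v in spend], high_value)
log_split = best_split_partition([math.log10(v) for v in spend], high_value)
```

All three calls group the same customers together, because dividing by a constant or taking a logarithm changes the threshold value but not the order of the rows.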

Why ad agencies care

Why data normalization might matter more in agency work than in most industries.

Agencies evaluating or building models on client data encounter normalization decisions constantly, often implicitly in the preprocessing steps of tools they use. Understanding when normalization is required and what it does helps prevent misapplication and misinterpretation of model outputs.

It affects which features appear important. In a scale-sensitive model trained without normalization, features with large numerical ranges can appear to have high importance even if they are not actually predictive. A feature importance report on un-normalized data may rank a large-scale variable as the primary driver of predictions when its apparent importance reflects its scale, not its predictive content.

Preprocessing choices are modeling decisions. The sequence of normalization, imputation, and transformation applied before training is part of the model, not a neutral technical setup step. Agencies that treat preprocessing as configuration rather than modeling work may apply inappropriate transformations without understanding the downstream effects on model behavior.
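One way to see preprocessing as part of the model: the scaling parameters are learned from the training data and must be reused, unchanged, on anything the model scores later. The sketch below assumes standardization is the chosen transformation; `StandardizerSketch` is a hypothetical class written for illustration, and the spend values are invented.

```python
import statistics

class StandardizerSketch:
    """Learns a mean and standard deviation once, then reuses them."""

    def fit(self, values):
        # Parameters come from the training data only.
        self.mean_ = statistics.fmean(values)
        self.std_ = statistics.pstdev(values)  # assumes std != 0
        return self

    def transform(self, values):
        # New data is scaled with the stored training parameters,
        # never with statistics recomputed from the new data itself.
        return [(v - self.mean_) / self.std_ for v in values]

train_spend = [100, 900, 2_500, 12_000]  # hypothetical training values
new_spend = [5_000]                      # hypothetical customer scored later

scaler = StandardizerSketch().fit(train_spend)
train_scaled = scaler.transform(train_spend)
new_scaled = scaler.transform(new_spend)
```

Refitting the scaler on new data, or fitting it on training and evaluation data together, changes what the same raw number means to the model, which is why the fitted parameters belong with the model rather than with the tooling setup.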

The inverse transformation matters for output interpretation. If input features are normalized before training, the model’s coefficients and feature effects are expressed in normalized units, and if the target was normalized as well, so are its predictions. Presenting results to clients requires knowing which normalization was applied and converting back to the original scale correctly, which is easy to get wrong under deadline pressure.
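Converting back to original units is simple arithmetic once the training statistics are saved. The sketch below assumes both the feature and the target were standardized and that the model's effect is linear; all numbers and function names are hypothetical.

```python
# Hypothetical statistics saved from the training data.
age_std = 12.0         # standard deviation of age, in years
value_mean = 3_000.0   # mean of the target (customer value), in dollars
value_std = 500.0      # standard deviation of the target, in dollars

def unscale_prediction(z, mean, std):
    """Undo standardization: map a standardized value back to original units."""
    return z * std + mean

def effect_per_original_unit(effect_per_sd, feature_std):
    """Turn an effect per one standard deviation into an effect per raw unit."""
    return effect_per_sd / feature_std

# A standardized prediction of 1.5 corresponds to 3,750 dollars.
predicted_value = unscale_prediction(1.5, value_mean, value_std)

# An effect of 1,200 dollars per standard deviation of age
# corresponds to 100 dollars per year of age.
dollars_per_year = effect_per_original_unit(1_200.0, age_std)
```

Reporting "1.5" or "1,200 per standard deviation" to a client without this conversion is the kind of error the paragraph above warns about.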

In practice

What data normalization looks like inside a working ad agency.

An agency trains a customer value prediction model, and the feature importance report shows “customer ID number” as the second most important predictor. Investigation reveals that no normalization was applied: customer IDs, which range from 1 to 500,000 because they have been assigned sequentially since launch, are dominating the scale-sensitive model through sheer magnitude. Because IDs follow signup order, the model is also learning customer recency from the ID number, which introduces data leakage. The fix requires both normalizing the remaining features and removing the ID from the feature set. Both problems were invisible until someone asked why a database artifact was appearing in the importance report.

Build the preprocessing judgment that keeps your models honest through The Creative Cadence Workshop.

The generative AI foundations module of the workshop covers how today’s models work, what they require from data, and how to choose between them for the real-world data realities agencies face with clients.
