Weighted Average.

An average in which each value is multiplied by a weight that reflects its relative importance before the values are summed and divided by the total weight, producing a result that gives more influence to higher-weighted observations. Weighted averages are ubiquitous in marketing analytics: time-decay attribution applies weights that diminish for older touchpoints, ensemble models weight component predictions by their accuracy, and blended performance metrics combine channel results weighted by spend or impression share.

Also known as weighted mean, importance-weighted average. Closely related to weighted sum, which is the same calculation without the division by total weight.

What it is

A working definition of weighted average.

A simple average treats all values as equally important: sum all values and divide by the count. A weighted average allows different values to contribute differently to the result by assigning a weight to each value. The weighted average is computed as the sum of each value multiplied by its weight, divided by the sum of all weights. If all weights are equal, the weighted average equals the simple average. If weights differ, values with higher weights pull the result toward their values more strongly than values with lower weights.
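The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `weighted_average` and the sample numbers are assumptions chosen for the example.

```python
from typing import Sequence


def weighted_average(values: Sequence[float], weights: Sequence[float]) -> float:
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


values = [2.0, 4.0, 9.0]  # illustrative numbers only

# Equal weights reproduce the simple average: (2 + 4 + 9) / 3
equal = weighted_average(values, [1, 1, 1])

# A heavy weight on the last value pulls the result toward 9: (2 + 4 + 72) / 10
skewed = weighted_average(values, [1, 1, 8])

print(equal, skewed)
```

With equal weights the result matches the simple average of 5.0; with the third value weighted eight times as heavily, the result moves to 7.8.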

The choice of weights encodes a belief about relative importance. In time-decay attribution, touchpoints that occurred more recently before conversion are assigned higher weights than older touchpoints, reflecting the assumption that recent interactions had more influence on the conversion decision. In media mix modeling, channel performance estimates are weighted by confidence in the coefficient estimate, giving more influence to channel measurements derived from large samples with high spend variation. In ensemble modeling, component model predictions are weighted by their individual accuracy on the validation set, so more accurate models contribute more to the final prediction.

Weighted averages also appear in sequence modeling through attention mechanisms, which are a generalized form of weighted average: the transformer’s attention operation computes a weighted average of value vectors, where the weights (attention weights) are computed from query-key similarity and normalized so they sum to one, rather than specified in advance. This connection between attention and weighted averaging shows that the mechanism underlying modern language models is a direct generalization of the statistical concept, with the weights determined dynamically from the input rather than set by a fixed rule.
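To make the link to attention concrete, the sketch below implements single-query scaled dot-product attention in plain Python. It is a simplified illustration: the vectors are made-up toy values, the function names `softmax` and `attend` are chosen for this example, and a real transformer would first apply learned projection matrices to produce the queries, keys, and values.

```python
import math


def softmax(scores):
    """Turn raw scores into non-negative weights that sum to one."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Weighted average of value vectors, weighted by query-key similarity."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
    return weights, output


# Toy inputs (assumed for illustration)
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

weights, output = attend(query, keys, values)
print(weights, output)
```

The key most similar to the query receives the largest weight, so its value vector contributes most to the output. Changing the query changes the weights, which is the sense in which attention weights are determined dynamically from the input.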

Why ad agencies care

Why weighted average choices determine the conclusions drawn from attribution models, ensemble predictions, and blended performance metrics.

A working ad agency that builds attribution models, blends performance metrics across channels, or evaluates ensemble AI system outputs is making weighted average decisions constantly. The choice of weights is where analytical judgment enters quantitative systems: it encodes assumptions about what matters more and what matters less, and those assumptions determine the strategic conclusions that follow. Understanding when weighted averages are applied and what the weights assume enables practitioners to evaluate whether the results are based on defensible choices or arbitrary defaults.

Time-decay attribution weights determine how much credit is redistributed from final touchpoints to earlier touchpoints, directly affecting channel budget recommendations. A time-decay attribution model with a half-life of 7 days assigns a touchpoint 7 days before conversion half the credit of the converting touchpoint; a touchpoint 14 days before receives one quarter. A half-life of 1 day concentrates credit heavily on the last 1 to 2 touchpoints and closely resembles last-touch attribution. A half-life of 30 days distributes credit much more evenly across a long journey. The choice of decay rate is a modeling assumption that has major consequences for which channels appear to be the most valuable and how much their budgets should grow or shrink. Agencies should present time-decay attribution results alongside the specific decay parameter used and sensitivity analysis showing how the channel credit distribution changes as the decay rate varies.
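The half-life arithmetic above can be sketched as follows. The exponential form `0.5 ** (days / half_life)` is the standard way to express a half-life; the three-touchpoint journey and the function names are assumptions for illustration and do not describe any particular attribution platform.

```python
def time_decay_weights(days_before_conversion, half_life_days):
    """Raw weight halves for every half_life_days of distance from conversion."""
    return [0.5 ** (d / half_life_days) for d in days_before_conversion]


def credit_shares(days_before_conversion, half_life_days):
    """Normalize the raw weights so the credit shares sum to one."""
    raw = time_decay_weights(days_before_conversion, half_life_days)
    total = sum(raw)
    return [w / total for w in raw]


# Hypothetical journey: touchpoints 14, 7, and 0 days before conversion
journey_days = [14, 7, 0]

# With a 7-day half-life the raw weights are one quarter, one half, and one
raw_7 = time_decay_weights(journey_days, 7)

# Simple sensitivity check: how the credit split moves with the half-life
shares_by_half_life = {h: credit_shares(journey_days, h) for h in (1, 7, 30)}
for half_life, shares in shares_by_half_life.items():
    print(half_life, [round(s, 3) for s in shares])
```

Running the loop for several half-lives is a small version of the sensitivity analysis recommended above: a 1-day half-life gives nearly all credit to the final touchpoint, while a 30-day half-life spreads it far more evenly.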

Impression-weighted versus spend-weighted blended CPM calculations produce meaningfully different benchmarks that affect campaign evaluation. A blended CPM across multiple channels is a weighted average of channel-specific CPMs. Weighting by spend (how many dollars were spent in each channel) emphasizes channels where investment is largest. Weighting by impressions emphasizes channels that delivered the most impressions, and it reproduces the figure obtained by dividing total spend by total impressions and multiplying by 1,000. For a client with heavy TV investment and lighter digital investment that nonetheless delivers most of the impressions, spend-weighting produces a blended CPM dominated by TV rates while impression-weighting produces a blended CPM closer to the digital CPM. Stating which weighting was used whenever a blended metric is reported prevents readers from assuming a weighting that was not actually applied.
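The difference between the two weightings can be seen in a short sketch. The spend and impression figures below are hypothetical, chosen only to mirror the heavy-TV, lighter-digital scenario; they are not client data.

```python
def weighted_average(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical channel data (illustration only)
channels = {
    "tv": {"spend": 80_000.0, "impressions": 4_000_000},
    "digital": {"spend": 20_000.0, "impressions": 10_000_000},
}

spend = [c["spend"] for c in channels.values()]
impressions = [c["impressions"] for c in channels.values()]

# Channel CPM = cost per thousand impressions
cpms = [s / i * 1000 for s, i in zip(spend, impressions)]

spend_weighted_cpm = weighted_average(cpms, spend)
impression_weighted_cpm = weighted_average(cpms, impressions)

# Total spend over total impressions, per thousand
overall_cpm = sum(spend) / sum(impressions) * 1000

print(cpms, spend_weighted_cpm, impression_weighted_cpm, overall_cpm)
```

Under these assumed numbers the spend-weighted figure lands near the TV CPM, while the impression-weighted figure sits closer to the digital CPM and matches total spend divided by total impressions. Which one is appropriate depends on the question being asked, which is why the weighting should be labeled.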

Ensemble model predictions weighted by component accuracy can outperform simple averaging when individual model performance varies substantially across segments. An ensemble churn model that combines a gradient boosted tree, a logistic regression, and a neural network using equal weights dilutes the strengths of each component. If the gradient boosted tree substantially outperforms the others on high-tenure customers while the neural network is more accurate on new customers, equal weighting underutilizes the best model for each segment. Segment-specific weighted averaging, with weights calibrated to each model’s validation accuracy in each segment, captures these performance differences and produces a blended prediction that gives the most accurate model the highest weight in each input context.
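A minimal sketch of segment-specific weighting follows. The validation accuracies, segment names, and predicted probabilities are all hypothetical, and weighting each model in proportion to its accuracy is only one simple calibration choice among several.

```python
# Hypothetical validation accuracy for each model in each segment
validation_accuracy = {
    "high_tenure": {"gbt": 0.90, "logreg": 0.70, "nn": 0.72},
    "new_customer": {"gbt": 0.68, "logreg": 0.66, "nn": 0.86},
}


def segment_weights(segment):
    """Weights proportional to validation accuracy, normalized to sum to one."""
    accuracy = validation_accuracy[segment]
    total = sum(accuracy.values())
    return {model: a / total for model, a in accuracy.items()}


def blend(predictions, segment):
    """Weighted average of the component churn probabilities for one customer."""
    weights = segment_weights(segment)
    return sum(weights[model] * p for model, p in predictions.items())


# Hypothetical churn probabilities from the three component models
predictions = {"gbt": 0.80, "logreg": 0.50, "nn": 0.30}

equal_blend = sum(predictions.values()) / len(predictions)
high_tenure_blend = blend(predictions, "high_tenure")
new_customer_blend = blend(predictions, "new_customer")

print(equal_blend, high_tenure_blend, new_customer_blend)
```

For the same three component predictions, the blend leans toward the gradient boosted tree when the customer is high-tenure and toward the neural network when the customer is new, whereas equal weighting returns the same figure in both cases.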

In practice

What weighted average looks like inside a working ad agency.

An agency is developing a blended ROAS metric for a multichannel retail client that runs media across paid search, paid social, display, email, and affiliate channels. The client’s current reporting uses a simple average of channel ROAS figures, producing a blended ROAS of 3.8x that the client uses as the primary performance benchmark for quarterly business reviews. The agency identifies that the simple average is misleading because channel ROAS figures have very different underlying spend volumes: paid search accounts for 61% of total spend, email accounts for 8%, and affiliate accounts for 4%. The simple average gives equal weight to email’s ROAS of 14.2x and paid search’s ROAS of 3.1x even though email spend is roughly one-eighth of paid search spend.

The spend-weighted blended ROAS is 3.4x, substantially below the simple-average 3.8x, because the high-ROAS low-spend channels receive less weight when their actual spend proportions are applied. The agency also computes an impression-weighted blended ROAS of 2.9x, where display’s large impression volume and moderate ROAS pull the blended figure down. All three blended metrics are valid but measure different things: the spend-weighted figure represents the efficiency of the total investment portfolio, the impression-weighted figure represents the efficiency of reach, and the simple average represents the central tendency of channel-specific efficiency regardless of scale.

The agency presents all three figures in the quarterly review with clear labeling, recommends the spend-weighted ROAS as the primary portfolio efficiency metric, and notes that the gap between the simple average (3.8x) and the spend-weighted average (3.4x) has historically caused the client to overestimate portfolio efficiency by about 12%.
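The reason the spend-weighted figure represents portfolio efficiency can be shown with a short derivation. The symbols are notation introduced here for illustration: s_i is the spend in channel i, r_i is the revenue attributed to channel i, and ROAS_i = r_i / s_i.

```latex
\[
\mathrm{ROAS}_{\text{spend-weighted}}
  = \frac{\sum_i s_i \,\mathrm{ROAS}_i}{\sum_i s_i}
  = \frac{\sum_i s_i \,\frac{r_i}{s_i}}{\sum_i s_i}
  = \frac{\sum_i r_i}{\sum_i s_i}
\]

\[
\text{Relative overstatement from using the simple average}
  = \frac{3.8 - 3.4}{3.4} \approx 0.118 \approx 12\%
\]
```

The first line shows that the spend-weighted average of channel ROAS is the same as total revenue divided by total spend. The second line reproduces the 12% gap from the two blended figures quoted in the example.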

Build the quantitative analytics foundations that reveal when weighted average choices drive strategic conclusions in marketing measurement through The Creative Cadence Workshop.

The generative AI foundations module covers weighted averages including time-decay weighting, ensemble prediction weighting, attention as weighted averaging, and how weight choice in blended metrics and attribution models determines the strategic recommendations that follow.