A supervised machine learning task and family of statistical methods that predict a continuous numerical output from input features. Regression models estimate relationships such as how changes in media spend predict changes in conversion volume, how product attributes predict price, and how engagement signals predict lifetime value, making them foundational to media mix modeling, forecasting, and pricing analytics.
Also known as regression analysis, linear regression, predictive regression
Regression predicts a continuous numerical target variable from one or more input features. The simplest form, linear regression, models the target as a weighted sum of the input features plus a bias term: y = w1*x1 + w2*x2 + … + wn*xn + b. The weights are estimated from training data by minimizing the sum of squared errors between the model’s predictions and the observed target values. These weights are the model’s learned coefficients, and they directly quantify the estimated relationship between each feature and the target: a coefficient of 0.8 on weekly TV spend means the model estimates that a one-unit increase in TV spend is associated with a 0.8-unit increase in the target (holding all other variables constant).
The extension from linear regression to more powerful regression models follows a spectrum of increasing flexibility. Polynomial regression adds squared and higher-order terms to capture curved relationships. Ridge and Lasso regression add regularization penalties that shrink coefficients toward zero, reducing overfitting when many features are present relative to the number of observations. Gradient boosted trees and neural networks perform nonlinear regression that can capture complex interaction effects that linear models cannot represent. Each step up the complexity spectrum requires more data to train reliably and sacrifices interpretability for flexibility.
Regression is distinct from classification in its output: regression predicts a continuous value while classification predicts a category. The distinction matters for model selection, evaluation metrics, and interpretation. Regression models are evaluated using mean squared error, root mean squared error, mean absolute error, and R-squared. Classification models are evaluated using accuracy, precision, recall, and AUC. Some problems can be framed as either regression (predict exact conversion probability) or classification (predict high or low conversion category), and the choice affects how uncertainty is represented and how the model’s outputs are used in downstream decisions.
A working ad agency that builds or interprets media mix models, sales forecasts, or response curve analyses is working with regression at every turn. Media mix modeling is multivariate regression that estimates channel response coefficients. Sales forecasting is time-series regression that estimates trend, seasonality, and promotional lift. Attribution modeling is regression that distributes credit among touchpoints by estimating their independent contributions to conversion. Understanding regression at the conceptual level is not optional for agency practitioners who want to make informed decisions with these tools.
Regression coefficients in media mix models are the quantitative foundation for budget allocation recommendations. The coefficient on TV spend in a media mix regression model estimates how many additional conversions result from an additional dollar of TV investment, holding all other channels constant. This marginal effect is the response curve slope that budget optimization algorithms use to determine where the next dollar should be allocated. Coefficient reliability depends on variation in the historical spend data (channels with little variation in spend produce unreliable coefficient estimates), absence of multicollinearity, and sample size relative to the number of predictors. Agencies should report confidence intervals around media mix coefficients, not just point estimates, to communicate the uncertainty in the attribution estimates that drives the budget recommendation.
Log transformation of spend and response variables in media mix regression models captures diminishing returns. Marketing response functions typically exhibit diminishing returns: the first dollar of TV spend produces more incremental response than the millionth. Linear regression on raw spend values imposes a constant marginal return assumption that does not match this reality. Log-log regression, which transforms both the spend and response variables to log scale, produces coefficients that represent elasticities (percentage change in response per percentage change in spend) and automatically captures the diminishing returns structure of typical marketing response curves. Understanding this modeling choice is essential for correctly interpreting media mix coefficients and the budget optimizations built on them.
Regression to the mean in creative performance data distorts test-and-learn conclusions when tests are designed poorly. When an agency runs a test by identifying the best-performing ads from a prior period and comparing them to a challenger set, the prior-period top performers will tend to revert toward average performance in the test period due to regression to the mean: observed extreme outcomes reflect both genuine quality and random favorable noise, and the noise component is not repeated in the test period. Tests that compare a “historically best” creative against challengers will show the challenger performing better or equal not because the challenger is genuinely superior but because the selection criterion inflated the historical best’s apparent quality. Regression to the mean is avoided by using pre-defined selection criteria that do not involve the performance variable being tested.
An agency is building a sales forecasting model for a consumer goods client to predict weekly unit sales 4 weeks in advance, enabling proactive inventory and promotion planning. The training dataset contains 3 years of weekly sales data with features including current retail price, promotional status (binary), competitor promotional status (binary), temperature (for weather-sensitive SKUs), media spend by channel (TV, digital, in-store), and seasonal indicators. The agency trains a multiple regression model with log-transformed sales as the target. Initial model diagnostics reveal two issues: high multicollinearity between TV spend and digital spend, because the client historically scales both channels together in proportion, meaning the model cannot estimate their independent effects; and a non-linear relationship between temperature and sales for the seasonal SKU category that linear regression cannot capture. The agency addresses multicollinearity by replacing the separate channel spend variables with a total media spend variable plus a TV/digital mix ratio, which captures the same information with lower correlation between predictors. The non-linear temperature effect is captured by adding a squared temperature term, allowing the model to fit a curved relationship. The corrected model achieves R-squared of 0.84 on a held-out validation set of the most recent 6 months, predicting weekly sales within 12% mean absolute error. The 4-week-ahead forecast produced by the model is incorporated into the client’s inventory planning system, and the promotional lift coefficients from the regression are used to evaluate the ROI of proposed promotional periods before they are approved.
The generative AI foundations module covers regression analysis comprehensively including linear and nonlinear regression, coefficient interpretation, multicollinearity diagnostics, and how regression models underlie the media mix, forecasting, and attribution tools agencies use for client planning.