Fairness Metrics.

Quantitative measures used to evaluate whether an AI system produces equitable outcomes across different demographic groups, including metrics like demographic parity, equalized odds, and calibration. For agencies, fairness metrics are how you detect and document whether your targeting, scoring, and personalization models treat different populations differently in ways that create legal, reputational, or ethical exposure.

Also known as algorithmic fairness measures, bias metrics, equity metrics

What it is

A working definition of fairness metrics.

Fairness metrics formalize the question “does this model treat different groups equitably?” into measurable quantities that can be computed from model outputs and demographic data. Different metrics capture different notions of fairness, and several of these notions are mathematically incompatible: outside of special cases, such as equal base rates of the outcome across groups or a model that predicts perfectly, it is provably impossible for a model to satisfy all common fairness definitions simultaneously. Practitioners must choose which fairness criterion matters most for their specific application, and that choice is a value judgment with real consequences.
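The incompatibility can be seen with a little confusion-matrix arithmetic. The sketch below is illustrative only: it uses positive predictive value (PPV) as a thresholded stand-in for calibration, and the base rates and target values are made-up numbers. If two groups share the same PPV and the same false negative rate but have different base rates, the false positive rate they must have is forced to differ.

```python
def implied_fpr(base_rate, ppv, fnr):
    """False positive rate implied by a group's base rate, PPV, and FNR.

    Uses the confusion-matrix identity
        FPR = p / (1 - p) * (1 - PPV) / PPV * (1 - FNR)
    where p is the share of the group with a positive actual outcome.
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)


# Hypothetical groups: identical PPV and FNR, different base rates.
fpr_a = implied_fpr(base_rate=0.30, ppv=0.70, fnr=0.20)  # about 0.147
fpr_b = implied_fpr(base_rate=0.10, ppv=0.70, fnr=0.20)  # about 0.038

print(f"group A FPR: {fpr_a:.3f}, group B FPR: {fpr_b:.3f}")
```

Holding PPV and the false negative rate equal leaves the false positive rates unequal, so this hypothetical model cannot also satisfy equalized odds.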

Demographic parity requires that the proportion of positive predictions be equal across groups, regardless of actual outcome rates. A hiring model with demographic parity approves applicants at equal rates by group whether or not the underlying qualification rates differ. Equalized odds requires that both the true positive rate and false positive rate be equal across groups: the model must be equally accurate in both directions for all groups. Calibration requires that predicted probabilities reflect actual outcome rates within each group: if the model predicts 70% probability of conversion for a segment, roughly 70% of people in that segment should actually convert regardless of group membership.
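As a minimal sketch, the three definitions above can be computed from a table of scored records. The record layout, the 0.5 decision threshold, the function name, and the sample data are all assumptions made for illustration. The calibration check here is deliberately coarse: it only compares each group's mean predicted score with its observed outcome rate.

```python
from collections import defaultdict


def group_fairness_report(records, threshold=0.5):
    """records: iterable of (group, score, actual), with actual in {0, 1}."""
    stats = defaultdict(lambda: {"n": 0, "pred_pos": 0, "tp": 0, "fp": 0,
                                 "pos": 0, "neg": 0, "score_sum": 0.0})
    for group, score, actual in records:
        s = stats[group]
        predicted = score >= threshold
        s["n"] += 1
        s["score_sum"] += score
        s["pred_pos"] += predicted
        if actual:
            s["pos"] += 1
            s["tp"] += predicted
        else:
            s["neg"] += 1
            s["fp"] += predicted

    report = {}
    for group, s in stats.items():
        report[group] = {
            # Demographic parity compares this value across groups.
            "selection_rate": s["pred_pos"] / s["n"],
            # Equalized odds compares both of these across groups.
            "tpr": s["tp"] / s["pos"] if s["pos"] else float("nan"),
            "fpr": s["fp"] / s["neg"] if s["neg"] else float("nan"),
            # Coarse calibration check: these two should be close.
            "mean_score": s["score_sum"] / s["n"],
            "observed_rate": s["pos"] / s["n"],
        }
    return report


# Synthetic example data, not real campaign results.
records = [
    ("A", 0.9, 1), ("A", 0.8, 1), ("A", 0.4, 0), ("A", 0.2, 0),
    ("B", 0.7, 1), ("B", 0.3, 1), ("B", 0.6, 0), ("B", 0.1, 0),
]
report = group_fairness_report(records)
for group, metrics in sorted(report.items()):
    print(group, metrics)
```

In this synthetic data both groups are selected at the same rate, so demographic parity holds, yet the true positive and false positive rates differ, so equalized odds does not. Satisfying one criterion says little about the others.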

These distinctions matter operationally because different fairness criteria produce different model behaviors. Optimizing for demographic parity can require predicting positive outcomes for some individuals the model believes are unlikely to convert, which conflicts with accuracy objectives. Optimizing for equalized odds typically requires group-specific decision thresholds or a constrained training procedure, which introduces its own complexity and requires access to group membership data. There is no fairness criterion that is right for every application, and agencies using AI systems that affect individuals differently by group need to be explicit about which criterion governs each system and why.
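The tension between demographic parity and accuracy shows up as soon as thresholds are set per group. The following sketch assumes hypothetical score lists and a hypothetical 40% target selection rate; the helper name is invented for illustration.

```python
def threshold_for_rate(scores, target_rate):
    """Lowest score that must be accepted to select target_rate of a group.

    Tied scores at the cutoff can push the realized rate above the target.
    """
    ranked = sorted(scores, reverse=True)
    k = max(1, round(target_rate * len(ranked)))
    return ranked[k - 1]


# Synthetic model scores for two groups (illustrative only).
scores_by_group = {
    "A": [0.9, 0.8, 0.7, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1],
    "B": [0.6, 0.4, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1],
}
target = 0.4

thresholds = {
    group: threshold_for_rate(scores, target)
    for group, scores in scores_by_group.items()
}
print(thresholds)
```

With these numbers, a single threshold of 0.5 would select four of ten people in group A and one of ten in group B. Equalizing the selection rate at 40% means accepting group B members with scores as low as 0.3, which is the accuracy cost the paragraph above describes.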

Why ad agencies care

Why fairness metrics matter more in agency work than in most industries.

Agency AI systems routinely make decisions that affect individuals differently based on their demographic characteristics: who sees which ad, who gets scored as a high-value lead, who receives personalized offers. A working ad agency that does not measure the fairness properties of these systems does not know whether they are treating different groups equitably. Discovering disparity after a campaign has run, or after a regulator asks, is a substantially worse outcome than catching it before deployment.

Targeting models learn from historical data, and that data may encode historical bias. A lookalike audience model trained on a client’s existing customer base learns to find people who resemble that base. If the existing customer base reflects historical targeting decisions that underserved certain demographic groups, the lookalike model will perpetuate and potentially amplify those patterns. Fairness metrics surface this pattern before it becomes a campaign decision. Without measurement, the bias is invisible.

Lead scoring models can discriminate without using protected attributes directly. Zip code, purchase history, and device type can serve as proxies for race, income, or age, and a model can learn those associations without being explicitly programmed to. A lead scoring system trained on outcomes from a market where access was historically unequal will reflect that inequality in its scores. Agencies building or deploying lead scoring for clients in regulated verticals, including financial services, insurance, and employment, have legal exposure when this kind of proxy discrimination is present, regardless of intent.
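One simple way to screen a feature for proxy behavior is to ask how well it predicts group membership on its own. The sketch below assumes an audit sample in which group labels are available, which is not always the case in practice; the feature values, group labels, and function name are hypothetical.

```python
from collections import Counter, defaultdict


def proxy_strength(rows):
    """rows: iterable of (feature_value, group).

    Returns (baseline, proxy_accuracy): the accuracy of always guessing the
    largest group, and the accuracy of guessing the majority group within
    each feature value. A large gap suggests the feature acts as a proxy.
    """
    rows = list(rows)
    overall = Counter(group for _, group in rows)
    baseline = overall.most_common(1)[0][1] / len(rows)

    by_value = defaultdict(Counter)
    for value, group in rows:
        by_value[value][group] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return baseline, correct / len(rows)


# Synthetic audit sample: two zip codes, two groups (illustrative only).
rows = (
    [("zip_1", "X")] * 8 + [("zip_1", "Y")] * 2
    + [("zip_2", "X")] * 1 + [("zip_2", "Y")] * 9
)
baseline, proxy_accuracy = proxy_strength(rows)
print(f"baseline: {baseline:.2f}, using zip code: {proxy_accuracy:.2f}")
```

Here zip code alone raises the accuracy of guessing group membership from 0.55 to 0.85, so a model given zip code has access to much of the information the protected attribute would have provided.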

Regulatory scrutiny is expanding. The EU AI Act classifies AI systems used in areas such as creditworthiness evaluation and employment decisions as high-risk, and it attaches data governance, bias examination, and documentation obligations to those systems. State-level AI bias legislation in the US is advancing across multiple jurisdictions. Agencies that build fairness measurement into their standard model evaluation practice now are ahead of requirements that are likely to become mandatory; agencies that do not are accumulating technical debt that will be expensive to remediate under time pressure.

In practice

What fairness metrics look like inside a working ad agency.

An agency builds a propensity-to-convert model for a financial services client that scores consumer leads for a home equity product. Before deployment, the agency runs a fairness audit using zip code as a proxy for racial composition, grouping scored leads into majority-minority and majority-white geographic areas. The false negative rate, the rate at which genuinely qualified leads are scored as low-priority, is 18% in majority-minority areas and 9% in majority-white areas, a gap of 9 percentage points. The agency reports the disparity to the client and identifies that the model learned the pattern from historical approval data that reflected prior underwriting bias. It then collects supplementary features that are more directly predictive of creditworthiness and retrains with equalized odds as a training constraint. The retrained model reduces the false negative rate disparity to 3 percentage points, which the client’s compliance team documents as acceptable under their fair lending policy.
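The audit step in this scenario reduces to a per-group false negative rate and a gap check. The sketch below reproduces the 18% and 9% rates from the example using made-up lead counts; the group names, helper functions, and the 3-point tolerance used as a pass/fail line are assumptions for illustration, not a legal standard.

```python
def false_negative_rate(leads):
    """leads: list of (is_qualified, scored_high) pairs for one group."""
    scored = [scored_high for is_qualified, scored_high in leads if is_qualified]
    if not scored:
        raise ValueError("group has no qualified leads")
    return scored.count(False) / len(scored)


def synthetic_group(n_qualified, n_missed):
    """Hypothetical helper: qualified leads only, n_missed of them scored low."""
    return [(True, False)] * n_missed + [(True, True)] * (n_qualified - n_missed)


# Made-up counts chosen to match the rates in the scenario.
groups = {
    "majority_minority": synthetic_group(200, 36),  # 36 / 200 = 18%
    "majority_white": synthetic_group(200, 18),     # 18 / 200 = 9%
}

fnr = {name: false_negative_rate(leads) for name, leads in groups.items()}
gap_points = round(abs(fnr["majority_minority"] - fnr["majority_white"]) * 100, 1)

MAX_GAP_POINTS = 3.0  # assumed tolerance, set by the client's compliance team
passes_audit = gap_points <= MAX_GAP_POINTS

print(fnr, gap_points, passes_audit)
```

Run against the original model's rates, the check fails with a 9-point gap. The same check run after retraining would be the evidence the compliance team files.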
