AI Glossary · Letter F

Feature.

An individual measurable property or characteristic of an observation that is used as an input to a machine learning model. For agencies, features are what get built from raw client data to power predictive models: the specific signals the model uses to make its decisions, and the primary lever through which domain expertise is encoded into a machine learning system.

Also known as input variable, predictor, model input

What it is

A working definition of a feature.

A feature is any property of an observation that can be measured and represented numerically for use as model input. In a customer churn prediction model, features might include the number of days since the customer’s last purchase, the total number of orders placed in the past 90 days, the average order value, the number of support tickets opened in the past 30 days, and whether the customer has engaged with any email in the past two weeks. Each of these is a feature: a specific, measurable signal that the model uses to distinguish churning customers from retained ones.

Features can be derived from raw data through transformation. A raw timestamp becomes multiple features: day of week, hour of day, days since a reference event, and whether the timestamp falls within a sale period. A raw text field becomes features through natural language processing: sentiment score, topic category, entity mentions, and word count. Raw behavioral event logs become features through aggregation: count of actions in the past 7 days, time between first and second purchase, and ratio of product page views to cart additions. This process of constructing informative features from raw data is called feature engineering and is typically where the most experienced practitioners spend the most time.
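The transformations described above can be sketched in a few lines of Python. This is a minimal illustration rather than a production pipeline: the function names, the reference date, and the sale window are all hypothetical, and a real project would read them from the client’s own data and calendar.

```python
from datetime import date, datetime


def timestamp_features(ts, reference, sale_start, sale_end):
    """Expand one raw timestamp into several model-ready features."""
    return {
        "day_of_week": ts.weekday(),  # 0 = Monday ... 6 = Sunday
        "hour_of_day": ts.hour,
        "days_since_reference": (ts - reference).days,
        "in_sale_period": sale_start <= ts.date() <= sale_end,
    }


def events_in_last_n_days(event_times, as_of, n_days=7):
    """Aggregate a raw event log into a count over a trailing window."""
    return sum(1 for t in event_times if 0 <= (as_of - t).days < n_days)


# Hypothetical values, chosen only for illustration.
ts = datetime(2024, 11, 29, 14, 30)
features = timestamp_features(
    ts,
    reference=datetime(2024, 11, 1),
    sale_start=date(2024, 11, 25),
    sale_end=date(2024, 12, 2),
)
recent_actions = events_in_last_n_days(
    [datetime(2024, 11, 28), datetime(2024, 11, 25), datetime(2024, 11, 10)],
    as_of=ts,
)
```

One timestamp yields four features, and a list of raw events collapses into a single count. The same pattern extends to ratios (for example, product page views divided by cart additions) once the underlying counts exist.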

For a given amount of training data, feature quality is often the largest single determinant of model quality. With the model architecture and training procedure held fixed, a model trained on poorly chosen features will generally underperform one trained on well-chosen features. This is why machine learning practitioners often advise improving a model’s features before trying a more complex architecture: a model cannot recover signal that its inputs do not contain.

Why ad agencies care

Why features matter more in agency work than in most industries.

Every AI tool a working ad agency uses, whether it is a lead scoring model, a personalization engine, or a creative performance predictor, is fundamentally a function that maps input features to output predictions. Understanding what the features are, where they come from, and what they are measuring is the foundation for evaluating whether the tool is fit for purpose and diagnosing why it fails when it does.

Feature availability determines model feasibility before a single line of code is written. When a client asks whether their data can support a predictive churn model, the answer depends first on which features are available in the client’s data systems: do they have the behavioral signals that distinguish churning customers from retained ones? A data audit that catalogs available features against the feature requirements of the proposed model is the most important step in scoping any AI project, and it is a step that agencies often skip.
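At its simplest, the audit described above is a comparison between two lists. The sketch below assumes the audit has already produced, for each available field, the fraction of customer records with a usable value. The feature names, coverage figures, and the 0.8 threshold are hypothetical placeholders, not recommendations.

```python
def audit_features(required, available_coverage, min_coverage=0.8):
    """Sort required features into ready, low-coverage, and missing.

    available_coverage maps a feature name to the fraction of records
    (0.0 to 1.0) that have a usable value for it.
    """
    report = {"ready": [], "low_coverage": [], "missing": []}
    for name in required:
        coverage = available_coverage.get(name)
        if coverage is None:
            report["missing"].append(name)
        elif coverage < min_coverage:
            report["low_coverage"].append(name)
        else:
            report["ready"].append(name)
    return report


# Hypothetical requirements for a churn model and hypothetical audit results.
required = [
    "days_since_last_purchase",
    "orders_last_90_days",
    "support_tickets_last_30_days",
    "email_engaged_last_14_days",
]
available_coverage = {
    "days_since_last_purchase": 0.99,
    "orders_last_90_days": 0.97,
    "email_engaged_last_14_days": 0.55,
}
report = audit_features(required, available_coverage)
```

Anything that lands in the missing or low-coverage lists is a scoping conversation to have with the client before the build starts.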

Feature staleness is a leading cause of model degradation. Features derived from behavioral data become stale as customer behavior changes. A feature capturing email engagement behavior becomes less predictive when the email marketing cadence changes. A feature derived from CRM data becomes unreliable when a CRM migration changes how records are structured. Agencies managing long-running predictive models need to monitor whether the features feeding those models are still being computed correctly and still carry the predictive signal they did when the model was trained.
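One common way to monitor this kind of drift is the population stability index (PSI), which compares a feature’s distribution at training time with its distribution today. The sketch below is one possible implementation, not a prescribed method: it assumes equal-width bins taken from the training range, and the sample values are invented for illustration. A commonly cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but any threshold should be validated per feature.

```python
import math


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a training-time sample (expected) and a recent sample (actual)."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all training values are equal

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    e, a = bin_shares(expected), bin_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Invented samples: a feature's values at training time and after a large shift.
training_values = list(range(100))
shifted_values = [v + 50 for v in training_values]

psi_unchanged = population_stability_index(training_values, training_values)
psi_shifted = population_stability_index(training_values, shifted_values)
```

A PSI check only detects that a feature’s distribution has moved. It does not say whether the feature is still predictive, so it complements periodic re-evaluation of the model rather than replacing it.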

Knowing which features drive predictions enables better client conversations. Feature importance analysis reveals which signals a model is relying on most heavily. When those signals are interpretable, such as “days since last purchase” and “number of support tickets,” the model’s behavior becomes explainable to clients in terms they understand. When those signals, such as zip code or device type, can act as proxies for protected characteristics, feature importance analysis is how the problem gets discovered before it becomes a regulatory issue.
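Permutation importance is one widely used form of feature importance analysis: shuffle one feature’s values across rows and measure how much the model’s accuracy drops. The sketch below assumes a deliberately trivial, hypothetical model and made-up rows. In practice the model would be the trained system under review and the rows a held-out evaluation set.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, feature, n_repeats=20, seed=0):
    """Average drop in accuracy when one feature's values are shuffled across rows."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return sum(drops) / n_repeats


def toy_churn_model(row):
    """Hypothetical stand-in for a trained model: flags long-inactive customers."""
    return row["days_since_last_purchase"] > 60


# Made-up evaluation rows; zip_code is present but unused by the toy model.
rows = [
    {"days_since_last_purchase": d, "zip_code": z}
    for d, z in [(5, "10001"), (12, "10002"), (30, "10001"), (45, "10003"),
                 (70, "10002"), (90, "10001"), (120, "10003"), (200, "10002")]
]
labels = [toy_churn_model(r) for r in rows]

recency_importance = permutation_importance(
    toy_churn_model, rows, labels, "days_since_last_purchase"
)
zip_importance = permutation_importance(toy_churn_model, rows, labels, "zip_code")
```

Here the unused zip code scores zero because shuffling it cannot change any prediction. If a real model showed a large score for zip code or device type, that would be the prompt to investigate whether the feature is standing in for a protected characteristic.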

In practice

What feature work looks like inside a working ad agency.

An agency is scoping a customer lifetime value prediction model for a specialty retailer. The initial project proposal assumes the model will use purchase history, browsing behavior, and email engagement as features. A data audit reveals that browsing behavior data was collected inconsistently across the client’s two website platforms during a migration 14 months ago, which makes it unreliable for a period that would make up half the training set. Email engagement data exists but is stored in a format that requires significant transformation before it can be used as model features. The data audit adds three weeks to the project timeline and results in a revised feature set that excludes the unreliable browsing data, uses a simpler email engagement feature that can be computed reliably from the available data, and adds three purchase pattern features that the audit, drawing on domain knowledge, flagged as likely to be highly predictive. The model built on the revised feature set achieves a mean absolute error that is 22% lower than that of the initial prototype built without the data audit.
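For readers unfamiliar with the metric, the comparison at the end of this example can be sketched as follows. Mean absolute error is the average size of the gap between predicted and actual values, and the relative reduction expresses one model’s error as an improvement over another’s. All numbers below are illustrative placeholders, not figures from the project described.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline_error, revised_error):
    """Fractional improvement of the revised error over the baseline error."""
    return (baseline_error - revised_error) / baseline_error


# Illustrative only: three customers' actual and predicted lifetime values.
example_mae = mean_absolute_error([100.0, 200.0, 300.0], [110.0, 190.0, 330.0])

# Illustrative only: a prototype error of 50.0 and a revised error of 39.0
# correspond to a reduction of 0.22, that is, 22%.
improvement = relative_reduction(baseline_error=50.0, revised_error=39.0)
```

Both models need to be scored on the same held-out customers for the comparison to be meaningful.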
