AI Glossary · Letter G

Grid Search.

A hyperparameter optimization method that evaluates every combination of specified parameter values by training and validating a model for each combination in the defined grid. Grid search is the baseline hyperparameter search method against which more efficient approaches like random search and Bayesian optimization are compared, and understanding its limitations helps agencies choose search methods appropriate to the scale and complexity of their model development work.

Also known as exhaustive grid search, parameter grid, hyperparameter grid

What it is

A working definition of grid search.

Grid search defines a set of candidate values for each hyperparameter, then exhaustively trains and evaluates a model for every combination of those values. For a model with three hyperparameters and five candidate values per hyperparameter, grid search trains and evaluates 5 x 5 x 5 = 125 models. The best-performing combination on the validation set is selected as the final hyperparameter configuration. Grid search is conceptually simple, fully parallelizable across the combinations, and guaranteed to find the best configuration within the specified grid, which is why it remains widely used despite its inefficiency relative to more sophisticated methods.
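The enumeration described above can be sketched with the Python standard library. This is a minimal illustration rather than a production pipeline: the hyperparameter names and candidate values are assumptions chosen for the example, and `evaluate` is a hypothetical stand-in for training a model and scoring it on a validation set.

```python
from itertools import product

# Hypothetical grid: three hyperparameters, five candidate values each.
param_grid = {
    "learning_rate": [0.001, 0.003, 0.01, 0.03, 0.1],
    "max_depth": [2, 3, 4, 5, 6],
    "subsample": [0.6, 0.7, 0.8, 0.9, 1.0],
}


def evaluate(config):
    """Stand-in for 'train a model and return its validation score'.

    This toy function peaks at learning_rate=0.01, max_depth=4,
    subsample=0.8. A real pipeline would fit and score a model here.
    """
    return -(
        abs(config["learning_rate"] - 0.01) * 10
        + abs(config["max_depth"] - 4) * 0.1
        + abs(config["subsample"] - 0.8)
    )


names = list(param_grid)
# Every combination of candidate values: 5 x 5 x 5 = 125 configurations.
configs = [dict(zip(names, values)) for values in product(*param_grid.values())]

# Score every configuration and keep the highest-scoring one.
best_config = max(configs, key=evaluate)
print(len(configs), best_config)
```

Because each call to `evaluate` is independent of the others, the loop over `configs` is the part that can be distributed across workers.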

The fundamental limitation of grid search is the curse of dimensionality: the number of combinations grows exponentially with the number of hyperparameters. Adding a fourth hyperparameter with five candidate values increases the number of models from 125 to 625. Adding a fifth takes it to 3,125. With ten hyperparameters at five values each, the grid holds nearly 9.8 million combinations, so for models with 10 or more hyperparameters even a coarse grid is usually computationally infeasible. Additionally, grid search allocates equal search budget to every region of the hyperparameter space regardless of whether that region is promising, which means a large fraction of the budget goes to configurations that early results already show to be poor. Random search and Bayesian optimization are usually more efficient, for different reasons. Random search samples each configuration independently, so the same budget tests many more distinct values of each hyperparameter, which matters when only a few hyperparameters strongly affect performance. Bayesian optimization uses the results of earlier evaluations to concentrate the remaining budget on promising regions.
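The growth in grid size is easy to check directly. The short sketch below assumes five candidate values per hyperparameter, as in the figures above; `grid_size` is a helper name introduced only for this illustration.

```python
from math import prod


def grid_size(values_per_param):
    """Number of models a full grid search must train.

    `values_per_param` lists how many candidate values each
    hyperparameter has; the grid size is their product.
    """
    return prod(values_per_param)


# Five candidate values per hyperparameter.
for n_params in (3, 4, 5, 10):
    print(n_params, grid_size([5] * n_params))
```

At ten hyperparameters the grid holds 9,765,625 combinations. At a hypothetical one minute of training per model, that is more than 18 years of sequential compute.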

Despite its inefficiency, grid search is appropriate when the hyperparameter space is small, the training cost per model is low, and the practitioner wants the certainty of having evaluated all combinations in the grid. For simple models like logistic regression with two or three hyperparameters, grid search is often the right choice. For deep neural networks with many architectural and training hyperparameters, the computational cost of grid search is prohibitive and Bayesian optimization or evolutionary hyperparameter search is more appropriate. The practical discipline is matching the search method to the complexity and cost of the model, rather than defaulting to grid search for all models or abandoning it entirely in favor of more sophisticated methods.

Why ad agencies care

Why grid search might matter more in agency work than in most industries.

Grid search is the entry point for hyperparameter optimization in most machine learning toolkits and is likely the method already being used in agency model development pipelines that have not been explicitly configured for more efficient search. A working ad agency that understands grid search’s limitations can make informed decisions about when to use it, when to switch to more efficient methods, and how to set up grids that provide useful coverage without wasting computational budget.

Grid search on a few key hyperparameters is often sufficient for simpler models. For logistic regression lead scoring models with regularization strength as the primary hyperparameter, a grid of 8 to 10 log-spaced regularization values typically covers the relevant range and produces a well-configured model with minimal computational investment. The efficiency loss of grid search relative to Bayesian optimization becomes significant only when the search space is large, training is expensive, and many evaluations are needed. For simple models, the clarity and determinism of grid search often outweigh the theoretical efficiency gains of more complex methods.

Grid design quality determines grid search quality. A grid search is only as good as the values included in the grid. If the optimal learning rate is 3e-4 and the grid includes only 1e-3 and 1e-4, grid search will find the best value in the grid but miss the global optimum. Designing grids that cover the relevant range, use log-scale spacing for parameters that span orders of magnitude, and include enough points to resolve the performance landscape is a practical skill that determines whether grid search produces useful results or systematically misses the best configurations.
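A small sketch can make the learning-rate example concrete. It compares the two-point grid from the paragraph above with a log-spaced grid at two points per decade. The range (1e-5 to 1e-1), the spacing, and the helper names `log_grid` and `nearest` are assumptions made for illustration, not recommended settings.

```python
import math


def log_grid(low_exp, high_exp, points_per_decade):
    """Log-spaced values from 10**low_exp to 10**high_exp, inclusive."""
    steps = (high_exp - low_exp) * points_per_decade
    return [10 ** (low_exp + i / points_per_decade) for i in range(steps + 1)]


def nearest(grid, target):
    """Grid value closest to `target`, measured on a log scale."""
    return min(grid, key=lambda v: abs(math.log10(v) - math.log10(target)))


coarse = [1e-4, 1e-3]        # the two-point grid from the text
finer = log_grid(-5, -1, 2)  # nine values, two per decade

target = 3e-4  # the optimal learning rate assumed in the text
print(nearest(coarse, target))  # closest coarse value is a factor of 3 away
print(nearest(finer, target))   # 10**-3.5, about 3.16e-4
```

The finer grid costs nine evaluations instead of two for this one hyperparameter, which is the trade-off a grid designer has to weigh against the total size of the grid.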

Understanding grid search sets the baseline for evaluating optimization tools. When a vendor or platform claims that their AI-powered optimization found a configuration better than what the team had been using, knowing whether the comparison is against grid search, random search, or no search at all is necessary context for evaluating the claim. If the prior configuration was set manually or via a coarse grid, “better than baseline” is a weak claim. If the prior configuration was found via thorough Bayesian optimization, the improvement is more meaningful. Knowing the hierarchy of search methods helps agencies correctly interpret optimization performance claims.

In practice

What grid search looks like inside a working ad agency.

An agency builds a customer reactivation propensity model for a subscription retail client using a gradient boosted tree. The model has four primary hyperparameters: learning rate, max tree depth, subsampling rate, and number of estimators. Rather than accepting the library’s defaults, the agency designs a grid: learning rate in [0.01, 0.05, 0.1, 0.2], max depth in [3, 5, 7], subsampling in [0.7, 0.85, 1.0], and number of estimators fixed at 500 with early stopping. The resulting 4 x 3 x 3 = 36-combination grid is run with 5-fold cross-validation, for 180 model fits in total. The best configuration, a learning rate of 0.05, max depth of 5, and subsampling of 0.85, achieves a mean AUC of 0.83 across the validation folds. The team notes that 12 of the 36 configurations performed within 0.01 AUC of the best, indicating a flat performance landscape around the optimum. Because the best value of each hyperparameter also sits in the interior of its range rather than at an edge, the team concludes that the grid covered the relevant region. The search required 4 hours of parallel compute and produced a well-configured model with clear documentation of which configurations were explored and how they performed.
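The bookkeeping in this example can be sketched in a few lines of standard-library Python. The grid values and the winning configuration come from the scenario above. The helper names `near_best` and `on_grid_edge` are hypothetical, and the per-configuration AUC scores are not reproduced here, so `near_best` is shown only as a function a team might apply to its own results.

```python
from itertools import product

# Grid from the scenario: number of estimators is fixed, so it is not searched.
param_grid = {
    "learning_rate": [0.01, 0.05, 0.1, 0.2],
    "max_depth": [3, 5, 7],
    "subsample": [0.7, 0.85, 1.0],
}
n_folds = 5

configs = [dict(zip(param_grid, vals)) for vals in product(*param_grid.values())]
# Each configuration is trained once per cross-validation fold.
total_fits = len(configs) * n_folds


def near_best(scores, tolerance=0.01):
    """Indices of configurations scoring within `tolerance` of the best."""
    best = max(scores)
    return [i for i, s in enumerate(scores) if best - s <= tolerance]


def on_grid_edge(config, grid):
    """Hyperparameters whose chosen value is the first or last grid value.

    A non-empty result hints that the optimum may lie outside the grid.
    """
    return [
        name
        for name, values in grid.items()
        if config[name] in (values[0], values[-1])
    ]


best_config = {"learning_rate": 0.05, "max_depth": 5, "subsample": 0.85}
print(len(configs), total_fits, on_grid_edge(best_config, param_grid))
```

An empty list from `on_grid_edge` means every winning value sits strictly inside its range, which is one common sanity check that the grid did not cut off the region where the best configurations live.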
