A mathematical function that quantifies how alike or different two objects, data points, or representations are, producing a scalar score that enables ranking, clustering, retrieval, and recommendation. Similarity metrics are used throughout AI-powered marketing systems: semantic similarity between text passages drives search and content matching, cosine similarity between user and item embeddings powers recommendation, and distance metrics between audience profiles enable lookalike modeling.
Also known as distance metric, similarity measure, embedding distance
A similarity metric takes two objects as inputs and returns a scalar value indicating how similar they are. By convention, higher values indicate more similar pairs, though distance metrics invert this convention (lower distance means more similar). The choice of metric determines what “similar” means for the specific data type and task. Euclidean distance measures straight-line distance in vector space and is appropriate when all dimensions are comparably scaled and can reasonably be treated as independent. Cosine similarity measures the cosine of the angle between two vectors regardless of their magnitude, making it appropriate for comparing text embeddings or behavioral vectors where the direction (pattern of values) matters more than the absolute scale. Jaccard similarity measures the ratio of shared elements to distinct elements across two sets (the size of the intersection divided by the size of the union), appropriate for comparing binary feature vectors or item sets in collaborative filtering.
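The three metrics described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a production implementation; the function names and the toy vectors are assumptions made for the example.

```python
import numpy as np

def euclidean_distance(a, b):
    """Straight-line distance; lower means more similar."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.linalg.norm(a - b))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores vector magnitude."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)

def jaccard_similarity(x, y):
    """Size of the intersection divided by size of the union of two sets."""
    x, y = set(x), set(y)
    if not x and not y:
        return 1.0  # assumption: treat two empty sets as identical
    return len(x & y) / len(x | y)

# Toy vectors (hypothetical): v points in the same direction as u
# but has twice the magnitude.
u = [1.0, 2.0, 3.0]
v = [2.0, 4.0, 6.0]

cos_uv = cosine_similarity(u, v)    # direction matches, so close to 1
dist_uv = euclidean_distance(u, v)  # magnitude differs, so distance is nonzero
jac = jaccard_similarity({"a", "b", "c"}, {"b", "c", "d"})  # 2 shared of 4 distinct
```

The pair `u`, `v` shows why metric choice matters: cosine similarity treats the two vectors as essentially identical, while Euclidean distance reports a clear gap because of the difference in scale.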
Semantic similarity between text passages is computed by encoding each passage into a dense embedding vector using a pre-trained language model and measuring cosine similarity between the resulting vectors. Passages with similar meaning have embedding vectors pointing in similar directions in the high-dimensional embedding space, producing high cosine similarity even when the specific words differ. This semantic similarity enables applications like semantic search (finding relevant documents by meaning rather than keyword match), duplicate detection (identifying near-duplicate content), and entailment scoring (determining whether two passages make compatible claims).
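As a rough sketch of the retrieval step in semantic search, the code below ranks passages by cosine similarity to a query. It assumes the embeddings have already been produced by some language model; the 3-dimensional vectors and the helper name `top_k_by_cosine` are placeholders for illustration, not output from any real model.

```python
import numpy as np

def top_k_by_cosine(query_vec, doc_matrix, k=3):
    """Return (indices, scores) of the k rows most similar to query_vec."""
    q = np.asarray(query_vec, dtype=float)
    d = np.asarray(doc_matrix, dtype=float)
    # Normalize to unit length so a dot product equals cosine similarity.
    q = q / np.linalg.norm(q)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)[:k]  # highest similarity first
    return order.tolist(), scores[order].tolist()

# Hypothetical embeddings standing in for encoded passages.
doc_embeddings = np.array([
    [0.9, 0.1, 0.0],  # passage 0
    [0.0, 1.0, 0.0],  # passage 1
    [0.8, 0.2, 0.1],  # passage 2
])
query_embedding = np.array([1.0, 0.0, 0.0])

indices, scores = top_k_by_cosine(query_embedding, doc_embeddings, k=2)
```

This version compares the query against every passage, which is fine for small collections but scales linearly with collection size.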
Approximate nearest neighbor (ANN) search algorithms enable fast retrieval of the most similar items from a large collection without computing similarity to every item exhaustively. Hierarchical Navigable Small World (HNSW) graphs, product quantization, and locality-sensitive hashing (LSH) are ANN methods that organize embeddings into data structures supporting sublinear-time nearest-neighbor queries, trading a small amount of recall for speed. Real-time recommendation systems that retrieve the top-k most similar items to a user query embedding from a catalog of millions of items typically need ANN search to meet the sub-100ms latency expected of interactive applications; exhaustive cosine similarity computation over millions of items can take on the order of seconds per query, depending on hardware and embedding dimensionality.
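To make the idea of LSH concrete, here is a simplified single-table sketch using random hyperplanes, one common LSH scheme for cosine similarity. The class name, parameters, and random data are assumptions for illustration. Real deployments usually rely on a dedicated ANN library and use multiple hash tables or graph indexes to improve recall.

```python
import numpy as np
from collections import defaultdict

class RandomHyperplaneLSH:
    """Single-table LSH: vectors on the same side of every random
    hyperplane share a bucket, so only that bucket is scanned at query time."""

    def __init__(self, dim, n_bits=8, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))
        self.buckets = defaultdict(list)
        self.vectors = {}

    def _key(self, vec):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return ((self.planes @ vec) >= 0).tobytes()

    def add(self, item_id, vec):
        vec = np.asarray(vec, dtype=float)
        self.vectors[item_id] = vec / np.linalg.norm(vec)
        self.buckets[self._key(vec)].append(item_id)

    def query(self, vec, k=5):
        vec = np.asarray(vec, dtype=float)
        unit = vec / np.linalg.norm(vec)
        # Approximate step: neighbors hashed to other buckets are missed.
        candidates = self.buckets.get(self._key(vec), [])
        scored = [(float(self.vectors[i] @ unit), i) for i in candidates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(i, s) for s, i in scored[:k]]

# Hypothetical catalog of 200 random 16-dimensional item embeddings.
rng = np.random.default_rng(1)
catalog = rng.standard_normal((200, 16))
index = RandomHyperplaneLSH(dim=16, n_bits=6, seed=0)
for item_id, vec in enumerate(catalog):
    index.add(item_id, vec)

results = index.query(catalog[42], k=3)  # list of (item_id, cosine) pairs
```

More bits per key means smaller buckets and faster queries but a higher chance of missing true neighbors, which is the speed-versus-recall tradeoff that ANN methods share.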
A working ad agency using semantic search tools, deploying lookalike audience models, building product recommendation systems, or evaluating AI-generated content similarity is relying on similarity metrics as the operational mechanism. Understanding which similarity metric is appropriate for each data type and task, and how metric choice affects the quality of similarity-based applications, is necessary for evaluating and improving these systems rather than treating them as opaque outputs.
Cosine similarity between behavioral embeddings is the standard mechanism for lookalike audience construction, and its effectiveness depends on embedding quality more than metric choice. A lookalike model that represents each user as a behavioral embedding vector and finds similar users by cosine similarity produces a lookalike audience whose quality is primarily determined by how well the embedding captures the behavioral signals relevant to the target behavior. If the embedding is trained on page view sequences and the target behavior is purchase conversion, the embedding may not capture purchase-intent signals well, producing a lookalike audience that resembles the seed audience in browsing behavior but not in purchase intent. Embedding quality assessment, which evaluates whether embeddings trained on available behavioral signals capture the specific behavior targeted by the lookalike model, is the critical evaluation step that similarity metric benchmarking alone does not address.
Semantic similarity scoring of AI-generated content against reference examples provides an automated quality signal for brand voice consistency at scale. An agency generating 200 AI copy variants per campaign can score each variant’s semantic similarity to a curated set of on-brand reference examples using cosine similarity in a sentence embedding space. Variants with high average cosine similarity to the reference set are likely to be on-brand; variants with low similarity are likely to be stylistically off or topically divergent. This automated pre-screening step routes only high-similarity variants to human review, reducing review volume without applying binary pass/fail rules that would be brittle to legitimate creative variation. The similarity threshold determines the precision-recall tradeoff: a high threshold passes fewer but more reliably on-brand variants, while a lower threshold passes more variants at the cost of more off-brand material reaching human review.
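The pre-screening step can be sketched as follows, assuming sentence embeddings for the variants and the reference set have already been computed. The function names, the toy 2-dimensional vectors, and the 0.8 threshold are hypothetical; in practice the threshold would be tuned against human review outcomes.

```python
import numpy as np

def mean_similarity_to_references(variant_vecs, reference_vecs):
    """Average cosine similarity of each variant to all reference examples."""
    v = np.asarray(variant_vecs, dtype=float)
    r = np.asarray(reference_vecs, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    r = r / np.linalg.norm(r, axis=1, keepdims=True)
    # Rows are variants, columns are references; average across references.
    return (v @ r.T).mean(axis=1)

def route_to_human_review(scores, threshold):
    """Indices of variants whose average similarity meets the threshold."""
    return np.flatnonzero(np.asarray(scores) >= threshold).tolist()

# Hypothetical embeddings: two on-brand references and two candidate variants.
references = [[1.0, 0.1], [0.9, 0.2]]
variants = [
    [1.0, 0.15],  # close in direction to the references
    [0.0, 1.0],   # points in a very different direction
]

brand_scores = mean_similarity_to_references(variants, references)
passed = route_to_human_review(brand_scores, threshold=0.8)  # assumed threshold
```

Raising `threshold` shrinks `passed` toward only the most reliably on-brand variants; lowering it lets more variants through, including more off-brand ones.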
Product similarity metrics for recommendation must balance semantic attribute similarity with complementary relationship detection for cross-sell applications. A product recommendation system that recommends items with high cosine similarity to the viewed item will surface items that are substitutes (very similar products in the same category), not complements (products that are typically purchased together). For cross-sell recommendations, a similarity metric that captures co-purchase frequency, which measures how often two products appear in the same purchase basket, is more appropriate than embedding cosine similarity. The correct similarity metric depends on the recommendation objective: high cosine similarity for “more like this” same-category recommendations; high co-purchase similarity for “frequently bought together” cross-sell recommendations; a combination for personalized mixed-objective recommendation lists.
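One possible way to score co-purchase similarity is a Jaccard ratio over baskets: the number of baskets containing both products divided by the number containing either. The normalization choice, the function name, and the toy baskets below are assumptions for illustration; raw co-occurrence counts or lift are common alternatives.

```python
from collections import defaultdict
from itertools import combinations

def co_purchase_jaccard(baskets):
    """Map each product pair to (baskets with both) / (baskets with either)."""
    baskets_with = defaultdict(set)
    for basket_id, basket in enumerate(baskets):
        for product in set(basket):
            baskets_with[product].add(basket_id)

    scores = {}
    for a, b in combinations(sorted(baskets_with), 2):
        both = len(baskets_with[a] & baskets_with[b])
        if both:  # skip pairs that never co-occur
            either = len(baskets_with[a] | baskets_with[b])
            scores[(a, b)] = both / either
    return scores

# Hypothetical purchase baskets.
baskets = [
    ["tent", "sleeping_bag", "lantern"],
    ["tent", "sleeping_bag"],
    ["tent", "stove"],
    ["lantern", "stove"],
]

pair_scores = co_purchase_jaccard(baskets)
best_pair = max(pair_scores, key=pair_scores.get)
```

Here a tent and a sleeping bag score highest even though they are not substitutes for one another, which is the complement relationship that attribute-based cosine similarity tends to miss.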
An agency is building a creative performance prediction system for a retail client that predicts whether a new ad creative will be a top-quartile performer based on its visual and copy features. The agency has 1,840 labeled creatives from the prior 18 months with performance labels (top quartile vs. not top quartile). Rather than training a complex multimodal classifier from scratch, the agency uses a similarity-based approach. For each new candidate creative, they compute three similarity scores: copy semantic similarity to the top-quartile creative archive (cosine similarity of sentence embeddings), visual similarity to the top-quartile creative archive (cosine similarity of CLIP image embeddings), and combined multimodal similarity (weighted average of copy and visual similarity, with weights 0.6 and 0.4 based on cross-validation). A logistic regression trained on these three similarity features achieves AUC of 0.71 on a held-out test set of 230 creatives, versus 0.64 for a purely rule-based system using manually defined creative attributes. The similarity-based system correctly identifies 64% of top-quartile performers in the top 25% of its scored list, compared to 51% for the rule-based system. The agency uses the system as a pre-launch creative triage tool: creatives with combined similarity below 0.45 receive a flag for additional review by the creative strategy team before campaign launch. Over a 12-week pilot, the system flags 22% of submitted creatives for additional review. Of flagged creatives, 78% are revised before launch, and the revised versions show 34% better average performance than the original submissions, suggesting that low similarity to proven high performers is a useful early warning signal for creative risk.
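The triage rule in this example reduces to a weighted average and a threshold check. The sketch below uses the weights (0.6 for copy, 0.4 for visual, read from the order given above) and the 0.45 cutoff from the example; the function names are illustrative, and the inputs are assumed to be per-creative similarity scores already aggregated against the top-quartile archive, since the aggregation method is not specified.

```python
COPY_WEIGHT = 0.6        # weight on copy similarity, from the example
VISUAL_WEIGHT = 0.4      # weight on visual similarity, from the example
REVIEW_THRESHOLD = 0.45  # combined similarity below this triggers a flag

def combined_similarity(copy_sim, visual_sim):
    """Weighted average of copy and visual similarity to the archive."""
    return COPY_WEIGHT * copy_sim + VISUAL_WEIGHT * visual_sim

def needs_additional_review(copy_sim, visual_sim):
    """True when the creative should be flagged for the strategy team."""
    return combined_similarity(copy_sim, visual_sim) < REVIEW_THRESHOLD

# Hypothetical candidate creatives as (copy_sim, visual_sim) pairs.
candidates = {
    "creative_a": (0.30, 0.50),  # weak copy match
    "creative_b": (0.60, 0.40),  # stronger copy match
}
flags = {name: needs_additional_review(*sims) for name, sims in candidates.items()}
```

Because copy similarity carries the larger weight, a creative with weak copy alignment can be flagged even when its visual similarity is moderate.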
The generative AI foundations module covers similarity metrics (cosine similarity, Euclidean distance, and Jaccard similarity), embedding-based similarity search, and how these mechanisms underlie the recommendation, audience modeling, and content matching systems agencies use for client campaigns.