The natural language processing task of identifying when different words or phrases in a text refer to the same real-world entity, such as linking “she” back to “the CMO” two sentences earlier. For agencies using AI tools that read and interpret documents, coreference resolution is what separates a system that follows the thread of a conversation from one that just processes words.
Also known as: anaphora resolution (a closely related task), entity reference resolution, reference tracking.
Coreference resolution groups the mentions in a text that refer to the same entity. When a document says “Sarah joined the call, and then she presented the results,” coreference resolution links “she” back to “Sarah.” A system that skips this step can treat each mention as a separate entity and lose the thread of who is doing what, a problem that compounds quickly in multi-party conversations and long documents.
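To make the idea concrete, here is a minimal sketch of what a resolver's output can look like: each entity gets a cluster of mentions, and every mention can then be rewritten as the entity's name. The `Mention` class, the `find_mention` helper, and the `resolve` function are hypothetical names invented for this illustration, and the cluster is written by hand instead of being produced by a real tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    text: str
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention


def find_mention(doc: str, text: str) -> Mention:
    """Locate the first occurrence of `text` in `doc` and wrap it as a Mention."""
    pos = doc.index(text)
    return Mention(text, pos, pos + len(text))


doc = "Sarah joined the call, and then she presented the results."

# Hand-written stand-in for a resolver's output (an assumption for this sketch):
# one cluster per entity, keyed by the entity's canonical name.
clusters = {
    "Sarah": [find_mention(doc, "Sarah"), find_mention(doc, "she")],
}


def resolve(doc: str, clusters: dict[str, list[Mention]]) -> str:
    """Rewrite every mention as the canonical name of its cluster."""
    replacements = [
        (m.start, m.end, name)
        for name, mentions in clusters.items()
        for m in mentions
    ]
    # Apply replacements right to left so earlier offsets stay valid.
    for start, end, name in sorted(replacements, reverse=True):
        doc = doc[:start] + name + doc[end:]
    return doc


print(resolve(doc, clusters))
# Sarah joined the call, and then Sarah presented the results.
```

Rewriting pronouns as names is only one way to use the clusters. A summarizer or search index might instead keep the original text and attach the entity name as metadata.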
Large language models resolve coreference implicitly. The ability is a by-product of training on very large volumes of text, not a separate step in the pipeline. They learn that pronouns tend to refer back to recently named entities and that context usually narrows down the candidates. But implicit resolution still fails on ambiguous text, long documents, and multi-party conversations where several people share attributes or roles.
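The “most recently named entity” pattern can be written down as a deliberately naive rule, which also shows why it breaks on ambiguous text. This is a toy illustration only. It is not how a language model resolves references internally, and `resolve_by_recency`, the pronoun list, and the second example sentence are assumptions made for the sketch.

```python
import re
from typing import List, Optional, Set, Tuple

PRONOUNS = {"he", "she", "they", "it"}


def resolve_by_recency(text: str, known_names: Set[str]) -> List[Tuple[str, Optional[str]]]:
    """Link each pronoun to the most recently mentioned known name."""
    links = []
    last_name = None
    for token in re.findall(r"[A-Za-z']+", text):
        if token in known_names:
            last_name = token
        elif token.lower() in PRONOUNS:
            links.append((token, last_name))
    return links


names = {"Sarah", "Dana"}

clear = "Sarah joined the call, and then she presented the results."
print(resolve_by_recency(clear, names))      # [('she', 'Sarah')]

# Invented example: "she" could be Sarah or Dana, and the text alone does not say.
ambiguous = "Sarah told Dana that she would send the deck."
print(resolve_by_recency(ambiguous, names))  # [('she', 'Dana')]
```

The rule always picks Dana in the second sentence, whether or not that is what the speaker meant. Resolving it correctly takes context from outside the sentence, which is exactly what gets scarce in long or multi-party documents.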
Coreference resolution can also be treated as a discrete task: evaluated and improved separately from text generation, then used to strengthen downstream systems such as meeting summarization, contract analysis, conversation intelligence, and document search.
Agencies process significant volumes of unstructured text: client briefs, strategy documents, call transcripts, competitive research, and campaign reviews. When AI tools read and extract meaning from these documents, coreference resolution quality directly affects what they get right and what they silently misattribute.
Transcript analysis is where it matters most. Call transcripts are full of pronouns and implied references. A summary tool that cannot resolve “they said they would approve it by end of quarter” to the right client and the right deliverable produces noise, not intelligence. The error is invisible unless the reader already knows the answer.
Brief comprehension degrades on shorthand. Creative briefs frequently rely on shared context and abbreviation. An AI assistant that cannot track which “it” refers to the campaign, which “they” refers to the brand team, and which “this” refers to last year’s performance will misread the brief and generate work that misses what it actually asks for.
Misattribution in client contexts is a trust problem. Clients notice when AI summaries get attribution wrong, placing a statement with the wrong speaker or losing the thread of who committed to what. A tool that misattributes statements becomes a liability in any context where accuracy is required, especially in pitch recaps or post-meeting follow-ups.
In practice, coreference resolution sits inside the systems agencies use rather than being something anyone configures directly. But understanding it explains why a meeting summary tool performs differently on a two-person call versus a twelve-person strategy session, or why an AI brief analyzer handles tightly written briefs better than sprawling ones with heavy implicit context.
Agencies evaluating AI tools for research, summarization, or document analysis should test those tools on their actual documents, not clean demo examples. Coreference accuracy degrades predictably on long, complex, or ambiguous inputs, and that is exactly where most real agency documents live.
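One lightweight way to run such a test is to hand-label who actually said or committed to each statement in a few real documents, then compare that against what the tool reported, grouped by document type. The sketch below assumes that setup. The `attribution_accuracy` function and the field names are hypothetical, and the sample rows are invented placeholders, not measured results.

```python
from collections import defaultdict


def attribution_accuracy(samples):
    """Return the share of correctly attributed statements per document type."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for s in samples:
        totals[s["doc_type"]] += 1
        if s["tool_speaker"] == s["true_speaker"]:
            correct[s["doc_type"]] += 1
    return {doc_type: correct[doc_type] / totals[doc_type] for doc_type in totals}


# Placeholder rows for illustration only. Replace them with statements
# hand-labeled from your own transcripts and the tool's actual output.
samples = [
    {"doc_type": "two-person call", "true_speaker": "client", "tool_speaker": "client"},
    {"doc_type": "two-person call", "true_speaker": "agency", "tool_speaker": "agency"},
    {"doc_type": "twelve-person session", "true_speaker": "brand team", "tool_speaker": "brand team"},
    {"doc_type": "twelve-person session", "true_speaker": "CMO", "tool_speaker": "brand team"},
]

print(attribution_accuracy(samples))
# {'two-person call': 1.0, 'twelve-person session': 0.5}
```

Grouping by document type matters more than the overall number. A tool can look accurate on average while failing on the long, crowded sessions where attribution counts most.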
The generative AI foundations module of the workshop covers how today’s language models work at a level that helps agency practitioners choose better tools, write better prompts, and catch the errors that matter.