AI Glossary · Letter C

Computer Use.

An AI capability that lets a model directly operate software interfaces by clicking, typing, scrolling, and navigating, rather than requiring an API or structured data connection to interact with an application.

Also known as UI automation, GUI automation, and browser automation via AI.

What it is

A working definition of computer use.

Computer use refers to an AI model’s ability to perceive and interact with a software interface the way a human does: by looking at what’s on screen, identifying the relevant elements, and taking actions like clicking, typing, selecting, scrolling, and navigating. Rather than connecting to an application through an API or structured data feed, a computer-use agent receives a screenshot of the current screen and outputs the next action, or sequence of actions, to take on it.

The practical significance of computer use is that it removes the requirement for a structured integration. Most automation tools require the target application to have a usable API. A large share of the software that agencies use every day (ad platforms, client reporting dashboards, internal project management tools, media planning spreadsheets, procurement systems) either has no API, has an API with limited functionality, or requires significant technical work to connect. Computer use makes those applications automatable without a structured integration, because the AI interacts with them through the interface rather than through the data layer.

The architecture of a computer-use agent typically involves a vision model that interprets screenshots, a planning model that decides what actions to take given the current state of the interface and the goal, and an action execution layer that translates those decisions into actual inputs. The system operates in a loop: observe the current screen state, decide what to do next, take an action, observe the resulting screen state, and continue until the task is complete or the system determines it cannot proceed.
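The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product implements it: the names Action, run_agent, FakeApp, and scripted_decide are hypothetical, the "screen" is a plain string standing in for a screenshot, and the planner is a hard-coded function standing in for a model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Action:
    kind: str          # e.g. "click", "type", "scroll", "done", "give_up"
    target: str = ""   # hypothetical description of the on-screen element


def run_agent(
    goal: str,
    observe: Callable[[], str],
    decide: Callable[[str, str], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> str:
    """Observe, decide, act, and repeat until done, stuck, or out of steps."""
    for _ in range(max_steps):
        screen = observe()              # 1. observe the current screen state
        action = decide(goal, screen)   # 2. decide what to do next
        if action.kind == "done":
            return "completed"
        if action.kind == "give_up":
            return "cannot_proceed"
        execute(action)                 # 3. take the action, then loop
    return "step_limit_reached"


class FakeApp:
    """Toy stand-in for a real interface (assumption: three named screens)."""

    def __init__(self) -> None:
        self.screen = "login"
        self.log: List[Tuple[str, str]] = []

    def observe(self) -> str:
        return self.screen

    def execute(self, action: Action) -> None:
        self.log.append((action.kind, action.target))
        if self.screen == "login" and action.target == "Sign in":
            self.screen = "dashboard"
        elif self.screen == "dashboard" and action.target == "Reports":
            self.screen = "reports"


def scripted_decide(goal: str, screen: str) -> Action:
    """Hard-coded planner standing in for a vision and planning model."""
    if screen == "login":
        return Action("click", "Sign in")
    if screen == "dashboard":
        return Action("click", "Reports")
    if screen == "reports":
        return Action("done")
    return Action("give_up")  # unrecognized screen: stop rather than guess
```

The step limit and the explicit "give_up" outcome correspond to the last clause of the loop: the agent stops either when the task is complete or when it determines it cannot proceed.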

Why ad agencies care

Why computer use opens up automation that API-first tools can’t touch.

Ad agencies have a large number of manual, interface-dependent tasks that are currently done by humans because they require navigating software that doesn’t expose a clean API: pulling screenshots from ad platforms, entering creative specs into trafficking systems, updating media plans in shared spreadsheets, extracting data from client reporting portals, and submitting creative assets through vendor upload interfaces. Computer use makes these tasks candidates for automation in a way that API-first tools cannot.

The risk profile requires attention. An agent that can take actions in software can take incorrect actions in software. Giving a computer-use agent authorization to click “publish,” “submit,” or “confirm” on behalf of a client without a human checkpoint creates meaningful exposure. Agencies building computer-use automations need to be deliberate about where human authorization is required before an action is taken.
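One way to make that human checkpoint concrete is a small gate between the agent’s decision and its execution. The sketch below is an assumption-laden illustration: IRREVERSIBLE_LABELS, requires_approval, and guarded_execute are hypothetical names, and a real deployment would need a more robust way to identify committing actions than matching button labels.

```python
from typing import Callable

# Hypothetical set of button labels that commit something on a client's behalf.
IRREVERSIBLE_LABELS = {"publish", "submit", "confirm"}


def requires_approval(kind: str, target: str) -> bool:
    """Return True when a click lands on a label treated as irreversible."""
    return kind == "click" and target.strip().lower() in IRREVERSIBLE_LABELS


def guarded_execute(
    kind: str,
    target: str,
    execute: Callable[[str, str], None],
    approve: Callable[[str, str], bool],
) -> bool:
    """Run the action unless it needs approval and a human declines it.

    Returns True if the action was executed, False if it was blocked.
    """
    if requires_approval(kind, target) and not approve(kind, target):
        return False
    execute(kind, target)
    return True
```

In this sketch the approve callback is where the human comes in, for example a prompt in a review queue. Routine actions such as scrolling or applying a filter pass straight through, so the checkpoint only costs reviewer time where an error would reach a client.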

The integration cost is low relative to API-based automation. Because computer use doesn’t require an API, it doesn’t require a developer to build and maintain a dedicated integration for each application. The operational overhead moves toward prompt design, testing the agent against real interface states, and building human review checkpoints, rather than toward technical integration work.

The scope of what’s automatable expands significantly. Any task that a human currently performs by looking at a screen and clicking through a sequence of steps is now a candidate for computer-use automation. For agencies that have been constrained by the availability of API integrations, that represents a large and previously inaccessible category of work.

In practice

What computer use looks like inside a working ad agency.

An agency trafficking team spends about two hours each week pulling creative performance screenshots from three ad platforms for inclusion in client reports. The screenshots require navigating to specific campaign views, applying date filters, and capturing the relevant data views for each client. A computer-use agent is configured to handle this workflow: it opens each platform in sequence, navigates to the correct campaign, applies the date filter, captures the screenshot, and saves it to the shared folder where the reporting team pulls assets. The trafficking team confirms the output each week before the screenshots are used in client deliverables. The task goes from two hours of manual work to a 10-minute review, and the trafficking team’s Friday afternoon is no longer consumed by a task that requires no judgment.
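A workflow like this is usually easier to test and review when each run is written down as explicit tasks rather than one free-form prompt. The following sketch shows one possible shape for that, under assumptions: ScreenshotTask and build_weekly_tasks are hypothetical names, the platform and campaign names are placeholders, and the "reports/pending-review" folder is an invented convention for holding output until the trafficking team confirms it.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from pathlib import PurePosixPath
from typing import List


@dataclass(frozen=True)
class ScreenshotTask:
    platform: str
    campaign: str
    start: date
    end: date
    output_path: PurePosixPath

    def instructions(self) -> str:
        """Plain-language goal text to hand to the agent for this task."""
        return (
            f"Open {self.platform}, navigate to campaign '{self.campaign}', "
            f"set the date filter to {self.start.isoformat()} through "
            f"{self.end.isoformat()}, capture the performance view, "
            f"and save it to {self.output_path}."
        )


def slug(name: str) -> str:
    """Lowercase a name and replace spaces so it is safe in a filename."""
    return name.lower().replace(" ", "-")


def build_weekly_tasks(
    platforms: List[str],
    campaign: str,
    week_ending: date,
    shared_folder: str = "reports/pending-review",  # assumed folder name
) -> List[ScreenshotTask]:
    """Create one task per platform covering the seven days up to week_ending."""
    start = week_ending - timedelta(days=6)
    tasks = []
    for platform in platforms:
        filename = f"{slug(platform)}_{slug(campaign)}_{week_ending.isoformat()}.png"
        path = PurePosixPath(shared_folder) / filename
        tasks.append(ScreenshotTask(platform, campaign, start, week_ending, path))
    return tasks
```

Calling build_weekly_tasks(["Platform A", "Platform B", "Platform C"], "Spring Launch", date(2024, 5, 31)) would yield three tasks, one per platform, each with its own instruction text and a predictable output filename that the weekly review can check off.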
