Feature May 30, 2026

Introducing /observe — one API call to make any page agent-ready

PageBolt's new /observe endpoint turns any URL into a compact, token-budgeted observation for AI agents: id-indexed interactive elements, a page-type classification, and ready-to-use suggested actions — in a single request.

AI agents are getting good at reasoning. The bottleneck now is perception: before an agent can fill a form, click "Log in," or use a search box, it has to understand what is actually on the page in front of it.

Today most agents do this in one of two clumsy ways — and both are expensive.

The two bad options agents have today

Option 1: send a screenshot. A full-page screenshot costs thousands of vision tokens, gives the model no selectors to act with, and forces it to guess pixel coordinates that break the moment the layout shifts.

Option 2: send the raw DOM. A real page is tens of thousands of tokens of nested <div> soup, inline styles, and tracking scripts. The model burns its context window parsing markup that has nothing to do with the task.

Neither gives the agent what it actually needs: what can I interact with, what kind of page is this, and what should I probably do next?

What /observe returns

/observe loads the page in a real headless browser and returns a single compact JSON observation, purpose-built for an LLM:

  • Id-indexed interactive elements — each button, input, link, and ARIA-role element with a stable CSS selector, accessible name, role, type, and state.
  • A page-type classification — a heuristic label (login, signup, search, article, form, or generic) so the agent has instant context.
  • Suggested actions — interactive elements pre-grouped by intent (login flow, search, primary buttons, navigation) so the model doesn't have to infer them from scratch.

A simple login page comes back in just a few hundred tokens, compared with the thousands a screenshot costs or the tens of thousands in a raw DOM dump. When you need more, opt in per request:

  • includeContent — clean article text as Markdown (via Mozilla Readability)
  • includeAriaTree — the full ARIA accessibility tree
  • includeScreenshot — a base64 screenshot, all from the same page load

One request, then act

Here's the whole loop. First, observe:

curl -X POST https://pagebolt.dev/api/v1/observe \
  -H "x-api-key: pf_live_your_key_here" \
  -H "Content-Type: application/json" \
  -d '{ "url": "https://example.com/login" }'
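The same call can be sketched with Python's standard library. The endpoint, header names, and body mirror the curl command above; the helper names and the 30-second timeout are assumptions made for illustration.

```python
import json
import urllib.request

OBSERVE_URL = "https://pagebolt.dev/api/v1/observe"  # endpoint from the curl example


def build_observe_request(target_url, api_key):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        OBSERVE_URL,
        data=body,
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def observe(target_url, api_key, timeout=30):
    """Send the request and parse the JSON observation (performs network I/O).

    The timeout value is an assumption, not a documented recommendation.
    """
    request = build_observe_request(target_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_observe_request("https://example.com/login", "pf_live_your_key_here")
print(request.get_method(), request.full_url)
```

Splitting request construction from sending makes the request easy to inspect or unit-test without touching the network; call `observe(...)` with a real key to get the observation back as a dict.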

You get back something an agent can reason about directly:

{
  "url": "https://example.com/login",
  "title": "Sign in",
  "pageType": "login",
  "elements": [
    { "id": "e1", "role": "textbox", "type": "email", "name": "Email", "selector": "#email", "state": ["required"] },
    { "id": "e2", "role": "textbox", "type": "password", "name": "Password", "selector": "#password", "state": ["required"] },
    { "id": "e3", "role": "button", "type": "submit", "name": "Log in", "selector": "button[type='submit']" }
  ],
  "forms": [
    { "selector": "form", "action": "/login", "method": "POST", "fieldIds": ["e1", "e2", "e3"] }
  ],
  "actions": [
    { "intent": "login", "elementIds": ["e1", "e2", "e3"] }
  ],
  "stats": { "elementCount": 3, "estimatedTokens": 281 }
}
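One way an agent harness might consume this response is to index the elements by id and resolve a suggested action back to concrete selectors. The sketch below uses a trimmed copy of the sample observation; `elements_for_intent` is a hypothetical helper name.

```python
# Trimmed copy of the sample observation, keeping only the fields used here.
observation = {
    "pageType": "login",
    "elements": [
        {"id": "e1", "type": "email", "name": "Email", "selector": "#email"},
        {"id": "e2", "type": "password", "name": "Password", "selector": "#password"},
        {"id": "e3", "type": "submit", "name": "Log in", "selector": "button[type='submit']"},
    ],
    "actions": [{"intent": "login", "elementIds": ["e1", "e2", "e3"]}],
}


def elements_for_intent(obs, intent):
    """Return the elements referenced by the first action matching `intent`."""
    by_id = {el["id"]: el for el in obs.get("elements", [])}
    for action in obs.get("actions", []):
        if action.get("intent") == intent:
            # Skip ids that do not resolve rather than raising KeyError.
            return [by_id[i] for i in action.get("elementIds", []) if i in by_id]
    return []


login_elements = elements_for_intent(observation, "login")
print([el["selector"] for el in login_elements])
```

Because the elements are id-indexed, the model only has to mention short ids such as `e1`, and the harness can expand them to selectors outside the prompt.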

Then feed those selectors straight into /api/v1/sequence to fill the fields and click — no pixel-guessing, no brittle coordinates.
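A minimal sketch of that hand-off, under a loud assumption: the request schema for /api/v1/sequence is not shown in this post, so the `{"action", "selector", "value"}` step shape below is purely illustrative, and `build_login_steps` is a hypothetical helper. Check the sequence reference for the real field names before using it.

```python
def build_login_steps(elements, credentials):
    """Turn observed login elements into an ordered list of step dicts.

    Assumption: the step shape is illustrative only, not the documented
    /api/v1/sequence schema. `credentials` maps an input type
    (for example "email") to the value to enter.
    """
    fills, clicks = [], []
    for el in elements:
        if el.get("type") == "submit":
            clicks.append({"action": "click", "selector": el["selector"]})
        elif el.get("type") in credentials:
            fills.append({
                "action": "fill",
                "selector": el["selector"],
                "value": credentials[el["type"]],
            })
    # Fill every field before clicking submit, whatever order they were observed in.
    return fills + clicks


observed = [
    {"id": "e1", "type": "email", "selector": "#email"},
    {"id": "e2", "type": "password", "selector": "#password"},
    {"id": "e3", "type": "submit", "selector": "button[type='submit']"},
]
steps = build_login_steps(observed, {"email": "user@example.com", "password": "s3cret"})
for step in steps:
    print(step)
```

Every step is addressed by a selector taken from the observation, which is the point of the loop: nothing depends on pixel coordinates.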

The headless perception layer for the agentic web

You may have seen WebMCP in the news — a browser-native way for sites to expose tools to agents. It's promising, but it only works for sites that instrument themselves, and it requires a visible browser tab.

The other 99% of the web won't ship WebMCP any time soon. /observe is the complement: it turns any un-instrumented URL into agent-ready structure, server-side, with no cooperation from the target site. WebMCP gives an agent access to where the user is; /observe gives it access to anywhere.

Built into the MCP server

If you drive PageBolt through the pagebolt-mcp package, observation is now a first-class tool. Claude, Cursor, and Windsurf can call observe_page directly, get the element map, and chain it into a run_sequence to act:

{
  "mcpServers": {
    "pagebolt": {
      "command": "npx",
      "args": ["-y", "pagebolt-mcp"],
      "env": { "PAGEBOLT_API_KEY": "YOUR_KEY" }
    }
  }
}

Availability

/observe is live now, costs 1 API request, and is included on every plan, including the free tier. It's the 11th API available under a single PageBolt key, alongside screenshots, PDFs, OG images, sequences, video, inspect, extract, audit, ARIA, and diff. See the full reference in the docs.


Try it free — 100 requests/month, no credit card. → pagebolt.dev
