Why AI Agents Use inspect_page Instead of Dumping the Full DOM

You're building a Claude agent to automate web tasks. The agent needs to navigate a page and interact with buttons, forms, and links.

Your first instinct: get the full HTML and let Claude parse it.

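# Assumes `client = anthropic.Anthropic()` and a `full_html_dump` string already exist.
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,  # required by the Messages API
    messages=[{
        "role": "user",
        "content": f"Here's the page HTML:\n\n{full_html_dump}"
    }]
)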

Simple. Direct. Extremely expensive.

That full HTML dump is 8,000+ tokens. Claude 3.5 Sonnet charges $3 per 1M input tokens, so one page costs about $0.024 per agent query. Scale to 100 queries a day, and you're spending about $2.40/day just on token overhead.

But here's the thing: Claude doesn't need the full DOM. It needs to know what it can interact with.

The Problem: DOM Bloat

A typical website's HTML includes:

Layout divs (nesting chains, 50+ levels deep)
CSS classes and inline styles (framework boilerplate, Tailwind utilities)
Script tags and data attributes
Comment nodes and meta tags
Images, videos, analytics trackers

Result: 10,000+ DOM nodes for a page that has maybe 50 interactive elements.

The agent needs to know:

Button at coordinates X saying "Submit"
Input field for "email"
Link to "/checkout"

It doesn't need to know:

The 200-line CSS in a <style> tag
The 500 nested divs from the framework
The tracking pixels and analytics

But when you dump the full DOM, Claude has to parse all of it. Tokens wasted. Money wasted.
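To see the gap on a page of your own, a minimal sketch like the one below counts every element against the ones an agent could act on. It uses only Python's standard `html.parser`; the `INTERACTIVE_TAGS` set and the sample markup are assumptions made up for illustration, not how any particular tool defines "interactive".

```python
from html.parser import HTMLParser

# Tags treated as "interactive" in this sketch. The set is an assumption,
# not a definition taken from any particular tool.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class NodeCounter(HTMLParser):
    """Counts every start tag and, separately, the interactive ones."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.interactive = 0

    def handle_starttag(self, tag, attrs):
        self.total += 1
        if tag in INTERACTIVE_TAGS:
            self.interactive += 1


def count_nodes(html):
    """Return (total_elements, interactive_elements) for an HTML string."""
    counter = NodeCounter()
    counter.feed(html)
    counter.close()
    return counter.total, counter.interactive


# Hypothetical sample markup, made up for illustration.
sample_html = """
<div class="wrapper"><div class="container"><div class="row">
  <form>
    <input id="email-field" type="email" name="email">
    <button id="submit-btn">Submit</button>
  </form>
  <a href="/checkout">Checkout</a>
</div></div></div>
"""

total, interactive = count_nodes(sample_html)
print(f"{interactive} interactive out of {total} elements")
```

On a real page, feed the function the HTML you would otherwise have sent to the model. The ratio is usually far more lopsided than in this toy sample.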

The Solution: Structured Element Inspection

Instead of dumping the full DOM, inspect only the interactive elements.

PageBolt's inspect_page does exactly this:

import json
import urllib.request

def inspect_page(url):
    """Get structured map of interactive elements only"""
    api_key = "YOUR_API_KEY"  # pagebolt.dev

    payload = json.dumps({"url": url}).encode()
    req = urllib.request.Request(
        'https://pagebolt.dev/api/v1/inspect',
        data=payload,
        headers={'x-api-key': api_key, 'Content-Type': 'application/json'},
        method='POST'
    )

    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

Returns:

{
  "buttons": [
    {"text": "Submit", "selector": "#submit-btn", "type": "primary"},
    {"text": "Cancel", "selector": ".cancel-btn", "type": "secondary"}
  ],
  "inputs": [
    {"name": "email", "selector": "#email-field", "type": "email"},
    {"name": "password", "selector": "#password-field", "type": "password"}
  ],
  "links": [
    {"text": "Forgot password?", "href": "/forgot", "selector": "a.forgot"}
  ],
  "headings": [
    {"text": "Sign In", "level": "h1"}
  ]
}

For a typical page, that comes to roughly 500 tokens instead of 8,000.
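Once you have the map, turning a decision like "click Submit" into a concrete selector is a dictionary lookup. The sketch below assumes the response has the shape of the sample above; `find_selector` is a hypothetical helper written for this article, not part of PageBolt's API.

```python
# Sample element map copied from the response shown above.
element_map = {
    "buttons": [
        {"text": "Submit", "selector": "#submit-btn", "type": "primary"},
        {"text": "Cancel", "selector": ".cancel-btn", "type": "secondary"},
    ],
    "inputs": [
        {"name": "email", "selector": "#email-field", "type": "email"},
        {"name": "password", "selector": "#password-field", "type": "password"},
    ],
    "links": [
        {"text": "Forgot password?", "href": "/forgot", "selector": "a.forgot"},
    ],
    "headings": [
        {"text": "Sign In", "level": "h1"},
    ],
}


def find_selector(element_map, kind, label):
    """Return the selector of the first `kind` entry whose text or name equals `label`.

    Hypothetical helper: returns None when nothing matches or when the
    matching entry carries no selector (as with headings in the sample).
    """
    for element in element_map.get(kind, []):
        if label in (element.get("text"), element.get("name")):
            return element.get("selector")
    return None


print(find_selector(element_map, "buttons", "Submit"))
print(find_selector(element_map, "inputs", "email"))
```

Returning `None` for a miss lets the calling code decide whether to re-inspect the page or ask the model again.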

Token Cost Comparison

Full DOM approach:

Page HTML: 8,000 tokens
Agent reasoning: 200 tokens
Response: 100 tokens
TOTAL: 8,300 tokens per query

inspect_page approach:

Structured element map: 500 tokens
Agent reasoning: 200 tokens
Response: 100 tokens
TOTAL: 800 tokens per query

Savings: 90% reduction in tokens

100 agent queries (all tokens priced at the $3 per 1M input rate):

Full DOM: $2.49
inspect_page: $0.24

Scale to 10,000 queries a month, and you're saving about $225/month. For a startup, that's meaningful.
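You can re-derive the comparison from the per-query token counts listed above. This is a minimal sketch that, as a simplification, prices every token at the $3 per 1M input rate. Output tokens are billed at a higher rate, so treat the results as a lower bound. The function and constant names are made up for illustration.

```python
# USD per 1M tokens: the input rate quoted earlier in this article.
PRICE_PER_MILLION_TOKENS = 3.00


def query_cost(page_tokens, reasoning_tokens=200, response_tokens=100):
    """Cost in USD of one agent query, pricing every token at the input rate."""
    total_tokens = page_tokens + reasoning_tokens + response_tokens
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS


full_dom = query_cost(8_000)   # 8,300 tokens per query
inspected = query_cost(500)    # 800 tokens per query

for queries in (100, 10_000):
    print(
        f"{queries:>6} queries: full DOM ${full_dom * queries:.2f}, "
        f"inspect_page ${inspected * queries:.2f}"
    )

reduction = 1 - inspected / full_dom
print(f"Token reduction: {reduction:.0%}")
```

Swap in your own page sizes and query volume to see where the break-even sits for your workload.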

Real Example: Multi-Page Automation

import anthropic
import json
import urllib.request

client = anthropic.Anthropic()

def inspect_page(url):
    api_key = "YOUR_API_KEY"
    payload = json.dumps({"url": url}).encode()
    req = urllib.request.Request(
        'https://pagebolt.dev/api/v1/inspect',
        data=payload,
        headers={'x-api-key': api_key, 'Content-Type': 'application/json'},
        method='POST'
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def automate_workflow(task):
    """Agent navigates multiple pages efficiently"""

    pages = [
        "https://example.com/login",
        "https://example.com/account",
        "https://example.com/settings",
        "https://example.com/billing",
        "https://example.com/confirm"
    ]

    total_tokens_used = 0

    for page_url in pages:
        page_elements = inspect_page(page_url)

        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=256,
            messages=[{
                "role": "user",
                "content": f"""
Task: {task}

Current page elements:
{json.dumps(page_elements, indent=2)}

What should the agent do next? Respond with a single action."""
            }]
        )

        total_tokens_used += response.usage.input_tokens + response.usage.output_tokens
        print(f"Page {page_url}: {response.content[0].text}")

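    print(f"\nTotal tokens for {len(pages)}-page workflow: {total_tokens_used}")
    # Rough estimate: prices every token at the $3 per 1M input rate
    # (output tokens are billed at a higher rate).
    print(f"Cost: ${(total_tokens_used / 1_000_000) * 3:.3f}")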

automate_workflow("Complete the account setup and enable 2FA")

With full DOM: ~41,500 tokens, ~$0.12
With inspect_page: ~4,000 tokens, ~$0.01

That's 10x cheaper. Same automation. Same results.
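One more small saving: the loop above serializes the element map with `indent=2`, which pads the prompt with whitespace. The sketch below compares that with compact separators. The `rough_token_estimate` helper assumes about 4 characters per token, which is only a rule of thumb. Real tokenizers handle whitespace more efficiently than this suggests, so expect the actual token saving to be smaller than the character saving.

```python
import json


def rough_token_estimate(text, chars_per_token=4):
    """Crude size proxy: assumes ~4 characters per token (a rule of thumb only)."""
    return len(text) // chars_per_token


# Shortened element map in the same shape as the sample response earlier.
element_map = {
    "buttons": [{"text": "Submit", "selector": "#submit-btn", "type": "primary"}],
    "inputs": [{"name": "email", "selector": "#email-field", "type": "email"}],
}

pretty = json.dumps(element_map, indent=2)
compact = json.dumps(element_map, separators=(",", ":"))

print(f"indent=2: {len(pretty)} chars, ~{rough_token_estimate(pretty)} tokens")
print(f"compact:  {len(compact)} chars, ~{rough_token_estimate(compact)} tokens")
```

Both strings decode to the same data, so the model sees identical information either way. For an exact count, measure with the provider's own token counting rather than this heuristic.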

Why This Matters

Token cost is becoming the limiting factor for AI automation. As agents run longer workflows and access more pages, efficiency compounds.

Claude agents are already cheaper than hiring humans. But inefficient agent implementations waste that advantage.

The lesson: give your agent exactly the information it needs, not everything. Your token bill and your agent's reasoning speed will both improve.

Try PageBolt free: 100 requests/month, no credit card needed.

Your Claude agents will be smarter and cheaper. That's the power of structural inspection.