Comparison March 2, 2026

Headless browser API: Self-hosted vs managed, when each makes sense

Compare self-hosted headless browsers (Puppeteer, Playwright, Selenium) vs hosted APIs. Cost, complexity, and when to choose each.

You need to automate browser tasks — screenshots, PDFs, form fills, testing. You have two paths:

  1. Self-hosted — Run Puppeteer/Playwright on your servers
  2. Hosted API — Call a managed headless browser service

Each has tradeoffs. Most teams pick wrong and regret it.

The self-hosted trap

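Self-hosting a headless browser sounds simple: npm install puppeteer, write a script, deploy. The first script really is short:

// This looks easy...
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  const screenshot = await page.screenshot(); // PNG bytes
  await browser.close(); // otherwise the Chrome process keeps running
})();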

But production is messy.

Hidden costs of self-hosting

Infrastructure

  • Each browser instance needs 300–500MB RAM
  • 10 concurrent requests = 3–5GB RAM minimum
  • Add margin for spikes = you need 8GB+ instance
  • EC2 instance: $50–150/month just for browser capacity
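The sizing arithmetic above can be sketched as a small helper. The 300–500MB per-instance range comes from the list; the 1.6x headroom multiplier and the function name are illustrative assumptions, so substitute numbers measured on your own workload.

```javascript
// Rough RAM sizing for a self-hosted browser pool.
// perInstanceMB (300–500) is taken from the figures above;
// the 1.6x headroom multiplier is an assumption for illustration.
function estimatePoolMemoryGB(concurrency, perInstanceMB = [300, 500], headroom = 1.6) {
  const [lowMB, highMB] = perInstanceMB;
  // Convert MB to GB (decimal), rounded to one decimal place.
  const toGB = (mb) => Math.round(mb / 100) / 10;
  return {
    baselineGB: [toGB(lowMB * concurrency), toGB(highMB * concurrency)],
    withHeadroomGB: toGB(highMB * concurrency * headroom),
  };
}

const estimate = estimatePoolMemoryGB(10);
console.log(estimate.baselineGB);     // [ 3, 5 ]
console.log(estimate.withHeadroomGB); // 8
```

For 10 concurrent browsers this reproduces the 3–5GB baseline and the 8GB instance size quoted above.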

Orchestration

  • Browser pools fail silently
  • Connection timeouts need retry logic
  • Memory leaks require process recycling
  • You're now managing lifecycle, health checks, auto-restart
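To make the "retry logic" item concrete, here is a minimal sketch of a retry wrapper with a per-attempt timeout and exponential backoff. The name withRetry and its default values are hypothetical; it wraps any promise-returning task and does not depend on Puppeteer.

```javascript
// Retry an async task with a per-attempt timeout and exponential backoff.
// `task` is any function returning a promise (for example, a screenshot capture).
// Note: a timed-out attempt is abandoned, not cancelled; the caller still has to
// clean up whatever the task left behind (open pages, hung browsers).
async function withRetry(task, { attempts = 3, timeoutMs = 30000, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    let timer;
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => reject(new Error(`attempt ${attempt} timed out`)), timeoutMs);
    });
    try {
      return await Promise.race([task(attempt), timeout]);
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Wait baseDelayMs, then 2x, then 4x, ... before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** (attempt - 1)));
      }
    } finally {
      clearTimeout(timer);
    }
  }
  throw lastError;
}

// Usage: a flaky task that fails twice, then succeeds.
let calls = 0;
withRetry(async () => {
  calls++;
  if (calls < 3) throw new Error('transient failure');
  return 'ok';
}, { baseDelayMs: 10 }).then((result) => console.log(result, calls)); // ok 3
```

This covers only one of the four bullets. Process recycling, health checks, and auto-restart each need similar scaffolding, which is where the maintenance hours go.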

Scaling

  • Vertical scaling hits ceiling (instance size limit)
  • Horizontal scaling adds complexity (load balancing, session affinity)
  • 100 concurrent users = multiple servers, Kubernetes cluster management

Maintenance

  • Chrome versions change → tests break
  • Security patches → deployments
  • Dependency updates → regression testing
  • 5+ hours/month firefighting

Real cost (often hidden):

  • Infrastructure: $50–300/month
  • DevOps time: 5–10 hours/month (~$1,000–2,000)
  • Opportunity cost: time spent firefighting vs building features
  • Total: roughly $1,050–2,300/month before opportunity cost (at the ~$200/hour effective rate implied above)
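As a sanity check, the two quantified line items can be summed directly. This is a sketch: the $200/hour rate is inferred from "5–10 hours ≈ $1,000–2,000" above, and the function name is made up for illustration. Opportunity cost is not priced in, so treat the result as a floor.

```javascript
// Monthly cost of self-hosting from the two quantified line items above.
// hourlyRateUSD = 200 is implied by "5–10 hours/month (~$1,000–2,000)"; adjust for your team.
function selfHostedMonthlyCost({ infraUSD, devOpsHours, hourlyRateUSD = 200 }) {
  return infraUSD + devOpsHours * hourlyRateUSD;
}

const low = selfHostedMonthlyCost({ infraUSD: 50, devOpsHours: 5 });    // 1050
const high = selfHostedMonthlyCost({ infraUSD: 300, devOpsHours: 10 }); // 2300
console.log(`$${low}–$${high} per month, before opportunity cost`);
```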

When self-hosting makes sense

Self-hosting is worth it if:

  • ✅ You're running 1,000+ screenshots/day (economies of scale)
  • ✅ You have a dedicated DevOps engineer anyway
  • ✅ You need the lowest possible latency (a hosted API always adds a network round trip)
  • ✅ You have strict data residency requirements (EU data never leaves EU)
  • ✅ Your use case is internal-only (no user-facing latency pressure)

For most teams: not worth it.

The hosted API approach

Hosted headless browser APIs invert the tradeoff:

curl -X POST https://pagebolt.dev/api/v1/screenshot \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}' \
  --output screenshot.png

# Response: PNG in 2-3 seconds
# No infrastructure. No servers. Done.
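The same request from Node.js might look like the sketch below. The endpoint, header, and JSON body mirror the curl call above; the function names, the error handling, and the assumption that a successful response body is the raw image are illustrative and not taken from the provider's documentation. It assumes Node.js 18+ for the global fetch.

```javascript
// Build the request separately so it can be inspected or tested without network access.
function buildScreenshotRequest(url, apiKey) {
  return {
    endpoint: 'https://pagebolt.dev/api/v1/screenshot',
    init: {
      method: 'POST',
      headers: { 'x-api-key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify({ url }),
    },
  };
}

// fetchImpl defaults to the global fetch (Node.js 18+); pass a stub in tests.
async function captureViaApi(url, apiKey, fetchImpl = fetch) {
  const { endpoint, init } = buildScreenshotRequest(url, apiKey);
  const response = await fetchImpl(endpoint, init);
  if (!response.ok) {
    throw new Error(`Screenshot request failed with HTTP ${response.status}`);
  }
  // Assumption: a successful response body is the image itself.
  return Buffer.from(await response.arrayBuffer());
}

// Usage (needs a real API key, so it is left commented out):
// captureViaApi('https://example.com', process.env.PAGEBOLT_API_KEY)
//   .then((png) => require('node:fs').writeFileSync('screenshot.png', png));
```

Keeping request construction in its own function makes the call easy to unit-test and easy to swap if you later move to a different provider or back to a self-hosted endpoint.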

Advantages of hosted

Zero infrastructure — No servers to manage. No scaling to worry about. No DevOps work.

Fast — API latency: 2–3 seconds. Consistent performance, no cold start penalty.

Reliable — 99.9% uptime SLA. Automatic failover. Managed by specialists.

Scalable — 1 request or 10,000/day — same API. No performance degradation. Auto-scaling built-in.

Cost-predictable — Per-request pricing. No surprise infrastructure bills. Scale down anytime.

Self-hosted vs hosted: Direct comparison

| Factor | Self-hosted | Hosted API |
|---|---|---|
| Setup | 2–3 hours | 10 minutes |
| Infra cost/month | $50–300 | $0 |
| DevOps time/month | 5–10 hours | 0 hours |
| Latency | 5–10s (cold start) | 2–3s |
| Scaling | Vertical (capped) | Unlimited |
| Uptime | 99% (if lucky) | 99.9% SLA |
| On-call stress | High | None |
| Per-screenshot cost | $0.05–0.20 (infra) | $0.01–0.03 (API) |
| Best for | Internal tools, high volume | User-facing, unpredictable load |

Real-world example: E-commerce screenshots

Scenario: Capture product page screenshots for every listing (500 new products/day).

Self-hosted approach

const puppeteer = require('puppeteer');

// Launch browser pool
const POOL_SIZE = 5;
const browsers = [];

async function initPool() {
  for (let i = 0; i < POOL_SIZE; i++) {
    browsers.push(await puppeteer.launch({
      args: ['--no-sandbox', '--disable-dev-shm-usage']
    }));
  }
}

let currentBrowser = 0;
async function captureScreenshot(url) {
  // Round-robin across the pool
  const browser = browsers[currentBrowser++ % POOL_SIZE];
  let page;
  try {
    page = await browser.newPage();
    await page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 });
    // `type` selects the image format; `quality` applies to JPEG
    return await page.screenshot({ type: 'jpeg', quality: 90 });
  } catch (error) {
    console.error(`Failed to capture ${url}:`, error);
    return null;
  } finally {
    // Always close the page, even on failure, so tabs don't leak memory
    if (page) await page.close().catch(() => {});
  }
}

// Run on 5x EC2 t3.large (roughly $0.08/hour each, on-demand; varies by region)
// 5 instances × ~730 hours/month ≈ $300/month infrastructure
// Plus: monitoring, alerting, debugging, scale planning

Cost: $300+ infrastructure + 5–10 hours DevOps = ~$1,500/month effective cost.

Hosted API approach

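# Daily cron job: capture 500 screenshots
# Assumes the products endpoint returns whitespace-separated product IDs
mkdir -p screenshots
for product_id in $(curl -fsS https://api.example.com/products/new); do
  # -f makes curl exit non-zero on HTTP errors so failures can be logged
  curl -fsS -X POST https://pagebolt.dev/api/v1/screenshot \
    -H "x-api-key: $PAGEBOLT_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"url\": \"https://store.example.com/product/$product_id\"}" \
    --output "screenshots/$product_id.png" \
    || echo "Failed: $product_id" >&2
done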

# Cost: 500/day × 30 days = 15,000 requests/month → Growth plan = $79/month
# DevOps: 0 hours
# Infra: $0
# Total: $79/month (no hidden costs)

Cost: $79/month (Growth plan), $0 infrastructure, $0 DevOps = $79/month total.
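Dividing each monthly total by the request volume gives the effective cost per screenshot for this scenario. The sketch below uses only the figures quoted above (~$1,500 self-hosted, $79 hosted, 500 screenshots/day over a 30-day month); the function name is illustrative.

```javascript
// Effective cost per screenshot for the e-commerce scenario above.
const requestsPerMonth = 500 * 30; // 15,000

function costPerScreenshot(monthlyCostUSD, requests) {
  return monthlyCostUSD / requests;
}

const selfHosted = costPerScreenshot(1500, requestsPerMonth);
const hosted = costPerScreenshot(79, requestsPerMonth);
console.log(selfHosted.toFixed(4), hosted.toFixed(4)); // 0.1000 0.0053
```

These numbers hold only at this volume. On a fixed-price plan the per-request cost rises as usage drops, and the self-hosted figure falls as volume grows, which is why the volume question matters when choosing.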

Decision tree: Self-hosted or hosted?

Ask these questions:

  1. Volume: 1,000+ requests/day?
    • Yes → Self-hosted might pay off (if you have DevOps)
    • No → Hosted API is cheaper
  2. Predictability: Do you know your peak load?
    • No → Hosted API (no scaling surprises)
    • Yes → Could go either way
  3. Data sensitivity: Must data stay in-region?
    • Yes → Self-hosted (or check API provider's data residency)
    • No → Hosted API
  4. DevOps capacity: Do you have someone dedicated?
    • No → Hosted API (essential)
    • Yes → Self-hosted becomes viable
  5. Time-to-market: Do you need this running today?
    • Yes → Hosted API (10 minutes vs 2–3 hours)
    • No → Could go either way

If three or more of your answers point to a hosted API (a "No" on questions 1–4, a "Yes" on question 5), use a hosted API.
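The five questions can be encoded as a small scoring function. This is a sketch with hypothetical field names: it counts how many answers land on a "Hosted API" branch (note that for question 5 that is the "Yes" branch) and applies the three-or-more threshold from the rule above.

```javascript
// Each field is a boolean answer to the corresponding question above.
// A vote goes to "hosted" whenever that answer's branch says "Hosted API".
function recommend({ highVolume, knownPeakLoad, dataMustStayInRegion, dedicatedDevOps, neededToday }) {
  const hostedVotes = [
    !highVolume,           // 1. fewer than 1,000 requests/day
    !knownPeakLoad,        // 2. peak load is unknown
    !dataMustStayInRegion, // 3. no data residency constraint
    !dedicatedDevOps,      // 4. nobody dedicated to run it
    neededToday,           // 5. needs to be running today
  ].filter(Boolean).length;
  return { hostedVotes, recommendation: hostedVotes >= 3 ? 'hosted' : 'self-hosted' };
}

console.log(recommend({
  highVolume: false, knownPeakLoad: false, dataMustStayInRegion: false,
  dedicatedDevOps: false, neededToday: true,
})); // { hostedVotes: 5, recommendation: 'hosted' }
```

A 'self-hosted' result here only means self-hosting may be viable. Several branches above read "could go either way", so the cost figures should still decide.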

For most teams (especially startups, small teams, unpredictable load): hosted wins.

Hybrid approach: Best of both?

Some teams try hybrid:

  • Internal dashboards/tools: self-hosted Puppeteer (full control, no external network hop)
  • User-facing features: hosted API (reliability, scaling, no DevOps)

This works if you have two or more distinct use cases with very different requirements. For most teams, the added complexity is overkill.

Getting started with hosted

  1. Sign up at pagebolt.dev (free: 100 requests/month)
  2. Get API key from your dashboard (1 minute)
  3. Make your first API call (5 minutes)
  4. Evaluate: Does it solve your use case?
  5. Migrate if it does — or keep self-hosting if the numbers work out

The real question

Self-hosting isn't about "better control" or "not trusting external APIs." It's about: Can you afford 5+ hours/month of DevOps overhead plus on-call stress?

If yes: self-hosted is viable. If no: a hosted API solves the problem immediately.

Most teams say no.

Skip the infrastructure — start in 10 minutes

Free tier: 100 screenshots/month. No credit card. No servers to manage.