You're building a FastAPI service and need to capture screenshots of URLs. Maybe you're building a:
- Link preview service for a Slack-like app
- Monitoring tool that alerts on visual changes
- Content moderation system that screenshots submitted URLs
- OG image generator for a blogging platform
- Automated report builder with visual page captures
The instinct is to reach for Playwright or Puppeteer. They work. But running a headless browser inside a FastAPI service is a liability — memory spikes, async lifecycle management, container bloat, and a list of Linux dependencies that make your Dockerfile look like a security incident.
There's a cleaner path: delegate the browser work to a hosted screenshot API and keep your FastAPI service lean. Let me show you both approaches so you can make the right call.
## The Playwright Approach
Playwright has a great async API that pairs naturally with FastAPI. Here's a basic implementation:
```python
from fastapi import FastAPI
from fastapi.responses import Response
from playwright.async_api import async_playwright

app = FastAPI()

@app.get("/screenshot")
async def screenshot(url: str):
    async with async_playwright() as p:
        browser = await p.chromium.launch()
        page = await browser.new_page()
        await page.goto(url)
        screenshot_bytes = await page.screenshot()
        await browser.close()
        return Response(content=screenshot_bytes, media_type="image/png")
```
It works in development. But production surfaces a pile of problems:
### 1. Cold-start latency
Launching a browser per request adds 1–3 seconds of overhead. The standard fix is a shared browser instance:
```python
from contextlib import asynccontextmanager

from fastapi import FastAPI
from playwright.async_api import async_playwright, Browser

browser: Browser | None = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    global browser
    playwright = await async_playwright().start()
    browser = await playwright.chromium.launch()
    yield
    await browser.close()
    await playwright.stop()

app = FastAPI(lifespan=lifespan)
```
Now you have a long-lived browser process. Fine — until it crashes, leaks memory, or gets into a bad state after 10,000 pages. You need monitoring, auto-restart logic, and crash recovery.
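One common mitigation is a watchdog wrapper that drops and recreates the browser whenever an operation on it fails. The sketch below is library-agnostic and illustrative: the `factory` callable stands in for something like `playwright.chromium.launch`, and real code would also close the old instance and cap relaunch attempts.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")

class SelfHealingResource:
    """Recreate a long-lived resource (e.g. a browser) when an operation on it fails."""

    def __init__(self, factory: Callable[[], Awaitable[T]]):
        self._factory = factory
        self._resource: T | None = None
        self._lock = asyncio.Lock()

    async def _get(self) -> T:
        async with self._lock:
            if self._resource is None:
                self._resource = await self._factory()
            return self._resource

    async def call(self, op: Callable[[T], Awaitable]):
        try:
            return await op(await self._get())
        except Exception:
            # Drop the (possibly crashed) resource; the retry relaunches it.
            async with self._lock:
                self._resource = None
            return await op(await self._get())
```

With this pattern, a crashed browser costs one failed page operation plus a relaunch, instead of taking the whole service down.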
### 2. Concurrency and memory
Each browser page takes 50–150MB of RAM. Under FastAPI's async model, 20 concurrent screenshot requests can consume 1–3GB of RAM in browser tabs alone. You need a semaphore or pool:
```python
import asyncio

from fastapi.responses import Response

semaphore = asyncio.Semaphore(5)  # max 5 concurrent screenshots

@app.get("/screenshot")
async def screenshot(url: str):
    async with semaphore:
        page = await browser.new_page()
        try:
            await page.goto(url, timeout=30000)
            await page.wait_for_load_state("networkidle")
            data = await page.screenshot()
        finally:
            await page.close()
    return Response(content=data, media_type="image/png")
```
Now requests queue up when the pool is full. What's the right limit? Depends on your server RAM. Get it wrong and you OOM-kill your API.
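One way to pick that limit is from the memory budget rather than guessing. A rough sizing helper, using the per-page and baseline figures from above as assumptions rather than measured constants:

```python
def max_concurrent_pages(total_ram_mb: int, per_page_mb: int = 150,
                         baseline_mb: int = 512, headroom: float = 0.2) -> int:
    """Rough upper bound on concurrent browser pages for a given RAM budget.

    baseline_mb covers the app plus the shared browser process; headroom
    leaves a safety margin so a burst doesn't trigger the OOM killer.
    """
    usable = total_ram_mb * (1 - headroom) - baseline_mb
    return max(1, int(usable // per_page_mb))
```

On a 4GB container this yields a limit of 18, so `asyncio.Semaphore(max_concurrent_pages(4096))` would be the cautious choice; measure your real per-page usage before trusting the defaults.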
### 3. Docker image size
Playwright + Chromium adds ~600MB to your Docker image and requires system deps:
```dockerfile
# Dockerfile additions required for Playwright
RUN apt-get update && apt-get install -y \
    libnss3 libatk-bridge2.0-0 libdrm2 libxkbcommon0 \
    libgbm1 libasound2 libxss1 libgtk-3-0 \
    && rm -rf /var/lib/apt/lists/*

RUN pip install playwright && playwright install chromium
```
Your lightweight FastAPI service is now a 700MB image with browser binaries baked in.
## The API Approach
Now the same endpoint using a hosted screenshot API:
```python
import os

import httpx
from fastapi import FastAPI, HTTPException
from fastapi.responses import Response

app = FastAPI()
PAGEBOLT_API_KEY = os.environ["PAGEBOLT_API_KEY"]

@app.get("/screenshot")
async def screenshot(url: str):
    async with httpx.AsyncClient(timeout=30) as client:
        resp = await client.post(
            "https://pagebolt.dev/api/v1/screenshot",
            headers={"x-api-key": PAGEBOLT_API_KEY},
            json={"url": url},
        )
    if resp.status_code != 200:
        raise HTTPException(status_code=502, detail="Screenshot service error")
    return Response(content=resp.content, media_type="image/png")
```
That's the whole endpoint. No browser lifecycle. No memory pools. No system dependencies. Your Docker image stays under 100MB.
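For contrast with the Playwright Dockerfile above, the entire image for the API-backed service can be a sketch like this (base image tag and module path are illustrative; a real build would pin versions in a requirements file):

```dockerfile
FROM python:3.12-slim
WORKDIR /app
RUN pip install --no-cache-dir fastapi uvicorn httpx
COPY main.py .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```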
## A Production-Ready FastAPI Service
Here's a complete, production-ready FastAPI app with screenshot, PDF, and full-page capture endpoints:
```python
import logging
import os

import httpx
from fastapi import FastAPI, HTTPException, Query
from fastapi.responses import Response
from pydantic import BaseModel, HttpUrl

logger = logging.getLogger(__name__)

PAGEBOLT_API_KEY = os.environ["PAGEBOLT_API_KEY"]
PAGEBOLT_BASE = "https://pagebolt.dev/api/v1"

app = FastAPI(title="Web Capture Service")

class ScreenshotRequest(BaseModel):
    url: HttpUrl
    width: int = 1280
    height: int = 800
    full_page: bool = False
    dark_mode: bool = False
    block_ads: bool = True

@app.post("/capture/screenshot")
async def capture_screenshot(body: ScreenshotRequest):
    payload = {
        "url": str(body.url),
        "width": body.width,
        "height": body.height,
        "fullPage": body.full_page,
        "darkMode": body.dark_mode,
        "blockAds": body.block_ads,
        "blockBanners": True,
    }
    async with httpx.AsyncClient(timeout=30) as client:
        resp = await client.post(
            f"{PAGEBOLT_BASE}/screenshot",
            headers={"x-api-key": PAGEBOLT_API_KEY},
            json=payload,
        )
    if resp.status_code != 200:
        logger.error("Screenshot failed: %s %s", resp.status_code, resp.text)
        raise HTTPException(status_code=502, detail="Screenshot failed")
    return Response(content=resp.content, media_type="image/png")

@app.post("/capture/pdf")
async def capture_pdf(url: str = Query(..., description="URL to convert to PDF")):
    async with httpx.AsyncClient(timeout=45) as client:
        resp = await client.post(
            f"{PAGEBOLT_BASE}/pdf",
            headers={"x-api-key": PAGEBOLT_API_KEY},
            json={"url": url, "format": "A4", "printBackground": True},
        )
    if resp.status_code != 200:
        raise HTTPException(status_code=502, detail="PDF generation failed")
    return Response(
        content=resp.content,
        media_type="application/pdf",
        headers={"Content-Disposition": "attachment; filename=capture.pdf"},
    )
```
## Handling Errors and Retries
For production services, add retry logic for transient failures:
```python
import asyncio

async def capture_with_retry(url: str, retries: int = 2) -> bytes:
    last_err = None
    for attempt in range(retries + 1):
        try:
            async with httpx.AsyncClient(timeout=30) as client:
                resp = await client.post(
                    f"{PAGEBOLT_BASE}/screenshot",
                    headers={"x-api-key": PAGEBOLT_API_KEY},
                    json={"url": url, "blockBanners": True},
                )
            if resp.status_code == 200:
                return resp.content
            last_err = f"HTTP {resp.status_code}: {resp.text}"
            if resp.status_code < 500:
                break  # 4xx won't succeed on retry; fail fast
        except httpx.TimeoutException:
            last_err = "timeout"
        if attempt < retries:
            await asyncio.sleep(1.5 ** attempt)
    raise HTTPException(
        status_code=502,
        detail=f"Screenshot failed after {attempt + 1} attempt(s): {last_err}",
    )
```
## Streaming the Response to the Client
If your clients are browsers downloading large full-page screenshots, stream the response instead of buffering it in memory:
```python
from fastapi.responses import StreamingResponse

@app.get("/capture/stream")
async def stream_screenshot(url: str):
    async def generate():
        async with httpx.AsyncClient(timeout=30) as client:
            async with client.stream(
                "POST",
                f"{PAGEBOLT_BASE}/screenshot",
                headers={"x-api-key": PAGEBOLT_API_KEY},
                json={"url": url, "fullPage": True},
            ) as resp:
                if resp.status_code != 200:
                    # Caveat: by the time this generator runs, the 200 status
                    # line has already been sent, so raising here aborts the
                    # stream mid-response rather than returning a clean 502.
                    raise HTTPException(status_code=502)
                async for chunk in resp.aiter_bytes(chunk_size=8192):
                    yield chunk

    return StreamingResponse(generate(), media_type="image/png")
```
## Background Tasks for Async Workflows
For high-volume workloads, decouple capture from the HTTP response using FastAPI's BackgroundTasks:
```python
import uuid

from fastapi import BackgroundTasks

# In-memory store: fine for a demo, but per-process and unbounded
results: dict[str, bytes | None] = {}

async def run_capture(job_id: str, url: str):
    try:
        async with httpx.AsyncClient(timeout=30) as client:
            resp = await client.post(
                f"{PAGEBOLT_BASE}/screenshot",
                headers={"x-api-key": PAGEBOLT_API_KEY},
                json={"url": url},
            )
        results[job_id] = resp.content if resp.status_code == 200 else None
    except Exception:
        results[job_id] = None

@app.post("/capture/async")
async def capture_async(url: str, background_tasks: BackgroundTasks):
    job_id = str(uuid.uuid4())
    results[job_id] = None
    background_tasks.add_task(run_capture, job_id, url)
    return {"job_id": job_id, "status": "queued"}

@app.get("/capture/async/{job_id}")
async def get_capture_result(job_id: str):
    if job_id not in results:
        raise HTTPException(status_code=404)
    data = results[job_id]
    if data is None:
        return {"status": "pending or failed"}
    return Response(content=data, media_type="image/png")
```
For production use, replace the in-memory results dict with Redis or a database.
## Comparison: Playwright vs Hosted API in FastAPI
| Factor | Playwright in-process | PageBolt API |
|---|---|---|
| Docker image size | ~700MB | ~80MB |
| RAM per concurrent request | 50–150MB | <1MB |
| Cold start overhead | 1–3s (browser launch) | 0s |
| Browser crash handling | Your problem | Handled |
| Bot detection bypass | Manual tuning | Built-in |
| Infrastructure scaling | Scale with your app | No change needed |
| Lines of boilerplate | 60–100+ | 10–15 |
## When to Use Each Approach
Use Playwright in-process when:
- You need complex multi-step browser automation (form fills, clicks, JS evaluation)
- You're capturing authenticated pages behind a complex session
- You need real-time interaction, not just a page capture
- You have no external network access (air-gapped environment)
Use a hosted screenshot API when:
- Your core service isn't browser automation — screenshots are a feature, not the product
- You want a lightweight Docker image
- You need to scale without provisioning browser infrastructure
- You want bot-detection bypass without ongoing maintenance
- Reliability SLA matters and you don't want to own browser crash recovery
## Getting Started
To run the examples above:
```bash
pip install fastapi uvicorn httpx
```
Get a free API key at pagebolt.dev/dashboard. The free tier includes 100 requests/month with no credit card required.
```bash
export PAGEBOLT_API_KEY=your_key_here
uvicorn main:app --reload

# Test it (the endpoint expects a POST with a JSON body)
curl -X POST "http://localhost:8000/capture/screenshot" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}' \
  --output out.png
```
## Add web capture to your FastAPI app today
PageBolt handles the browser so you don't have to. Screenshot, PDF, full-page, OG image — one API, no infrastructure.
Get your free API key