Why Puppeteer keeps timing out in production (and what to do instead)
Common causes of Puppeteer timeouts in production. Memory leaks, cold starts, SPA rendering. When to use a REST API instead.
Your Puppeteer screenshot works locally. Takes 2 seconds. You deploy to production.
Suddenly: timeouts. Every 3rd request fails. Your error logs are full of:
TimeoutError: Waiting for navigation to "https://example.com" failed: Timeout 30000ms exceeded
or
Error: Browser.newPage(): target page crashed
You're not alone. This is the #1 problem with running Puppeteer in production.
Why Puppeteer times out
1. Memory exhaustion
Each Puppeteer instance holds a browser process (~150MB base + page overhead). Under load, memory fills up fast.
// This looks fine...
for (let i = 0; i < 1000; i++) {
  const page = await browser.newPage();
  await page.goto(url);
  const screenshot = await page.screenshot();
  // FORGOT TO CLOSE THE PAGE
  // await page.close(); // ← This line is missing
}
Forget to close pages and memory bloats: the browser slows down, and the next page times out.
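The fix is mechanical: close the page in a `finally` block so it is released even when navigation throws. A minimal sketch (the `withPage` helper name is ours, not a Puppeteer API):

```javascript
// Hypothetical helper: runs `fn` with a fresh page, always closing it after.
async function withPage(browser, fn) {
  const page = await browser.newPage();
  try {
    return await fn(page);
  } finally {
    await page.close(); // runs on success AND on error/timeout
  }
}

// Usage:
// const shot = await withPage(browser, async (page) => {
//   await page.goto(url);
//   return page.screenshot();
// });
```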
2. Cold start penalty
Spawning a new browser process takes 5–15 seconds on first call:
// First request to your server
const browser = await puppeteer.launch(); // ← 8 seconds gone
const page = await browser.newPage();
await page.goto(url, { timeout: 30000 }); // ← only 22 seconds of budget left
const screenshot = await page.screenshot();
Your timeout is 30 seconds total. You've burned 8 seconds just starting the browser. Network hiccup? Timeout.
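The usual mitigation is to pay the launch cost once, at startup, and reuse the browser across requests. A sketch under that assumption; `makeBrowserGetter` is a hypothetical helper, with the launcher injected so the pattern is easy to test:

```javascript
// Hypothetical singleton factory: first call launches, later calls reuse.
function makeBrowserGetter(launch) {
  let browserPromise = null;
  return function getBrowser() {
    if (!browserPromise) {
      // Store the promise, not the browser, so concurrent first
      // requests share one launch instead of racing.
      browserPromise = launch();
    }
    return browserPromise;
  };
}

// Usage (assumes puppeteer is installed):
// const getBrowser = makeBrowserGetter(() =>
//   puppeteer.launch({ args: ['--no-sandbox'] })
// );
// const browser = await getBrowser(); // fast after the first call
```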
3. Single-page app rendering lag
Modern SPAs don't render on initial HTML. They load, fetch data, render.
await page.goto(url, {
  waitUntil: 'networkidle2' // ← Waits for network quiet
});
If the SPA has a bug and keeps fetching data, networkidle2 waits forever (or until timeout). One bad third-party API call → entire screenshot times out.
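A more robust pattern is to stop waiting on the network entirely and wait for the element that proves the app rendered, with a bounded timeout per step. A sketch; the `#app-ready` selector and the per-step budgets are illustrative, not defaults:

```javascript
// Hypothetical wrapper: bounded per-step waits instead of networkidle2.
async function screenshotWhenReady(page, url, readySelector) {
  // Don't wait for network quiet; a chatty SPA can keep fetching forever.
  await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 15000 });
  // Wait for the element that signals "rendered", not for silence.
  await page.waitForSelector(readySelector, { timeout: 10000 });
  return page.screenshot({ type: 'png' });
}
```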
4. Resource exhaustion under concurrency
Request 1: Browser process #1 (150MB)
Request 2: Browser process #2 (150MB)
Request 3: Browser process #3 (150MB)
...
Request 10: Out of memory, killed
Now requests 1–9 fail because the OS killed the browser.
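The usual defense is a hard cap on concurrency: queue incoming jobs and never run more than N at once. Libraries like p-limit do this; here is a minimal hand-rolled sketch:

```javascript
// Minimal concurrency gate: at most `max` jobs run at once; extras queue.
function createLimiter(max) {
  let active = 0;
  const queue = [];
  const pump = () => {
    while (active < max && queue.length > 0) {
      active++;
      const { job, resolve, reject } = queue.shift();
      job()
        .then(resolve, reject)
        .finally(() => { active--; pump(); });
    }
  };
  return (job) =>
    new Promise((resolve, reject) => {
      queue.push({ job, resolve, reject });
      pump();
    });
}

// Usage:
// const limit = createLimiter(5); // never more than 5 concurrent browsers
// const shot = await limit(() => takeScreenshot(url));
```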
5. DNS and network issues
If DNS is slow (even 1 second extra) or the page takes 20 seconds to load, timeout. Your 30-second window evaporates faster than expected.
Why self-hosting Puppeteer is fragile at scale
You think the problem is your code. It's not. It's the architecture.
Puppeteer at scale requires:
- Browser pool management (pre-spawn, recycle, health checks)
- Memory monitoring (kill old processes before OOM)
- Timeout handling (retry logic, fallbacks)
- Load balancing (distribute across multiple instances)
- Logging/debugging (understand why timeouts happen)
One EC2 instance can handle ~5–10 concurrent Puppeteer requests. Scale to 100 concurrent? You need 10–20 instances. Now you're managing Kubernetes, session affinity, auto-scaling, health checks, and cost ($500+/month). And it's still fragile.
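To give a flavor of what "recycle" in that list means in practice, here is a minimal recycle-after-N-pages sketch. The names are ours and the launcher is injected (e.g. `puppeteer.launch`); a real pool also needs pre-spawning and health checks:

```javascript
// Hypothetical recycler: restart the browser every `maxPages` pages
// so memory growth stays bounded.
function makePageSource(launch, maxPages = 50) {
  let browser = null;
  let pagesServed = 0;
  return async function getPage() {
    if (!browser || pagesServed >= maxPages) {
      if (browser) await browser.close(); // recycle the old process
      browser = await launch();
      pagesServed = 0;
    }
    pagesServed++;
    return browser.newPage();
  };
}
```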
Solution: REST API (2–3 second latency, zero complexity)
// e.g. inside an Express route handler
const response = await fetch('https://pagebolt.dev/api/v1/screenshot', {
  method: 'POST',
  headers: {
    'x-api-key': process.env.PAGEBOLT_API_KEY,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://example.com',
    format: 'png'
  })
});

const screenshot = await response.arrayBuffer();
res.set('Content-Type', 'image/png');
res.send(Buffer.from(screenshot)); // Express expects a Buffer, not an ArrayBuffer
That's it. No memory management. No timeout handling. No infrastructure.
Real comparison: Puppeteer vs API
Puppeteer (production headache)
const puppeteer = require('puppeteer');

let browser;

async function init() {
  browser = await puppeteer.launch({
    args: ['--no-sandbox', '--disable-dev-shm-usage']
  });
}

async function takeScreenshot(url) {
  const page = await browser.newPage();
  try {
    await page.setViewport({ width: 1280, height: 720 });
    await Promise.race([
      page.goto(url, { waitUntil: 'networkidle2', timeout: 30000 }),
      new Promise((_, reject) =>
        setTimeout(() => reject(new Error('Custom timeout')), 25000)
      )
    ]);
    return await page.screenshot({ type: 'png' }); // Puppeteer uses `type`, not `format`
  } catch (error) {
    console.error('Screenshot failed:', error);
    throw error;
  } finally {
    await page.close(); // otherwise a timeout leaks the page
  }
}

// Monitor memory, restart if needed
setInterval(async () => {
  const memory = process.memoryUsage();
  if (memory.heapUsed > 1e9) { // 1GB
    console.log('Memory high, restarting browser...');
    await browser.close();
    browser = await puppeteer.launch({ args: ['--no-sandbox'] });
  }
}, 60000);
Dozens of lines of boilerplate, manual memory management, and crash recovery logic. And it's still fragile.
REST API (a dozen lines, total)
async function takeScreenshot(url) {
  const response = await fetch('https://pagebolt.dev/api/v1/screenshot', {
    method: 'POST',
    headers: {
      'x-api-key': process.env.PAGEBOLT_API_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ url, format: 'png' })
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return await response.arrayBuffer();
}
No memory management. No crash recovery. No timeout logic. The managed service handles it.
Cost reality
| Item | Puppeteer | REST API |
|---|---|---|
| Infrastructure | $50–200/month | $0 |
| Scaling | 10 instances = $500/month | Auto-scales, same cost |
| DevOps time | 5–10 hours/month | 0 hours |
| Timeout debugging | 10+ hours/month | Never |
| Total effective cost | $1,500–2,000/month | $50–100/month |
For most teams: REST API wins by 10–20x.
When to keep Puppeteer
Keep self-hosted Puppeteer if:
- ✅ Processing 10,000+ screenshots/day (economies of scale)
- ✅ Data residency requirement (EU data can't leave EU)
- ✅ Dedicated DevOps team already maintaining it
For everyone else: use an API.
Getting started
- Sign up at pagebolt.dev (free: 100 requests/month)
- Replace Puppeteer code with fetch() calls (5 minutes)
- Deploy and never think about Puppeteer timeouts again
Stop debugging Puppeteer crashes at 2 AM. Let a managed service handle it.
Stop managing Puppeteer in production
Free tier: 100 screenshots/month. No credit card. Replace your timeout-prone setup in 10 minutes.