@oneryalcin
Created May 29, 2026 20:07
Agentic Browsers & Headless Browser Infrastructure — Options Matrix (CDP / self-host / stealth). Maintained reference.

Agentic Browsers & Headless Browser Infrastructure — Options Matrix

Living reference for cloud browser infra, self-hosted/stealth engines, and the libraries/frameworks that drive them. Curated for agentic web automation at scale (e.g. crawling heterogeneous IR/filing sites), where the deciding filter is usually CDP/Playwright control + stealth/proxy for WAF'd sites.

Last verified: 2026-05-29 · Maintainer: @oneryalcin


How to read this

  • CDP / Playwright — does it expose a Chrome DevTools Protocol endpoint (raw wss://) or a Playwright/Puppeteer-compatible connection? This is the must-have for agentic loops that reason over the DOM (find a link, read selectors, build a crawl manifest) rather than over pixels. Tools that only return final HTML (scrape APIs) or only drive via vision/mouse can't host an interactive DOM agent unchanged.
  • Self-host — can you run it on your own infra (e.g. GKE) with no per-session fee?
  • Stealth / Proxy / CAPTCHA — built-in anti-detection, residential proxy routing, and CAPTCHA solving (the lever for the WAF-blocked long tail: Cloudflare / Akamai / Incapsula).
  • Maturity — rough production-readiness, not a benchmark.
  • ✅ = supported · ⚠️ = partial/caveat · ❌ = no/not applicable · ❓ = unverified
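
To make the CDP / Playwright criterion concrete, here is a minimal sketch of what "hosting an interactive DOM agent" means in practice: attach to a vendor-supplied endpoint and read link targets from the live DOM. It assumes Playwright for Python is installed and that the vendor hands you a `ws(s)://` or `http(s)://` CDP URL; `looks_like_cdp_endpoint` and `collect_links` are hypothetical helper names, not part of any vendor SDK.

```python
from urllib.parse import urlparse


def looks_like_cdp_endpoint(endpoint: str) -> bool:
    """Cheap sanity check: a ws(s):// or http(s):// URL with a host part."""
    parsed = urlparse(endpoint)
    return parsed.scheme in {"ws", "wss", "http", "https"} and bool(parsed.netloc)


def collect_links(cdp_endpoint: str, url: str) -> list[str]:
    """Attach to a remote browser over CDP and read link targets from the DOM."""
    if not looks_like_cdp_endpoint(cdp_endpoint):
        raise ValueError(f"not a CDP endpoint: {cdp_endpoint!r}")
    # Imported lazily so the check above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.connect_over_cdp(cdp_endpoint)
        # Reuse the remote browser's default context if it already has one.
        context = browser.contexts[0] if browser.contexts else browser.new_context()
        page = context.new_page()
        page.goto(url, wait_until="domcontentloaded")
        hrefs = page.eval_on_selector_all("a[href]", "els => els.map(e => e.href)")
        browser.close()
        return hrefs
```

A tool that only returns final HTML has no equivalent of the `connect_over_cdp` step, which is why it cannot run this loop unchanged.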

⚠️ Claims discipline: performance numbers (e.g. "11x faster") are vendor/marketing claims unless a benchmark link is given. Architecture facts (CDP support, control model) are verified from each vendor's API reference / docs, not their landing page — landing pages oversell. Re-verify before betting a production run on any single tool.


1. Cloud agentic browsers (managed, CDP/Playwright)

Direct Browserbase-class options. These host the browser; you keep programmatic control.

| Tool | CDP / Playwright | Self-host | Stealth / Proxy / CAPTCHA | Entry price | Maturity | Notes |
|---|---|---|---|---|---|---|
| Browserbase | ✅ CDP + Stagehand SDK | | ✅ (Stealth Mode, custom Chromium) | $39/mo (200 hrs, 3 concurrent); $99/mo (500 hrs, 50 concurrent) | Established (50M sessions/2025, $40M Series B @ $300M) | Market leader; Stagehand adds act()/extract()/observe() AI primitives. Usage-based pricing. |
| Steel.dev | ✅ CDP/Playwright | OSS, Apache-2.0 (Docker) | ✅ (cloud tier) | Cloud: free 100 hrs then $29/mo; self-host ~$5–15/mo infra | Established | Best self-host story. Run fleets on your own GKE, no per-session fee, full CDP. Batteries-included session API + UI. |
| Browserless | ✅ CDP/Playwright/Puppeteer/Selenium | ✅ (Docker, private deploy) | BrowserQL (strongest stealth: hidden-iframe/shadow-DOM verify clicks, CAPTCHA auto-solve) | ~$140/mo cloud; free 1k units; self-host available | Established | Best raw anti-bot engine + self-host. Good for the Cloudflare/Akamai tail. |
| Hyperbrowser (YC) | ✅ CDP/Playwright/Puppeteer + MCP | | ✅ Stealth + Ultra Stealth, fingerprint randomization, CAPTCHA | $30/mo (25 concurrent); $0.10/browser-hr + $10/GB proxy | Newer ("less proven") | Bursts to 10k+ concurrent. HyperAgent framework. LangChain/LlamaIndex/MCP native. |
| Anchor Browser | ✅ (CDP, computer-use flavored) | | ✅ stealthy Chromium fork + built-in VPN (no 3rd-party proxy) | $20 starter; $0.05/browser-hr + proxy; full-stealth tier $2,000/mo | Newer | Auth/identity handling focus; "plan deterministically, revert to AI at runtime." Full-stealth tier pricey. |
| Solari Browser | connectOverCDP(session.cdpEndpoint) + Puppeteer browserWSEndpoint; wire-protocol mode on patchright-core@1.59.3 | | ✅ stealth + residential proxy + CAPTCHA (opt-in) | Free; $20/mo Starter (20 concurrent, 200 hrs); Enterprise custom | Research preview | Control is DOM/CDP ("anything that works in Playwright works on Solari") despite landing page's "vision/keyboard/mouse" wording. Cheap; sub-second start. Maturity caveat. |
| Cloudflare Browser Rendering / Browser Run | ✅ CDP (Playwright/Puppeteer) + MCP client | | ⚠️ (CF infra; limited stealth claims) | Usage-based on CF | Established (CF) | Serverless headless Chrome; integrates if you're already on Cloudflare. MCP support for Claude/Cursor. |

2. Self-hosted / stealth engines (OSS, run-it-yourself)

For maximum control + cost optimization at scale. You bring infra (and DevOps).

| Tool | CDP / Playwright | Stealth | Footprint / Lang | Stars / Status | Notes |
|---|---|---|---|---|---|
| Playwright (Microsoft) | ✅ native (CDP for Chromium; own protocol for FF/WebKit) | ❌ baseline (detectable) | Node/Python/Java/.NET | De-facto standard | The control layer everything else wraps. No stealth on its own. |
| Puppeteer (Google) | ✅ CDP (Chromium/Chrome) | ❌ baseline | Node | De-facto standard | Chrome-focused; puppeteer-extra-plugin-stealth adds patches. |
| Patchright | ✅ Playwright-compatible (drop-in) | ✅ stealth-patched Playwright runtime | Python/Node | Active | Stealth fork of Playwright. Note: Solari's wire-protocol mode pins patchright-core@1.59.3. |
| Camoufox (daijro) | ✅ Playwright-compatible | ✅ Firefox fork, C-level fingerprint spoofing (canvas/WebGL/screen/navigator), Playwright agent sandboxed | ~200 MB / C++ | ~8.8k★, active again | ⚠️ Caveats: ~1-yr maintenance gap, degraded on newer Firefox base; open issue #555: Akamai blocks Camoufox but not stock Firefox via Playwright. Strong concept, verify stealth before trusting on hard sites. MPL-2.0. |
| nodriver / undetected-chromedriver | ⚠️ CDP-ish (Selenium lineage) | ✅ anti-detection | Python | Popular | Common stealth route for Selenium-based stacks. |
| Lightpanda | ✅ CDP (Playwright/Puppeteer) + MCP | ❌ (proxy/header support, no fingerprint stealth) | ~9x less mem (claim) / Zig, V8 JS | ~30.7k★, beta | ⚠️ Does not render graphically; the "11x faster" claim (unbenchmarked here) comes from skipping rendering. Great cheap fast-path for static-HTML sites; coverage gaps on complex JS/dynamic sites. Not a Browserbase replacement for hard cases. |
| Obscura (h4ckf0r0day) | ✅ CDP | ✅ stealth | ~30 MB (claim) / Rust, V8 | ~13.9k★, active | Real & maintained but no production track record found: fine for a spike, too green for a 48k run. |

3. Agent frameworks (drive a browser; not infra themselves)

These sit on top of an engine/infra. Different layer from §1–2.

| Tool | What it is | Notes |
|---|---|---|
| Stagehand (Browserbase) | Playwright + AI primitives (act/extract/observe) | OSS SDK; bridges scripted Playwright and full agents. |
| browser-use | "Make websites accessible for AI agents" | Popular OSS agent loop; can drive CDP browsers (incl. self-hosted). |
| Skyvern | LLM + computer-vision automation (no brittle selectors) | OSS, Playwright-compatible, CAPTCHA + 2FA. Competes with a custom manifester agent, not with the infra. |
| Vercel agent-browser | Rust CLI giving agents browser control | Commands in the style of `agent-browser click @e2`. |

4. Proxy / unblocker APIs (the WAF long-tail)

Reach for these for the small % of sites blocked by Akamai/Cloudflare/Incapsula. Most expose a CDP-compatible "scraping browser" or a fetch-style unblocker. Fetch-only APIs (no CDP) can't host an interactive DOM agent — use as fallback fetchers, not as the manifester.
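
The "fallback fetcher" role can be sketched as an ordered chain: try the interactive CDP browser first, and fall through to a fetch-style unblocker only when the response looks blocked. This is a minimal sketch under stated assumptions. `FetchResult`, `fetch_with_fallback`, and the `BLOCKED_STATUSES` set are hypothetical names and values, and each fetcher is whatever callable wraps your actual provider.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

# Status codes often seen on WAF challenges or rate limits.
# This set is an assumption; tune it per target site.
BLOCKED_STATUSES = {403, 429, 503}


@dataclass
class FetchResult:
    status: int
    html: str
    via: str  # name of the tier that produced this result


Fetcher = Callable[[str], FetchResult]


def fetch_with_fallback(url: str, fetchers: Iterable[Fetcher]) -> Optional[FetchResult]:
    """Try each fetcher in order; return the first unblocked result, else None."""
    for fetch in fetchers:
        try:
            result = fetch(url)
        except Exception:
            continue  # provider or network error: move on to the next tier
        if result.status not in BLOCKED_STATUSES:
            return result
    return None
```

The chain only returns HTML, so it stays a fetcher: link discovery and manifest building still belong to the CDP-driven agent.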

| Tool | CDP browser? | Notes / benchmark |
|---|---|---|
| Bright Data Scraping Browser / Web Unlocker | ✅ CDP-compatible | Enterprise unblocking on their proxy network. |
| Zyte API | ⚠️ rendering + actions API | Led Proxyway 2025 benchmark: 93.14% success on 15 heavily-protected sites. |
| Oxylabs Web Unblocker / Scraper API | ⚠️ headless rendering | Agentic tooling (OxyCopilot); sub-second pre-indexed content for RAG. |
| ScrapingBee | ❌ no CDP | 84.47% Proxyway 2025. Simple API; fallback fetcher only, not a manifester fit. |

5. Off-category (not for headless agentic scraping)

Listed to prevent re-evaluation — these are terminal browsers for humans over SSH, not headless agent infra.

| Tool | Why it's here / why it's off-category |
|---|---|
| Carbonyl | Full Chromium in your terminal (~19k★) but last push 2024 (going stale). For human SSH use, not agent automation. |
| Browsh | Text-mode browser backed by headless Firefox; renders to terminal. Human tool. |

Decision guide (for DOM-driven agentic crawling at scale)

  1. Want self-host + no per-session fee + full CDP? → Steel (primary candidate).
  2. Need the strongest stealth for Cloudflare/Akamai sites? → Browserless / BrowserQL (self-host option) or a proxy-unblocker (Bright Data / Zyte) for the tail.
  3. Want cheap managed + fast start, OK with preview maturity? → Hyperbrowser or Solari (both CDP-compatible).
  4. Have a huge static-HTML majority and want to slash cost/latency? → spike Lightpanda as a fast-path engine, fall back to full Chromium for dynamic sites.
  5. Self-hosting stealth specifically? → Patchright (Playwright drop-in) > Camoufox (verify Akamai regression first).
  6. Avoid for DOM agents: fetch-only scrape APIs (ScrapingBee) and vision-only / terminal browsers.
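
The guide above can be sketched as a small lookup, useful if the shortlist needs to live in a planning script rather than in someone's head. The `Needs` flags and `shortlist` function are hypothetical names introduced for illustration; the recommendations are copied from rules 1 to 5 and carry the same caveats.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Hypothetical flags, one per question in the decision guide."""
    self_host_full_cdp: bool = False        # rule 1
    strongest_stealth: bool = False         # rule 2
    cheap_managed_preview_ok: bool = False  # rule 3
    static_html_majority: bool = False      # rule 4
    self_hosted_stealth: bool = False       # rule 5


def shortlist(needs: Needs) -> list[str]:
    """Return candidate tools in the same order as the guide's rules."""
    picks: list[str] = []
    if needs.self_host_full_cdp:
        picks.append("Steel")
    if needs.strongest_stealth:
        picks.append("Browserless / BrowserQL, or Bright Data / Zyte for the tail")
    if needs.cheap_managed_preview_ok:
        picks.append("Hyperbrowser or Solari")
    if needs.static_html_majority:
        picks.append("Lightpanda fast path, full Chromium fallback")
    if needs.self_hosted_stealth:
        picks.append("Patchright, then Camoufox after verifying the Akamai regression")
    return picks
```

For example, `shortlist(Needs(self_host_full_cdp=True, static_html_majority=True))` yields Steel plus the Lightpanda fast path, which matches combining rules 1 and 4 by hand.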

At large scale, browser infra is often a minor line item vs LLM cost — optimize for coverage on protected sites and avoiding lock-in (favor CDP-standard + self-hostable) over squeezing per-session price.


Changelog

  • 2026-05-29 — Initial version. Verified CDP support for Solari (docs: session.cdpEndpoint, connectOverCDP), Camoufox issue #555 (Akamai regression), GitHub stars/activity for Lightpanda/Camoufox/Obscura/Carbonyl. Confirmed "Solari Browser" is a real product (earlier dismissal retracted).

Maintenance notes

  • Re-verify pricing + maturity quarterly; vendors change tiers often.
  • When adding a tool: settle CDP support from its API reference / connection code, not the homepage.
  • Mark every perf number as a claim unless paired with a benchmark link.
  • Quick repo health check: curl -s https://api.github.com/repos/<owner>/<repo> | jq '{stars:.stargazers_count, lang:.language, pushed:.pushed_at, archived}'
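
For scripted re-verification, the same check can be sketched in Python with only the standard library. `summarize_repo` and `repo_health` are hypothetical helper names; the field names mirror the jq filter above, and the request is unauthenticated, so GitHub's anonymous rate limit applies.

```python
import json
import urllib.request


def summarize_repo(payload: dict) -> dict:
    """Pick the same fields as the jq filter from a GitHub repo payload."""
    return {
        "stars": payload.get("stargazers_count"),
        "lang": payload.get("language"),
        "pushed": payload.get("pushed_at"),
        "archived": payload.get("archived"),
    }


def repo_health(owner: str, repo: str) -> dict:
    """Fetch repo metadata from the public GitHub REST API and summarize it."""
    url = f"https://api.github.com/repos/{owner}/{repo}"
    req = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return summarize_repo(json.load(resp))
```

Keeping `summarize_repo` separate from the network call makes it easy to run over saved API responses when re-checking many repos at once.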