Spider Browser Scores 85% on Browser Use's Stealth Benchmark

Browser Use open-sourced a stealth benchmark testing cloud browsers against 80 anti-bot protected sites. We ran it with Spider Browser and scored 85%.

4 min read Jeff Mendez

Browser Use recently open-sourced a stealth benchmark that tests cloud browser providers against anti-bot protected websites. It is a well-designed test, and we were curious how Spider Browser would perform on it.

We forked their repo, added Spider Browser as a provider, and ran the same 80 tasks with the same LLM judge and methodology. Spider Browser scored 85%, compared to Browser Use Cloud’s 81%.

No modifications to the test suite. No cherry-picked results. The benchmark and our full results are open source at github.com/spider-rs/benchmark.

The benchmark

Browser Use built Stealth Bench V1 from 300,000 security check events in their production traffic. The test set covers 80 websites protected by every major anti-bot vendor: Cloudflare, PerimeterX, Datadome, Akamai, reCaptcha, hCaptcha, GeeTest, Kasada, Shape, and custom solutions.

Each task is simple by design. Navigate to a protected site, perform a basic interaction, read content. Three steps, no authentication. If it fails, the browser got blocked.

An LLM judge evaluates each result. It only checks whether the agent was blocked, not whether it completed the task. Page load failures count as blocks.
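The scoring rule above can be sketched in a few lines. This is an illustration only, not the benchmark's actual code: the `TaskResult` record and its `page_loaded` and `judged_blocked` fields are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    """Hypothetical record for one benchmark task (field names are assumptions)."""
    site: str
    page_loaded: bool      # False if the page never loaded
    judged_blocked: bool   # verdict from the LLM judge


def passed(result: TaskResult) -> bool:
    # A page load failure counts as a block, regardless of the judge verdict.
    return result.page_loaded and not result.judged_blocked


def pass_rate(results: list[TaskResult]) -> float:
    """Fraction of tasks that were not blocked (0.0 for an empty list)."""
    if not results:
        return 0.0
    return sum(passed(r) for r in results) / len(results)


results = [
    TaskResult("a.example", page_loaded=True, judged_blocked=False),
    TaskResult("b.example", page_loaded=True, judged_blocked=True),
    TaskResult("c.example", page_loaded=False, judged_blocked=False),
    TaskResult("d.example", page_loaded=True, judged_blocked=False),
]
print(f"{pass_rate(results):.0%}")  # 2 of the 4 toy tasks pass
```

Note that task completion never enters the calculation, which matches the judge's narrow job of detecting blocks.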

Browser Use validated the difficulty with two baselines: headless Chromium scored 2%, headful Chromium scored 50%. The tasks are hard.

Results

Stealth Bench V1: Pass Rate by Provider

80 anti-bot protected websites. Higher is better.

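| Provider | Score | Pass Rate |
| --- | --- | --- |
| Spider Cloud | 68/80 | 85% |
| Browser Use Cloud | 64/80 | 81% |
| Anchor | 59/80 | 74% |
| Onkernel | 54/80 | 68% |
| Browserless | 45/80 | 56% |
| Local Headful | 39/80 | 49% |
| Steel | 35/80 | 44% |
| Hyperbrowser | 35/80 | 44% |
| Browserbase | 33/80 | 41% |
| Local Headless | 2/80 | 3% |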

Spider Browser came in 4 percentage points ahead of Browser Use Cloud. We ran the benchmark multiple times and the gap held steady.

Category breakdown

The benchmark groups sites by anti-bot vendor. Here is how the top providers perform across each category:

[Figure: Bypass Rate by Anti-Bot Vendor. Top 5 providers compared across each anti-bot category; outer edge = 100%.]

Head-to-Head by Anti-Bot Vendor

Bypass rate per vendor for the top 5 providers.

| Provider | Cloudflare | PerimeterX | Datadome | Akamai | reCaptcha | Others |
| --- | --- | --- | --- | --- | --- | --- |
| Spider Cloud | 96% | 83% | 77% | 62% | 100% | 67% |
| Browser Use | 93% | 81% | 69% | 85% | 80% | 40% |
| Anchor | 78% | 81% | 77% | 69% | 83% | 50% |
| Onkernel | 81% | 31% | 62% | 100% | 89% | 67% |
| Browserless | 59% | 28% | 69% | 67% | 83% | 11% |

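A few results we were happy to see: 96% against Cloudflare (the most common anti-bot on the web) and 100% against reCaptcha. Spider Browser scored highest overall and had the top rate in five of six categories: outright in Cloudflare, PerimeterX, and reCaptcha, and tied in Datadome (with Anchor) and Others (with Onkernel). Akamai was the exception, at 62%.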

PerimeterX and Datadome remain tough for every provider. Spider came in at 83% and 77%, which is solid but leaves room to improve.
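To read the category leaders straight off the numbers, the short script below recomputes them from the head-to-head table. The rates are copied from that table; the function name and data layout are just choices made for this example, and a tie lists every provider that reaches the top rate.

```python
# Bypass rates (%) copied from the head-to-head table above.
CATEGORIES = ["Cloudflare", "PerimeterX", "Datadome", "Akamai", "reCaptcha", "Others"]
RATES = {
    "Spider Cloud": [96, 83, 77, 62, 100, 67],
    "Browser Use": [93, 81, 69, 85, 80, 40],
    "Anchor": [78, 81, 77, 69, 83, 50],
    "Onkernel": [81, 31, 62, 100, 89, 67],
    "Browserless": [59, 28, 69, 67, 83, 11],
}


def leaders_by_category(rates, categories):
    """Map each category to (best rate, providers reaching it); ties keep all."""
    out = {}
    for i, category in enumerate(categories):
        best = max(row[i] for row in rates.values())
        out[category] = (best, [p for p, row in rates.items() if row[i] == best])
    return out


for category, (best, providers) in leaders_by_category(RATES, CATEGORIES).items():
    print(f"{category:<11} {best:>3}%  {', '.join(providers)}")
```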

How we ran it

We forked Browser Use’s benchmark repo, added spider-cloud as a browser provider using the spider-browser SDK, and ran the evaluation script unchanged.

import os

# SpiderBrowserOptions is provided by the spider-browser SDK.
opts = SpiderBrowserOptions(
    api_key=os.environ["SPIDER_API_KEY"],
    server_url="wss://browser.spider.cloud",
    captcha="solve",
    smart_retry=True,
    stealth=0,
    max_stealth_levels=3,
    hedge=True,
    mode="scraping",
)

That’s the full configuration. No manual retry logic, no per-site tuning, no special browser flags. The SDK handles detection and escalation automatically. When a site serves a challenge, Spider Browser detects it and works through it without any code on your end.
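For intuition only, escalation of this kind can be pictured as a loop that retries at a higher stealth level whenever a block is detected. The sketch below is not Spider Browser's implementation or API: `fetch_at_level` and `looks_blocked` are hypothetical callables, and the default level numbers simply echo the `stealth=0` and `max_stealth_levels=3` values in the configuration, whose exact meaning inside the SDK is an assumption here.

```python
from typing import Callable, Optional


def fetch_with_escalation(
    url: str,
    fetch_at_level: Callable[[str, int], str],
    looks_blocked: Callable[[str], bool],
    start_level: int = 0,
    max_levels: int = 3,
) -> Optional[str]:
    """Try successive stealth levels until a response is not blocked.

    Returns the first unblocked page body, or None if every level was blocked.
    """
    for level in range(start_level, start_level + max_levels):
        body = fetch_at_level(url, level)
        if not looks_blocked(body):
            return body
    return None


# Toy stand-ins so the sketch runs without any network access.
def fake_fetch(url: str, level: int) -> str:
    return "challenge page" if level < 2 else f"content of {url}"


def fake_blocked(body: str) -> bool:
    return "challenge" in body


print(fetch_with_escalation("https://example.com", fake_fetch, fake_blocked))
```

In the toy run, levels 0 and 1 return a challenge page and level 2 returns content, so the loop stops at the third attempt.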

All 80 tasks completed in under 10 minutes total.

What makes Spider Browser different

Spider Browser is not a wrapper around someone else’s browser. It is a custom browser, written in Rust, with Chromium, Firefox, and Safari engines built into a single runtime. Each engine is tuned to look and behave like a real person on a real device. You call one API. The browser decides what engine, profile, and approach to use for each request, and adapts on the fly if something gets blocked.

We think the 85% score reflects that design. It is not one technique doing the heavy lifting. It is a browser built from day one to handle whatever the web throws at it.

Not just stealth

Stealth gets you past the front door. What happens after matters just as much.

In our previous benchmark, we tested 999 URLs across 254 domains and achieved a 100% pass rate with a 2.5s median end-to-end latency. That test covered everything from simple static sites to aggressive WAFs, with full content extraction and screenshots.

Spider Browser gives you the complete pipeline: stealth, extraction, screenshots, structured data, and AI-powered interactions, all through a single SDK.

import { SpiderBrowser } from "spider-browser";

const spider = new SpiderBrowser({
  apiKey: process.env.SPIDER_API_KEY,
  stealth: 0,
});

await spider.init();
await spider.page.goto("https://protected-site.com");

// Structured extraction from any page
const data = await spider.page.extractFields({
  title: "h1",
  price: ".price",
  description: ".product-desc",
});

Reproduce it

Everything is open source. Clone the repo, set your API key, run the benchmark.

git clone https://github.com/spider-rs/benchmark.git
cd benchmark
pip install uv && uv sync

# Decrypt the task set
python -c "
import base64, hashlib, json
from cryptography.fernet import Fernet
key = base64.urlsafe_b64encode(hashlib.sha256(b'Stealth_Bench_V1').digest())
tasks = json.loads(Fernet(key).decrypt(base64.b64decode(open('Stealth_Bench_V1.enc').read())))
json.dump(tasks, open('Stealth_Bench_V1.json', 'w'), indent=2)
"

# Run with Spider Browser
SPIDER_API_KEY=your-key uv run python run_stealth_eval.py

Results land in stealth_bench/official_results/. Compare against every other provider’s results already committed to the repo.

What this means

Credit to Browser Use for building and open-sourcing this benchmark. The web scraping space needs more transparent, reproducible comparisons, and Stealth Bench V1 is a strong step in that direction.

We are proud of the 85% result, but we also know anti-bot detection is a moving target. These scores will shift as vendors update their systems. We plan to keep running this benchmark and publishing results openly, because that is how trust gets built.

If you want to try Spider Browser, head over to spider.cloud. Set your API key, point it at any URL, and get content back.
