Open Source vs Cloud

Same crawler. Less ops.

The core crawling engine is identical. The difference is everything around it: who manages the proxies, the browsers, the scaling, and the anti-bot logic.

Open Source
# you handle everything
cargo add spider
# set up proxy rotation
# configure headless Chrome
# manage server scaling
# deal with bans + retries
# write markdown cleaning
# build extraction pipeline
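The "write markdown cleaning" item above is real engineering work. As a toy illustration only (this is not Spider's actual pipeline, and a production cleaner must also drop script/style bodies, decode entities, and strip per-site boilerplate), here is the kind of tag-stripping code you end up owning:

```rust
// Toy sketch of the "write markdown cleaning" step you own when self-hosting.
fn strip_tags(html: &str) -> String {
    let mut out = String::new();
    let mut in_tag = false;
    for c in html.chars() {
        match c {
            '<' => in_tag = true,
            // Replace each tag with a space so adjacent text doesn't fuse.
            '>' => {
                in_tag = false;
                out.push(' ');
            }
            _ if !in_tag => out.push(c),
            _ => {} // character inside a tag: drop it
        }
    }
    // Collapse the whitespace runs left behind by removed tags.
    out.split_whitespace().collect::<Vec<_>>().join(" ")
}

fn main() {
    let html = "<h1>Docs</h1><p>Same crawler. Less ops.</p>";
    println!("{}", strip_tags(html)); // prints: Docs Same crawler. Less ops.
}
```

Edge cases like comments, CDATA, and malformed markup are exactly why this step keeps growing once real sites are involved.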
Spider Cloud
curl https://api.spider.cloud/crawl \
  -H "Authorization: Bearer $KEY" \
  -d '{
    "url": "https://example.com",
    "limit": 100
  }'

✓ proxies, stealth, scaling handled

What changes when you go managed

Scaling
  Open source: you size and manage your own fleet
  Spider Cloud: elastic; 10 pages or 10 million, same API call
Proxies
  Open source: bring your own, rotate them yourself
  Spider Cloud: rotated per request with real-time block detection
Anti-bot
  Open source: basic fingerprint randomization via libraries
  Spider Cloud: full stealth engine, fingerprinting baked into every request
Browser
  Open source: you run headless Chrome or Firefox
  Spider Cloud: custom Rust browser; faster, lighter, built for scraping
Markdown
  Open source: html2md library output
  Spider Cloud: per-site tuning that strips noise for cleaner LLM inputs
AI extraction
  Open source: not included
  Spider Cloud: send a schema, get structured JSON; no parsers to maintain

Self-host it

You want full control. You have your own proxies, your own servers, and a team that can maintain the pipeline. Or you need to run on-prem for compliance.

cargo add spider
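A minimal crawl with the spider crate looks roughly like this. This is a sketch based on the library's published examples; method names and features can change between releases, so check the docs for the version you install:

```rust
use spider::website::Website;
use spider::tokio;

#[tokio::main]
async fn main() {
    // You own everything past this point: proxies, scaling, bans, retries.
    let mut website = Website::new("https://example.com");
    website.crawl().await;
    for link in website.get_links() {
        println!("- {:?}", link.as_ref());
    }
}
```

Note that this fetches a live site, so it needs network access, and proxy rotation, stealth, and markdown cleanup are all still yours to wire in.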

Let us run it

You want data, not infrastructure. Sites are blocking you and you need proxy intelligence. Or you just want to ship faster and skip the ops work.

curl https://api.spider.cloud/crawl

Both paths use the same Rust crawling engine. The open source library is MIT-licensed and always will be. Spider Cloud is for when you want someone else to handle the hard parts.

Try Spider Cloud free

Free credits on signup. No credit card required.