API Features
A complete toolkit for web data collection. Crawl, scrape, search, screenshot, transform, and extract, all from a single API.
Crawl
POST /crawl
Recursively crawl entire websites and collect every page. Set depth limits, respect robots.txt, and get structured output in markdown, HTML, or plain text.
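A request to this endpoint might be assembled as in the sketch below. Only the `POST /crawl` path and Bearer authentication come from this page; the base URL `https://api.spider.cloud` and the body fields `url`, `depth`, and `return_format` are assumptions made for illustration and should be checked against the API reference. The request is built but not sent.

```python
import json
import urllib.request

# Assumed base URL; confirm it in the API reference.
BASE_URL = "https://api.spider.cloud"


def build_crawl_request(
    api_key: str, url: str, depth: int, return_format: str
) -> urllib.request.Request:
    """Build (but do not send) a POST /crawl request.

    The body field names are illustrative assumptions.
    """
    body = {"url": url, "depth": depth, "return_format": return_format}
    return urllib.request.Request(
        f"{BASE_URL}/crawl",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_crawl_request(
    "YOUR_API_KEY", "https://example.com", depth=2, return_format="markdown"
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the payload easy to inspect first.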
Scrape
POST /scrape
Extract content from individual pages with precision. Optimized for single-page extraction with CSS selectors, metadata, and multiple output formats.
Search
POST /search
Perform search engine queries and automatically crawl the results. Combine search discovery with content extraction in a single step.
Screenshot
POST /screenshot
Capture high-quality screenshots of any web page. Full-page or viewport captures are returned as base64 or binary, with configurable format and quality.
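A capture returned as base64 has to be decoded before it can be written to disk. The sketch below assumes the encoded image is already available as a string. Where that string sits inside the response body is not specified on this page, so the hypothetical helper `save_base64_image` takes it directly, and the sample payload is a stand-in rather than a real screenshot.

```python
import base64
import tempfile
from pathlib import Path


def save_base64_image(encoded: str, path: Path) -> int:
    """Decode a base64 string, write the bytes to path, return the byte count."""
    raw = base64.b64decode(encoded, validate=True)
    path.write_bytes(raw)
    return len(raw)


# Stand-in payload: the 8-byte PNG signature, not a real screenshot.
sample = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "screenshot.png"
    written = save_base64_image(sample, out)
    saved = out.read_bytes()

print(written)
```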
Transform
POST /transform
Convert raw HTML into clean markdown, plain text, or sanitized HTML. Process content offline without re-fetching pages from the web.
Unblocker
POST /unblocker
Access content behind anti-bot protections and challenging security measures. Advanced fingerprinting and session management for protected sites.
AI Extraction
POST /pipeline/*
Extract structured data using AI-powered pipelines. Pull contacts, generate Q&A pairs, label websites, and filter links with built-in intelligence.
Links
POST /links
Collect all links from a website without extracting page content. Optimized for sitemap generation, link analysis, and URL discovery at lower cost.
Fetch (Alpha)
POST /fetch/{domain}/{path}
Per-website fetch APIs with AI-discovered configurations. Configs are discovered once, validated, cached, and reused. Browse available endpoints in the directory.
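The `/fetch/{domain}/{path}` template can be filled in programmatically. The sketch below is one way to do that; the base URL, the helper name `fetch_endpoint`, and the choice to percent-encode each part are all assumptions, since this page gives only the path template.

```python
from urllib.parse import quote

# Assumed base URL; confirm it in the API reference.
BASE_URL = "https://api.spider.cloud"


def fetch_endpoint(domain: str, path: str) -> str:
    """Fill the /fetch/{domain}/{path} template, percent-encoding each part."""
    clean_path = path.strip("/")
    encoded_domain = quote(domain, safe="")
    encoded_path = quote(clean_path, safe="/")  # keep "/" between segments
    return f"{BASE_URL}/fetch/{encoded_domain}/{encoded_path}"


endpoint = fetch_endpoint("example.com", "/products/list")
print(endpoint)
```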
Shared Across All Endpoints
Proxy Support
Residential, mobile, and ISP proxies with geo-routing to 100+ countries
Streaming
JSONL streaming responses so you can process results as they arrive
Caching
Built-in HTTP caching with configurable TTL to reduce redundant requests
Webhooks
Async delivery of results to your endpoint when crawls complete
Rate Limiting
Automatic concurrency management to respect target site limits
Auth Header
Simple Bearer token authentication on every request
Multi-Format
Responses in JSON, XML, CSV, or JSONL, depending on your needs
Client Libraries
Official SDKs for Python, JavaScript, and Rust, plus a CLI
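The streaming feature listed above returns JSONL, which is one JSON document per line, so results can be handled before the full response has arrived. Below is a minimal sketch of the consuming side, using an in-memory stream in place of a live response. The helper name `iter_jsonl` is hypothetical, and the `url` and `status` fields in the sample records are invented for illustration.

```python
import io
import json
from typing import Any, BinaryIO, Iterator


def iter_jsonl(stream: BinaryIO) -> Iterator[dict[str, Any]]:
    """Yield one parsed object per non-empty line of a JSONL byte stream."""
    for raw_line in stream:
        line = raw_line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


# Stand-in for a streamed response body; the field names are invented.
sample = io.BytesIO(
    b'{"url": "https://example.com/", "status": 200}\n'
    b"\n"
    b'{"url": "https://example.com/about", "status": 200}\n'
)

records = list(iter_jsonl(sample))
for record in records:
    print(record["url"], record["status"])
```

The response object returned by `urllib.request.urlopen` can also be iterated line by line, so it could be passed to `iter_jsonl` in place of the sample stream.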
Start building with Spider
Get your API key and make your first request now.