Scraping and crawling
/scrape fetches a single URL and returns its content. /crawl starts from a URL, follows internal links, and returns content for each page. Both share the same output formats, proxy settings, and request modes.
Scrape one URL
Send a URL to /scrape and pick a return_format (markdown for LLMs, raw for the original HTML). Call the API directly or use any of our SDK libraries.
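A minimal sketch of a direct API call, using only the Python standard library. The base URL `https://api.spider.cloud`, the bearer-token `Authorization` header, and the helper names are assumptions for illustration; only the `/scrape` path and the `url` and `return_format` parameters come from this page.

```python
import json
import urllib.request

# Assumption: the base URL and bearer-token auth are not stated in this section.
API_BASE = "https://api.spider.cloud"


def build_scrape_request(url: str, api_key: str,
                         return_format: str = "markdown") -> urllib.request.Request:
    """Build (but do not send) a POST request for /scrape."""
    payload = {"url": url, "return_format": return_format}
    return urllib.request.Request(
        f"{API_BASE}/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scrape(url: str, api_key: str, return_format: str = "markdown"):
    """Send the request and return the decoded JSON body."""
    request = build_scrape_request(url, api_key, return_format)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

Calling `scrape("https://example.com", api_key)` would then return the parsed response. Building the request separately from sending it makes the payload easy to inspect before any network traffic happens.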
Crawl from a URL
/crawl starts from a URL and follows internal links. limit caps how many pages come back; depth controls how many link levels deep the crawler goes. Set a low limit when testing, since a large site can have thousands of pages.
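As an illustration, a request body for /crawl might be built as below. The helper name `crawl_payload` and the default values (5 pages, 2 levels) are arbitrary choices for testing, not values recommended by the API.

```python
import json


def crawl_payload(url: str, limit: int = 5, depth: int = 2) -> dict:
    """Return a JSON-serialisable body for /crawl with a small page cap."""
    return {"url": url, "limit": limit, "depth": depth}


body = json.dumps(crawl_payload("https://example.com"))
print(body)
# prints: {"url": "https://example.com", "limit": 5, "depth": 2}
```

Keeping the cap in one helper makes it harder to send an uncapped crawl by accident during development.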
Request types
The request parameter controls how Spider fetches each page.
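A small sketch of setting the mode on a request body. It checks the value against the three modes named in this section (smart, http, chrome); the API may accept other values, and the helper name `with_request_mode` is hypothetical.

```python
from typing import Literal, get_args

# The three modes named in this section; treat the set as possibly incomplete.
RequestMode = Literal["smart", "http", "chrome"]


def with_request_mode(payload: dict, mode: str) -> dict:
    """Return a copy of the payload with the `request` parameter set."""
    if mode not in get_args(RequestMode):
        raise ValueError(f"unknown request mode: {mode!r}")
    return {**payload, "request": mode}
```

For a JavaScript-heavy page, `with_request_mode({"url": "https://example.com"}, "chrome")` would produce a body that asks for browser rendering.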
Values: smart (default), http (fast), chrome (JS / SPA).
Streaming responses
Use streaming to process pages the moment they finish crawling instead of waiting for the entire result set. Set Content-Type: application/jsonl and read the response line by line.
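Reading a streamed response line by line could look like the sketch below. It assumes each non-empty line is one JSON object describing a page, which this section implies but does not state; the helper name `iter_pages` is hypothetical.

```python
import json
from typing import Iterable, Iterator

# Header named in this section; an auth header would be added alongside it.
STREAM_HEADERS = {"Content-Type": "application/jsonl"}


def iter_pages(lines: Iterable[bytes]) -> Iterator[dict]:
    """Yield one decoded page per non-empty JSONL line."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        yield json.loads(line)
```

An open `urllib` response object can be passed straight in, since iterating over it yields one line of bytes at a time, so each page can be handled as soon as it arrives.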
Response fields
Each page in the response array carries these fields.
| Field | Type | Description |
|---|---|---|
| url | string | The URL that was crawled or scraped. |
| status | number | HTTP status code from the target page (200, 404, 500, …). |
| content | string | Page content in the requested format: HTML, markdown, text, or base64 for screenshots. |
| error | string \| null | Error message if the page failed to load. null on success. |
| costs | object | Cost breakdown for this request in USD. |
| costs.total_cost | number | Total cost of this request. |
| costs.total_cost_formatted | string | Human-readable formatted total. |
| costs.ai_cost | number | AI processing cost (extraction, labeling, etc.). |
| costs.bytes_transferred_cost | number | Cost based on data transferred. |
| costs.compute_cost | number | Cost of compute resources (browser rendering, etc.). |
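The fields above can be mirrored as typed dictionaries, which helps editors and type checkers catch misspelled keys. This is a sketch: the type names and the `total_cost` helper are hypothetical, and marking every key as optional (`total=False`) is a defensive assumption rather than documented behaviour.

```python
from typing import List, Optional, TypedDict


class Costs(TypedDict, total=False):
    total_cost: float
    total_cost_formatted: str
    ai_cost: float
    bytes_transferred_cost: float
    compute_cost: float


class Page(TypedDict, total=False):
    url: str
    status: int
    content: str
    error: Optional[str]
    costs: Costs


def total_cost(pages: List[Page]) -> float:
    """Sum costs.total_cost (USD) across pages, treating missing costs as 0."""
    return sum((page.get("costs") or {}).get("total_cost", 0.0) for page in pages)
```

Summing `costs.total_cost` over the array gives the spend for a whole crawl.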
Handling errors
Some pages in a crawl will fail: 404s, timeouts, bot blocks. The API still returns an HTTP 200, but individual pages in the array may carry non-200 status values or a non-null error field. Check each page's status and error before reading content. See Error Codes for the full reference.
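One way to apply the per-page check is to split the array before touching any content. The sketch below treats a page as successful only when its status is 200 and its error is empty; the helper name `split_pages` is hypothetical, and stricter or looser rules may suit a given pipeline.

```python
from typing import List, Tuple


def split_pages(pages: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Separate pages that loaded cleanly from pages that failed."""
    ok, failed = [], []
    for page in pages:
        if page.get("status") == 200 and not page.get("error"):
            ok.append(page)
        else:
            failed.append(page)
    return ok, failed
```

Only the first list should be passed on to code that reads `content`; the second can be logged or retried.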