Recipes
Copy-paste code for common Spider API tasks. Each recipe is complete and runnable. For full parameter details, see the API reference. For real-world applications, see Use Cases.
Crawl a Website
Crawl one or many pages from a URL. Use limit to cap the number of pages and depth to control how many link hops Spider follows from the start URL. Set request: "smart" to let Spider choose the fastest strategy for each page.
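A minimal sketch using the requests library. The base URL, the SPIDER_API_KEY environment variable, and the markdown return format are assumptions to adapt to your setup; check the API reference for the exact field names.

```python
import os
import requests

API_BASE = "https://api.spider.cloud"  # assumed base URL


def build_crawl_payload(url: str, limit: int = 10, depth: int = 2) -> dict:
    """Assemble the JSON body for a smart crawl."""
    return {
        "url": url,
        "limit": limit,      # cap on total pages crawled
        "depth": depth,      # max link hops from the start URL
        "request": "smart",  # let Spider pick the fastest strategy per page
        "return_format": "markdown",
    }


def crawl(url: str, **kwargs) -> list:
    """POST the crawl request and return the decoded JSON results."""
    headers = {"Authorization": f"Bearer {os.environ['SPIDER_API_KEY']}"}
    resp = requests.post(f"{API_BASE}/crawl", headers=headers,
                         json=build_crawl_payload(url, **kwargs))
    resp.raise_for_status()
    return resp.json()
```

Calling crawl("https://example.com", limit=25) would return one record per crawled page.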
Extract Structured Data
Use css_extraction_map to pull named fields from pages using CSS selectors. Map URL path patterns to arrays of selectors, and Spider returns the matched content as structured key-value pairs. For AI-powered extraction with JSON Schema, see AI Studio.
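A sketch of a css_extraction_map request with the requests library. The nested shape shown here (path pattern mapped to named selector groups) and the field names inside it are assumptions; verify them against the API reference.

```python
import os
import requests


def build_extraction_payload(url: str) -> dict:
    """Map a URL path pattern to named CSS selectors for extraction."""
    return {
        "url": url,
        "limit": 10,
        # The exact shape of this map is an assumption -- check the reference.
        "css_extraction_map": {
            "/blog": [
                {"name": "title", "selectors": ["h1.post-title"]},
                {"name": "author", "selectors": [".byline a"]},
            ]
        },
    }


def extract(url: str) -> list:
    headers = {"Authorization": f"Bearer {os.environ['SPIDER_API_KEY']}"}
    resp = requests.post("https://api.spider.cloud/crawl", headers=headers,
                         json=build_extraction_payload(url))
    resp.raise_for_status()
    return resp.json()
```

Each matched page comes back with the named fields populated as key-value pairs.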
Capture a Screenshot
Take a full-page screenshot of any URL. The API returns a base64-encoded PNG that you can save directly to a file.
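A sketch that saves the returned image to disk. The /screenshot endpoint path and the "content" response field holding the base64 PNG are assumptions; confirm both in the API reference.

```python
import base64
import os
import requests


def decode_screenshot(b64_png: str) -> bytes:
    """Turn the API's base64-encoded PNG string back into raw bytes."""
    return base64.b64decode(b64_png)


def capture(url: str, path: str = "page.png") -> None:
    headers = {"Authorization": f"Bearer {os.environ['SPIDER_API_KEY']}"}
    # Endpoint path and response field names are assumptions.
    resp = requests.post("https://api.spider.cloud/screenshot",
                         headers=headers, json={"url": url})
    resp.raise_for_status()
    png = decode_screenshot(resp.json()[0]["content"])
    with open(path, "wb") as f:
        f.write(png)
```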
Search the Web
Query search engines and optionally fetch the content of each result. Set fetch_page_content: true to get the full page content alongside search metadata.
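A sketch of a search request. fetch_page_content comes from this page; the /search endpoint path and "search" as the query field name are assumptions to verify against the reference.

```python
import os
import requests


def build_search_payload(query: str, limit: int = 5) -> dict:
    """Assemble a search request; "search" as the field name is assumed."""
    return {
        "search": query,
        "limit": limit,
        "fetch_page_content": True,  # also fetch each result's full content
    }


def search(query: str, limit: int = 5) -> list:
    headers = {"Authorization": f"Bearer {os.environ['SPIDER_API_KEY']}"}
    resp = requests.post("https://api.spider.cloud/search", headers=headers,
                         json=build_search_payload(query, limit))
    resp.raise_for_status()
    return resp.json()
```

Set fetch_page_content to False to get search metadata only, which is faster and cheaper.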
Stream Results in Real-Time
Process pages as they finish crawling instead of waiting for the entire job. Set the Content-Type header to application/jsonl and read the response as a stream of newline-delimited JSON. See Concurrent Streaming for details.
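A streaming sketch with the requests library: the Content-Type header requests newline-delimited JSON, and each line is decoded as it arrives. The endpoint and response framing follow the description above; anything beyond that is an assumption.

```python
import json
import os
import requests


def parse_jsonl(raw: bytes) -> dict:
    """Decode one newline-delimited JSON record."""
    return json.loads(raw)


def stream_crawl(url: str, limit: int = 50):
    """Yield each page's record as soon as Spider finishes crawling it."""
    headers = {
        "Authorization": f"Bearer {os.environ['SPIDER_API_KEY']}",
        "Content-Type": "application/jsonl",  # request a streamed response
    }
    with requests.post("https://api.spider.cloud/crawl", headers=headers,
                       json={"url": url, "limit": limit}, stream=True) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if line:  # skip keep-alive blank lines
                yield parse_jsonl(line)
```

Because stream_crawl is a generator, you can start processing the first page while the rest of the crawl is still running.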
Discover All Links
Map every link on a site without downloading page content. The /links endpoint returns URLs and their HTTP status codes, useful for sitemaps, SEO audits, and finding broken links.
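A broken-link-finder sketch on top of /links. The "status" field name for the HTTP code in each row is an assumption; check the response schema in the reference.

```python
import os
import requests


def is_broken(row: dict) -> bool:
    """True when a link's recorded HTTP status is an error code."""
    return row.get("status", 0) >= 400


def find_broken_links(url: str, limit: int = 100) -> list:
    """Map a site's links without fetching page content, keep the errors."""
    headers = {"Authorization": f"Bearer {os.environ['SPIDER_API_KEY']}"}
    resp = requests.post("https://api.spider.cloud/links", headers=headers,
                         json={"url": url, "limit": limit})
    resp.raise_for_status()
    return [row for row in resp.json() if is_broken(row)]
```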
Automate Browser Interactions
Use automation_scripts to click buttons, fill forms, and navigate before extracting content. Each script targets a URL path pattern; Spider runs its steps in order whenever a matching path is crawled. Requires request: "chrome".
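A sketch combining a login flow and an infinite-scroll flow in one payload. The step vocabulary used below (Fill, Click, ScrollY, Wait) and the per-step shapes are illustrative assumptions, not the confirmed action names; consult the API reference before relying on them.

```python
import os


def build_automation_payload(url: str) -> dict:
    """Attach per-path browser steps to a Chrome-rendered crawl."""
    return {
        "url": url,
        "limit": 5,
        "request": "chrome",  # automation requires the headless browser
        "return_format": "markdown",
        "automation_scripts": {
            # Runs when a path matching /login is crawled.
            "/login": [
                {"Fill": {"selector": "#email",
                          "value": os.environ.get("SITE_USER", "")}},
                {"Fill": {"selector": "#password",
                          "value": os.environ.get("SITE_PASS", "")}},
                {"Click": "button[type=submit]"},
                {"Wait": 2000},  # give the redirect time to settle
            ],
            # Runs on the feed page: scroll, then press "load more".
            "/feed": [
                {"ScrollY": 5000},
                {"Click": "button.load-more"},
                {"Wait": 1500},
            ],
        },
    }
```

POST this body to /crawl as in the other recipes; Spider extracts content only after the matching steps have run.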
Target Content with CSS Selectors
Use root_selector to extract only the content you want, and exclude_selector to strip out noise like navbars, footers, and ads. Works with any return format.
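A sketch pairing root_selector and exclude_selector. The selector strings are placeholders for your own page structure.

```python
import os
import requests


def build_targeted_payload(url: str) -> dict:
    """Keep only the article body, strip navigation, footer, and ads."""
    return {
        "url": url,
        "limit": 10,
        "return_format": "markdown",
        "root_selector": "main article",         # only extract inside this
        "exclude_selector": "nav, footer, .ad",  # drop noise within the root
    }


def crawl_targeted(url: str) -> list:
    headers = {"Authorization": f"Bearer {os.environ['SPIDER_API_KEY']}"}
    resp = requests.post("https://api.spider.cloud/crawl", headers=headers,
                         json=build_targeted_payload(url))
    resp.raise_for_status()
    return resp.json()
```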
Transform HTML to Markdown
The /crawl endpoint already converts pages to markdown when you set return_format: "markdown". If you already have raw HTML on hand, use the /transform endpoint to convert it without crawling.
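A sketch of a /transform call for HTML you already have. The "data" wrapper, its field names, and the "content" field in the response are assumptions; check the transform section of the API reference.

```python
import os
import requests


def build_transform_payload(html: str) -> dict:
    """Wrap raw HTML for conversion; the wrapper shape is assumed."""
    return {
        "data": [{"html": html, "url": "https://example.com"}],
        "return_format": "markdown",
    }


def to_markdown(html: str) -> str:
    headers = {"Authorization": f"Bearer {os.environ['SPIDER_API_KEY']}"}
    resp = requests.post("https://api.spider.cloud/transform", headers=headers,
                         json=build_transform_payload(html))
    resp.raise_for_status()
    return resp.json()[0]["content"]  # assumed field for the markdown output
```

Because no crawl happens, this is useful for HTML pulled from archives, emails, or your own database.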
Deliver Results Asynchronously
For large crawls, send results to a webhook or pipe them directly into cloud storage with data connectors. Both deliver results as each page finishes, so no polling is needed.
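A webhook-delivery sketch. The "webhooks" field and its nested "destination" key are assumptions about the payload shape, not confirmed names; S3 and other data connectors are configured with their own fields described in the data connectors documentation.

```python
import os
import requests


def build_webhook_payload(url: str, destination: str) -> dict:
    """Ask Spider to POST each finished page to your endpoint.

    The "webhooks" field name and its shape are assumptions to verify.
    """
    return {
        "url": url,
        "limit": 200,
        "return_format": "markdown",
        "webhooks": {"destination": destination},
    }


def start_crawl_with_webhook(url: str, destination: str) -> None:
    """Kick off the crawl; results arrive at the webhook, not here."""
    headers = {"Authorization": f"Bearer {os.environ['SPIDER_API_KEY']}"}
    resp = requests.post("https://api.spider.cloud/crawl", headers=headers,
                         json=build_webhook_payload(url, destination))
    resp.raise_for_status()
```

Your webhook endpoint then receives one request per page as the crawl progresses.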