One URL in. Structured content out.
Spider renders JavaScript, handles common anti-bot challenges, and returns markdown, HTML, plain text, or a JSON object you define with CSS selectors.
[
  {
    "url": "https://example.com/pricing",
    "status": 200,
    "content": "# Pricing\n\n## Starter\n$29/mo...",
    "metadata": {
      "title": "Pricing - Example",
      "description": "Plans and pricing"
    }
  }
]
Target exactly what you need.
Define a css_extraction_map and get a structured object back. No DOM parsing in your application code.
{
  "url": "https://store.example.com/product/123",
  "css_extraction_map": {
    "name": "h1.product-title",
    "price": ".price-value",
    "desc": ".product-description",
    "image": "img.hero-image@src"
  }
}
{
  "css_extracted": {
    "name": "Wireless Headphones Pro",
    "price": "$149.99",
    "desc": "Premium noise-cancelling...",
    "image": "https://cdn.example.com/hp.jpg"
  }
}
Selectors, schemas, and JS hooks.
CSS & XPath extraction
Define a css_extraction_map to pull specific elements. Extract prices from product pages, headlines from articles, or any targeted data using selectors you already know.
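As a rough sketch, the request body for selector-based extraction could be assembled in Python like this. The `url` and `css_extraction_map` keys and the `@src` suffix are copied from the example request above; the helper name `build_extraction_request` is hypothetical, and the code only builds the JSON body without sending it anywhere.

```python
import json


def build_extraction_request(url: str, selectors: dict[str, str]) -> str:
    """Serialize a scrape request that maps output field names to CSS selectors.

    Hypothetical helper: only the "url" and "css_extraction_map" keys come
    from the documented example. Nothing is sent over the network.
    """
    if not selectors:
        raise ValueError("at least one selector is required")
    body = {"url": url, "css_extraction_map": dict(selectors)}
    return json.dumps(body, indent=2)


payload = build_extraction_request(
    "https://store.example.com/product/123",
    {
        "name": "h1.product-title",
        "price": ".price-value",
        "image": "img.hero-image@src",  # "@src" form copied from the example
    },
)
print(payload)
```

Keeping the selector map in one dictionary means the output field names (`name`, `price`, `image`) are decided in one place, which is what lets the application skip DOM parsing.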
JavaScript rendering
Modern websites render content with JavaScript. Spider's chrome and smart request modes ensure you see the same content as a real browser.
Batch URLs
Pass multiple URLs in a single request by comma-separating them or sending an array of objects. Reduce round-trips and latency when you have a list of known pages.
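The two batch shapes described here might look like the following. This is a sketch under assumptions: it presumes the comma-separated list goes in the same `url` field used elsewhere on this page, and that each object in the array form carries its own `url` key. Both helper names are hypothetical.

```python
def comma_separated_batch(urls: list[str]) -> dict:
    """Single request object whose "url" field holds every URL, comma-joined."""
    for u in urls:
        if "," in u:
            # A literal comma inside a URL would be misread as a separator.
            raise ValueError(f"URL contains a comma and cannot be joined: {u}")
    return {"url": ",".join(urls)}


def object_array_batch(urls: list[str], **shared_params) -> list[dict]:
    """One request object per URL, each carrying the same shared parameters."""
    return [{"url": u, **shared_params} for u in urls]


pages = ["https://example.com/pricing", "https://example.com/about"]
print(comma_separated_batch(pages))
print(object_array_batch(pages, return_format="markdown"))
```

The array form is the safer choice when any URL may itself contain a comma, or when individual pages need different parameters.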
Readability mode
Enable readability to strip away navigation, sidebars, footers, and ads. Get only the main article or body content, ideal for content analysis and NLP tasks.
JSON data extraction
Extract structured JSON-LD, server-rendered data, and JavaScript-embedded objects from pages with return_json_data.
Custom JavaScript
Run your own JavaScript on the page before extraction with evaluate_on_new_document. Click buttons, dismiss modals, or transform the DOM.
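A minimal sketch of a request that dismisses a modal before extraction, assuming `evaluate_on_new_document` accepts the script as a plain string. The `.cookie-banner button.accept` selector is invented for illustration, and the script waits for `DOMContentLoaded` in case it runs before the DOM is built.

```python
import json

# JavaScript to run in the page before extraction. The selector is a
# made-up example; replace it with one that matches the target site.
DISMISS_MODAL_JS = """
document.addEventListener("DOMContentLoaded", () => {
  const button = document.querySelector(".cookie-banner button.accept");
  if (button) button.click();
});
""".strip()

# Assumption: evaluate_on_new_document takes the script source as a string.
request_body = {
    "url": "https://example.com/pricing",
    "return_format": "markdown",
    "evaluate_on_new_document": DISMISS_MODAL_JS,
}

print(json.dumps(request_body, indent=2))
```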
cURL, Python, Node.
from spider import Spider
client = Spider()
result = client.scrape(
    "https://example.com/pricing",
    params={
        "return_format": "markdown",
        "metadata": True,
    },
)
print(result[0]["content"])
What you get back.
Standard response
url: The final URL after any redirects
content: Page content in your chosen format
status: HTTP status code of the response
Optional fields
metadata: Title, description, keywords
links: All links found on the page
headers: HTTP response headers
cookies: Cookies set by the page
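One way to consume these fields is to normalize each response item into a typed record so that missing optional fields do not raise. This is a sketch, not the client library's own model: the `ScrapedPage` class and `parse_page` helper are hypothetical, and the container types chosen for `links`, `headers`, and `cookies` are assumptions. The sample item is the response shown at the top of this page.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ScrapedPage:
    # Standard fields
    url: str
    content: str
    status: int
    # Optional fields; left empty when the response omits them
    metadata: dict[str, Any] = field(default_factory=dict)
    links: list[str] = field(default_factory=list)        # assumed shape
    headers: dict[str, str] = field(default_factory=dict)  # assumed shape
    cookies: dict[str, str] = field(default_factory=dict)  # assumed shape


def parse_page(item: dict[str, Any]) -> ScrapedPage:
    """Build a ScrapedPage, tolerating absent or null optional fields."""
    return ScrapedPage(
        url=item["url"],
        content=item["content"],
        status=item["status"],
        metadata=item.get("metadata") or {},
        links=item.get("links") or [],
        headers=item.get("headers") or {},
        cookies=item.get("cookies") or {},
    )


sample = [
    {
        "url": "https://example.com/pricing",
        "status": 200,
        "content": "# Pricing\n\n## Starter\n$29/mo...",
        "metadata": {"title": "Pricing - Example", "description": "Plans and pricing"},
    }
]
pages = [parse_page(item) for item in sample]
print(pages[0].status, pages[0].metadata.get("title"))
```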
Where teams reach for it.
Price monitoring
Track product prices across e-commerce sites in real time. Use CSS selectors to target price elements and build automated pricing dashboards.
Content feeds
Pull the latest articles, headlines, or blog posts from known URLs into your application. Readability mode delivers clean text ready for display.
Data enrichment
Enrich your CRM or database records by scraping company websites for metadata, descriptions, and contact details from known profile URLs.
API replacement
When a website lacks an API, use Scrape with CSS selectors to create your own structured data feed from any public web page.
Extract data from any page.
Stop wrestling with browser automation. Get clean, structured content with a single API call.