POST /scrape

Web Scraping API

Extract content from any web page in a single request. Spider handles JavaScript rendering, anti-bot challenges, and content extraction so you get clean, structured data without managing browser infrastructure.

Scrape vs. Crawl: When to Use Which

Use Scrape When

  • You know the exact URL(s) you want
  • You need the fastest possible response time
  • You're building a real-time data pipeline for specific pages
  • You want to target precise elements with CSS selectors

Use Crawl When

  • You need to discover and collect multiple pages from a site
  • You want to follow links and explore a domain recursively
  • You're building a dataset from an entire website
  • You don't know all the URLs ahead of time

Key Capabilities

CSS & XPath Extraction

Define a css_extraction_map to pull specific elements. Extract prices from product pages, headlines from articles, or any targeted data using selectors you already know.

JavaScript Rendering

Modern websites render content with JavaScript. Spider's chrome and smart request modes ensure you see the same content as a real browser.
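As a sketch, a rendered-page request might look like the following. The call shape follows the Python example further down; the assumption that the mode is passed as a `request` parameter accepting values like `"chrome"` and `"smart"` is ours, based on the mode names above.

```python
import os

# "chrome" forces headless-browser rendering; "smart" is assumed to fall
# back to a plain HTTP fetch when rendering is unnecessary.
params = {
    "request": "chrome",
    "return_format": "markdown",
}

if os.environ.get("SPIDER_API_KEY"):  # skip the live call when no key is set
    from spider import Spider
    client = Spider()
    result = client.scrape("https://example.com/app", params=params)
    print(result[0]["content"])
```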

Batch URLs

Pass multiple URLs in a single request by comma-separating them or sending an array of objects. Reduce round-trips and latency when you have a list of known pages.

Readability Mode

Enable readability to strip away navigation, sidebars, footers, and ads, returning only the main article or body content. This is ideal for content analysis and NLP tasks.
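A minimal sketch, assuming `readability` is a boolean flag passed alongside the other parameters (the client call shape matches the Python example below):

```python
import os

# readability strips page chrome (nav, sidebars, footers, ads) and keeps
# only the main article/body content.
params = {
    "readability": True,
    "return_format": "markdown",
}

if os.environ.get("SPIDER_API_KEY"):  # skip the live call when no key is set
    from spider import Spider
    client = Spider()
    result = client.scrape("https://example.com/blog/post", params=params)
    print(result[0]["content"])
```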

JSON Data Extraction

Extract structured JSON-LD, server-rendered data, and JavaScript-embedded objects from pages with return_json_data.
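A sketch of enabling this, assuming `return_json_data` is a boolean flag; where the extracted JSON-LD and embedded objects land in the response is not specified here, so treat the access pattern as illustrative:

```python
import os

params = {
    "return_json_data": True,  # also return JSON-LD / embedded JS objects
    "return_format": "markdown",
}

if os.environ.get("SPIDER_API_KEY"):  # skip the live call when no key is set
    from spider import Spider
    client = Spider()
    result = client.scrape("https://example.com/products/42", params=params)
    print(result[0])  # inspect the item to see where the JSON data is attached
```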

Custom JavaScript

Run your own JavaScript on the page before extraction with evaluate_on_new_document. Click buttons, dismiss modals, or transform the DOM.
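A sketch, assuming `evaluate_on_new_document` takes a JavaScript source string that runs before the page's own scripts. The snippet pre-seeds a consent flag so the page skips its cookie modal; the storage key is purely illustrative, not part of Spider's API.

```python
import os

# Hypothetical page-side snippet: pretend the user already accepted cookies
# so no consent modal is rendered over the content we want.
snippet = "window.localStorage.setItem('cookie_consent', 'accepted');"

params = {
    "evaluate_on_new_document": snippet,
    "return_format": "markdown",
}

if os.environ.get("SPIDER_API_KEY"):  # skip the live call when no key is set
    from spider import Spider
    client = Spider()
    result = client.scrape("https://example.com/app", params=params)
    print(result[0]["content"])
```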

Code Examples

Scrape a page for markdown content (Python)
from spider import Spider

client = Spider()

result = client.scrape(
    "https://example.com/pricing",
    params={
        "return_format": "markdown",
        "metadata": True,
    }
)

print(result[0]["content"])
Extract specific elements with CSS selectors (cURL)
curl -X POST https://api.spider.cloud/scrape \
  -H "Authorization: Bearer $SPIDER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/products",
    "css_extraction_map": {
      "title": "h1.product-title",
      "price": ".price-value",
      "description": ".product-description"
    }
  }'
Batch scrape multiple URLs (JavaScript)
import Spider from "@spider-cloud/spider-client";

const client = new Spider();

// Comma-separated URLs in a single request
const pages = await client.scrape(
  "https://example.com/page-1,https://example.com/page-2",
  {
    return_format: "text",
    return_headers: true,
  }
);

What You Get Back

Standard Response

  • url The final URL after any redirects
  • content Page content in your chosen format
  • status HTTP status code of the response

With Optional Fields

  • metadata Title, description, keywords
  • links All links found on the page
  • headers HTTP response headers
  • cookies Cookies set by the page
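An illustrative response item combining the standard and optional fields above. All values are made up, and the exact shape may vary with the options you enable:

```python
# A made-up response item showing the fields listed above.
item = {
    "url": "https://example.com/pricing",        # final URL after redirects
    "status": 200,                               # HTTP status of the response
    "content": "# Pricing\n\nOur plans start at $9/mo.",
    "metadata": {"title": "Pricing", "description": "Plans and pricing", "keywords": []},
    "links": ["https://example.com/signup"],
    "headers": {"content-type": "text/html; charset=utf-8"},
    "cookies": [],
}

# Typical handling: check the status before trusting the content.
if item["status"] == 200:
    body = item["content"]
```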

Related Resources

Extract data from any page

Stop wrestling with browser automation. Get clean, structured content with a single API call.