Fetch API (Alpha)

POST /fetch/{domain}/{path}

Per-website scraper endpoints that auto-configure themselves. Point the API at any domain and it discovers the optimal CSS selectors, extraction schema, and request settings using AI. Configs are cached and reused so subsequent requests are fast and consistent.

How It Works

1. Request any domain

POST to /fetch/example.com/path with your API key. No scraper configuration needed.

2. AI discovers config

On the first request, AI analyzes the page and discovers optimal CSS selectors, extraction schemas, and request settings like stealth mode or scrolling.

3. Cached & reused

The config is validated, cached in memory and in the database, and reused for all subsequent requests. Fast, reliable, and consistent.
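
A quick way to see the caching in action is to time the same request twice. The sketch below reuses the Hacker News endpoint from the examples further down; timings will vary, but the first call pays the one-time discovery cost.

Time a first vs. cached request (Python)
import os
import time

import requests

URL = "https://api.spider.cloud/fetch/news.ycombinator.com/"
HEADERS = {
    "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
    "Content-Type": "application/json",
}

for label in ("first request (AI discovery)", "second request (cached config)"):
    start = time.monotonic()
    response = requests.post(URL, headers=HEADERS, json={"return_format": "json"})
    elapsed = time.monotonic() - start
    # The first call can take a few seconds while the config is discovered;
    # later calls reuse the cached config and should return much faster.
    print(f"{label}: HTTP {response.status_code} in {elapsed:.2f}s")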

Fetch vs. Scrape: When to Use Which

Use Fetch When

  • You want structured data without writing CSS selectors yourself
  • You're scraping a website repeatedly and want cached, consistent configs
  • You need AI to figure out the best extraction approach for a site
  • You want to leverage community-validated scraper configs

Use Scrape When

  • You already know the exact CSS selectors you need
  • You want full control over extraction settings per request
  • You need raw markdown, HTML, or text output rather than structured JSON
  • You're doing a one-off extraction where caching doesn't matter
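
To make the trade-off concrete, here is a minimal sketch that pulls the same site both ways. The /scrape request shape (URL in the body, markdown out) is an assumption about the companion endpoint; see the Scrape docs for its exact parameters.

Fetch vs. Scrape side by side (Python)
import os

import requests

HEADERS = {
    "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
    "Content-Type": "application/json",
}

# Fetch: the target is encoded in the URL path, and the cached,
# AI-discovered config returns structured JSON with no selectors on your side.
fetched = requests.post(
    "https://api.spider.cloud/fetch/news.ycombinator.com/",
    headers=HEADERS,
    json={"return_format": "json"},
)

# Scrape (assumed request shape): you name the URL and output format
# yourself and post-process the raw markdown as needed.
scraped = requests.post(
    "https://api.spider.cloud/scrape",
    headers=HEADERS,
    json={"url": "https://news.ycombinator.com/", "return_format": "markdown"},
)

print(fetched.json())
print(scraped.json())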

Endpoint Reference

POST /fetch/{domain}/{path}

Fetch structured data from a specific domain and path. The domain is extracted from the URL path, not the request body.

URL Parameters

domain: The target website domain (e.g., news.ycombinator.com)
path: The page path to scrape (e.g., /newest or /)

Headers

Authorization: Bearer YOUR_API_KEY (required)
Content-Type: application/json

Body Parameters (all optional)

AI handles extraction automatically. These parameters only control output format and crawl behavior.

return_format: Output format, one of json (default), markdown, html, or text.
limit: Number of pages to crawl (default 1). Values > 1 enable multi-page crawling, capped at 100.
readability: Strips navigation, ads, and sidebars; returns the main content only.
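
The three parameters compose in a single request body. For example, the sketch below asks for markdown from up to five pages with readability filtering on; the boolean value for readability is an assumption, so check your response if the field expects another type.

Combine the optional body parameters (Python)
import os

import requests

response = requests.post(
    "https://api.spider.cloud/fetch/example.com/blog",
    headers={
        "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
        "Content-Type": "application/json",
    },
    # All three are optional; defaults are json output, a single page,
    # and no readability filtering. Boolean readability is an assumption.
    json={"return_format": "markdown", "limit": 5, "readability": True},
)
print(response.json())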

Code Examples

Fetch structured data from Hacker News (cURL)
curl -X POST https://api.spider.cloud/fetch/news.ycombinator.com/ \
  -H "Authorization: Bearer $SPIDER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"return_format": "json"}'
Fetch with Python requests (Python)
import os, requests

response = requests.post(
    "https://api.spider.cloud/fetch/news.ycombinator.com/",
    headers={
        "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
        "Content-Type": "application/json",
    },
    json={"return_format": "json"}
)

data = response.json()
print(data)
Fetch with JavaScript (JavaScript)
const SPIDER_API_KEY = process.env.SPIDER_API_KEY;

const response = await fetch(
  "https://api.spider.cloud/fetch/news.ycombinator.com/",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer " + SPIDER_API_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ return_format: "json" }),
  }
);

const data = await response.json();
console.log(data);
Crawl multiple pages from a site (cURL)
curl -X POST https://api.spider.cloud/fetch/example.com/products \
  -H "Authorization: Bearer $SPIDER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"limit": 10}'

Config Resolution

When you make a request to /fetch/{domain}/{path}, Spider resolves the scraper config through a three-layer lookup:

1. In-memory cache

Checks your user-specific config first, then falls back to the shared/public config. Sub-millisecond response.

2. Database lookup

Queries the config database for a matching domain + path. Configs from previous requests or other users are found here.

3. AI discovery

If no config exists, AI crawls the page, analyzes its structure, and generates an optimal scraper config. This is slower (a few seconds) but only happens once per domain/path.
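
As a mental model of that order, the lookup can be sketched as below. Every name in the sketch is hypothetical; it mirrors the documented three-layer behavior, not Spider's actual internals.

Sketch of the three-layer lookup (Python)
# Hypothetical stand-ins for the real cache, database, and AI step.
memory_cache: dict = {}
database: dict = {}

def ai_discover_config(domain: str, path: str) -> dict:
    """Stand-in for the slow AI analysis step (a few seconds in practice)."""
    return {"selectors": {"title": "a.titleline"}, "mode": "http"}

def resolve_config(user_id: str, domain: str, path: str) -> dict:
    key = (domain, path)
    # 1. In-memory cache: user-specific config first, then the shared one.
    config = memory_cache.get((user_id,) + key) or memory_cache.get(key)
    if config:
        return config  # sub-millisecond path
    # 2. Database lookup for a matching domain + path.
    config = database.get(key)
    if config is None:
        # 3. AI discovery: taken once per domain/path, then persisted.
        config = ai_discover_config(domain, path)
        database[key] = config
    memory_cache[key] = config
    return config

print(resolve_config("user-1", "news.ycombinator.com", "/"))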

What Gets Auto-Configured

CSS Selectors

AI discovers the right selectors for titles, prices, descriptions, images, and other structured fields on each page type.

Request Mode

Determines whether a page needs JavaScript rendering (chrome), stealth mode, or works with plain HTTP.

Scroll & Wait

Detects lazy-loaded content that requires scrolling or waiting for specific elements to appear before extraction.

Extraction Schema

Generates a JSON schema describing the structured data that can be extracted from the page.

Content Filtering

Sets root selectors to focus on main content and exclude selectors to skip ads, navigation, and footers.

Confidence Scoring

Each config gets a confidence score. Configs are validated over time and improved as more users access the same endpoint.
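
Putting those pieces together, a discovered config might look roughly like the dictionary below. Every field name is illustrative and only mirrors the categories above, not Spider's actual storage format.

Illustrative shape of a discovered config (Python)
# Illustrative only: all field names are hypothetical.
example_config = {
    "domain": "news.ycombinator.com",
    "path": "/",
    "selectors": {                        # CSS selectors per structured field
        "title": "span.titleline > a",
        "score": "span.score",
    },
    "request_mode": "http",               # or "chrome" / "stealth" when required
    "scroll": False,                      # lazy-loaded pages would set this
    "wait_for": None,                     # e.g. a selector to await before extraction
    "schema": {                           # JSON schema of the extractable data
        "type": "object",
        "properties": {"title": {"type": "string"}, "score": {"type": "string"}},
    },
    "root_selector": "#hnmain",           # focus extraction on the main content
    "exclude_selectors": ["td.subtext"],  # skip ads, navigation, footers
    "confidence": 0.92,                   # validated and improved over time
}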

Response

Standard Response

  • url: The final URL after any redirects
  • content: Extracted data in your chosen format
  • status: HTTP status code of the response
  • metadata: Page title, description, keywords, and og:image

With JSON Format

  • css_extracted: Structured data from AI-discovered selectors
  • links: All links found on the page
  • headers: HTTP response headers (if requested)
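
Under those field names, a json-format response can be unpacked as below, assuming a single-page request returns one object with the documented keys.

Read the response fields (Python)
import os

import requests

response = requests.post(
    "https://api.spider.cloud/fetch/news.ycombinator.com/",
    headers={
        "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
        "Content-Type": "application/json",
    },
    json={"return_format": "json"},
)
page = response.json()

# Standard fields documented above.
print(page["url"])                            # final URL after redirects
print(page["status"])                         # HTTP status code
print(page.get("metadata", {}).get("title"))  # page title from metadata

# JSON-format extras; css_extracted follows the discovered schema.
print(page.get("css_extracted"))
print(len(page.get("links", [])))             # links found on the page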

Fetch Directory

Browse pre-configured scraper endpoints discovered by the community.

Every time someone uses the Fetch API on a new domain, the AI-discovered config is validated and made available in the public directory. You can browse available scrapers, see what fields they extract, and use them instantly.

Related Resources

Build scrapers without writing selectors

Point the Fetch API at any website and get structured data back. AI handles the configuration so you can focus on using the data.