gottem: one API for every scraper.
Fetch API POST /fetch/{domain}/{path}

Per-website scrapers that build themselves.

On the first call, AI discovers the selectors, render mode, and schema for the page. Subsequent calls reuse the cached config. Each domain becomes its own JSON endpoint.

Config discovery
$ POST /fetch/news.ycombinator.com/
⟳ no cached config found
⟳ AI analyzing page structure...
selectors: .titleline > a
mode: http (no JS needed)
schema: {title, url, score}
→ config cached · 30 items extracted
$ POST /fetch/news.ycombinator.com/newest
→ cached config · 0.2s · 30 items

Config resolution

Three-layer cache hierarchy.

Config resolution gets faster with use. The first request for a page bootstraps its config through AI discovery; every later request is served from cache.

L1 Memory cache (< 1ms)

Checks for a user-specific config first, then a shared/public config. Sub-millisecond.

L2 Database lookup (~50ms)

Queries the config database for a matching domain + path. Includes community-discovered configs.

L3 AI discovery (~3-5s, first time only)

AI crawls the page, analyzes its structure, and generates a scraper config. Only happens once per domain/path.
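
The lookup order above can be modeled as a simple fall-through resolver. This is an illustrative sketch only: `ConfigResolver`, the dict-backed stores, and the `discover` callback are hypothetical stand-ins, and promoting database hits into memory is an assumption rather than documented behavior.

```python
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]  # (domain, path)


class ConfigResolver:
    """Hypothetical three-layer lookup: memory, then database, then AI discovery."""

    def __init__(self, database: Dict[Key, dict], discover: Callable[[str, str], dict]):
        self.user_memory: Dict[Tuple[str, str, str], dict] = {}  # (user, domain, path)
        self.shared_memory: Dict[Key, dict] = {}
        self.database = database  # stand-in for the config database
        self.discover = discover  # stand-in for the AI discovery step

    def resolve(self, user: str, domain: str, path: str) -> Tuple[dict, str]:
        key = (domain, path)
        # Layer 1: user-specific config first, then the shared/public one.
        config = self.user_memory.get((user, domain, path))
        if config is None:
            config = self.shared_memory.get(key)
        if config is not None:
            return config, "memory"
        # Layer 2: database lookup by domain + path; promote hits to memory (assumed).
        config = self.database.get(key)
        if config is not None:
            self.shared_memory[key] = config
            return config, "database"
        # Layer 3: run discovery once, then store the result in both caches.
        config = self.discover(domain, path)
        self.database[key] = config
        self.shared_memory[key] = config
        return config, "discovery"


resolver = ConfigResolver(
    database={},
    discover=lambda domain, path: {"selectors": [".titleline > a"]},
)
print(resolver.resolve("alice", "news.ycombinator.com", "/")[1])  # discovery
print(resolver.resolve("alice", "news.ycombinator.com", "/")[1])  # memory
```

The second call never reaches the discovery callback, which mirrors the "first request bootstraps" behavior described above.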
Fetch vs Scrape

When to use which.

Use Fetch when

  • You want structured data without writing CSS selectors
  • You're scraping a site repeatedly and want cached configs
  • You need AI to figure out the best extraction approach
  • You want community-validated scraper configs

Use Scrape when

  • You already know the exact CSS selectors you need
  • You want full control over extraction settings
  • You need raw markdown, HTML, or text output
  • You're doing a one-off extraction

Auto-configured

What the AI figures out.

CSS selectors

AI discovers the right selectors for titles, prices, descriptions, images, and other structured fields on each page type.

Request mode

Determines whether a page needs JavaScript rendering (chrome) or stealth mode, or whether plain HTTP is enough.

Scroll & wait

Detects lazy-loaded content that requires scrolling or waiting for specific elements to appear before extraction.

Extraction schema

Generates a JSON schema describing the structured data that can be extracted from the page.

Content filtering

Sets root selectors for main content and exclude selectors to skip ads, navigation, and footers.

Confidence scoring

Each config gets a confidence score. Configs are validated over time and improved as more users access the same endpoint.
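
To make the six items above concrete, here is a rough sketch of what a discovered config might hold. The `ScraperConfig` class and all of its field names are hypothetical; the page does not document the real config format, and the 0.0 to 1.0 confidence scale is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ScraperConfig:
    """Hypothetical shape of a discovered config; field names are illustrative."""

    selectors: Dict[str, str]                       # field name -> CSS selector
    request_mode: str = "http"                      # "http", "chrome", or "stealth" (labels assumed)
    scroll: bool = False                            # scroll to trigger lazy-loaded content
    wait_for: str = ""                              # selector to wait for before extracting
    schema: Dict[str, str] = field(default_factory=dict)        # field name -> type
    root_selector: str = ""                         # main-content container
    exclude_selectors: List[str] = field(default_factory=list)  # ads, navigation, footers
    confidence: float = 0.0                         # 0.0 to 1.0, scale assumed

    def needs_browser(self) -> bool:
        # In this sketch, anything other than plain HTTP implies a rendered fetch.
        return self.request_mode != "http"


hn = ScraperConfig(
    selectors={"title": ".titleline > a"},
    schema={"title": "string", "url": "string", "score": "number"},
)
print(hn.needs_browser())  # False
```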

Endpoint reference

Parameters at a glance.

POST /fetch/{domain}/{path}

URL parameters

domain   Target website domain (e.g. news.ycombinator.com)
path     Page path to scrape (e.g. /newest or /)

Body parameters (all optional)

AI handles extraction automatically. These only control output format and crawl behavior.

return_format   json (default), markdown, html, or text
limit           Number of pages to crawl (default 1, max 100)
readability     Strips navigation, ads, and sidebars; returns main content only
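
As a sketch of how the URL and body parameters fit together, the helper below assembles a request and checks the documented limits before anything is sent. `build_fetch_request` is a hypothetical helper, not part of any SDK, and treating `readability` as a boolean is an assumption.

```python
from typing import Optional

BASE_URL = "https://api.spider.cloud"  # host used in the example on this page
RETURN_FORMATS = {"json", "markdown", "html", "text"}


def build_fetch_request(domain: str, path: str = "/", return_format: str = "json",
                        limit: int = 1, readability: Optional[bool] = None):
    """Return (url, body) for POST /fetch/{domain}/{path}, validating documented limits."""
    if return_format not in RETURN_FORMATS:
        raise ValueError(f"return_format must be one of {sorted(RETURN_FORMATS)}")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    # Strip the leading slash so "/" and "/newest" both join cleanly.
    url = f"{BASE_URL}/fetch/{domain}/{path.lstrip('/')}"
    body = {"return_format": return_format, "limit": limit}
    if readability is not None:
        body["readability"] = readability  # boolean type is an assumption
    return url, body


url, body = build_fetch_request("news.ycombinator.com", "/newest", limit=5)
print(url)   # https://api.spider.cloud/fetch/news.ycombinator.com/newest
print(body)  # {'return_format': 'json', 'limit': 5}
```

Validating on the client side keeps an out-of-range `limit` from costing a round trip.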
Examples

cURL, Python, Node.

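import os

import requests

response = requests.post(
    "https://api.spider.cloud/fetch/news.ycombinator.com/",
    headers={
        # Read the API key from the environment rather than hard-coding it.
        "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
        "Content-Type": "application/json",
    },
    json={"return_format": "json"},
    timeout=60,  # the first call may wait on AI discovery
)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body

data = response.json()
print(data)

Response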

What you get back.

url             The final URL after any redirects
content         Extracted data in your chosen format
status          HTTP status code of the response
metadata        Page title, description, keywords, og:image
css_extracted   Structured data from AI-discovered selectors (JSON format)
links           All links found on the page
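
A minimal sketch of reading those fields from a decoded response follows. Several things here are assumptions: that the payload is either one page object or a list of them, that `metadata` is an object with a `title` key, and that `links` is a list. The `summarize` helper is hypothetical, and the sample payload is made up for illustration.

```python
from typing import Any, Dict, List, Union


def summarize(payload: Union[Dict[str, Any], List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Pick out the documented fields; accepts one page object or a list of them."""
    pages = payload if isinstance(payload, list) else [payload]
    summaries = []
    for page in pages:
        summaries.append({
            "url": page.get("url"),
            "ok": page.get("status") == 200,
            "title": (page.get("metadata") or {}).get("title"),
            "items": page.get("css_extracted"),
            "link_count": len(page.get("links") or []),
        })
    return summaries


# Made-up payload for illustration only; not real API output.
sample = {
    "url": "https://news.ycombinator.com/",
    "status": 200,
    "metadata": {"title": "Hacker News"},
    "css_extracted": [{"title": "Example story", "url": "https://example.com", "score": 1}],
    "links": ["https://example.com"],
}
print(summarize(sample)[0]["link_count"])  # 1
```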
Directory

Browse pre-configured scrapers.

Every time someone uses the Fetch API on a new domain, the AI-discovered config is validated and made available in the public directory. Browse available scrapers, see what fields they extract, and use them instantly.

Get started

Build scrapers without writing selectors.

Point the Fetch API at any website and get structured data back. AI handles the configuration.