Fetch API
Per-website scraper endpoints that auto-configure themselves. Point the API at any domain and it discovers the optimal CSS selectors, extraction schema, and request settings using AI. Configs are cached and reused so subsequent requests are fast and consistent.
How It Works
Request any domain
POST to /fetch/example.com/path with your API key. No scraper configuration needed.
AI discovers config
On first request, AI analyzes the page and discovers optimal CSS selectors, extraction schemas, and request settings like stealth mode or scrolling.
Cached & reused
Config is validated, cached in memory and database, and reused for all subsequent requests. Fast, reliable, and consistent.
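For example, the difference between the first and later requests shows up if you time the same call twice. This minimal Python sketch reuses the endpoint shape from the Code Examples below; the cost attributed to each call in the comments is illustrative of the flow above, not a guaranteed timing:

```python
import os
import time

import requests

url = "https://api.spider.cloud/fetch/news.ycombinator.com/"
headers = {
    "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
    "Content-Type": "application/json",
}

# First request: no config exists yet, so AI discovery runs (seconds).
start = time.time()
requests.post(url, headers=headers, json={"return_format": "json"})
print(f"first request (AI discovery): {time.time() - start:.1f}s")

# Second request: the validated config is cached and reused (fast).
start = time.time()
requests.post(url, headers=headers, json={"return_format": "json"})
print(f"second request (cached config): {time.time() - start:.1f}s")
```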
Fetch vs. Scrape: When to Use Which
Use Fetch When
- You want structured data without writing CSS selectors yourself
- You're scraping a website repeatedly and want cached, consistent configs
- You need AI to figure out the best extraction approach for a site
- You want to leverage community-validated scraper configs
Use Scrape When
- You already know the exact CSS selectors you need
- You want full control over extraction settings per request
- You need raw markdown, HTML, or text output rather than structured JSON
- You're doing a one-off extraction where caching doesn't matter
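As a concrete comparison, the sketch below shows the two call shapes side by side. The Fetch call matches the examples later on this page; the Scrape endpoint path and body parameters are assumptions for illustration only, so check the Scrape API reference for the exact shape.

```python
import os

import requests

headers = {
    "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
    "Content-Type": "application/json",
}

# Fetch: the domain and path live in the URL; AI picks the selectors
# and returns structured JSON from the cached config.
fetch_resp = requests.post(
    "https://api.spider.cloud/fetch/news.ycombinator.com/",
    headers=headers,
    json={"return_format": "json"},
)

# Scrape: the target URL goes in the body and you control extraction
# per request. NOTE: this endpoint path and body are illustrative
# assumptions, not confirmed by this page.
scrape_resp = requests.post(
    "https://api.spider.cloud/scrape",
    headers=headers,
    json={
        "url": "https://news.ycombinator.com/",
        "return_format": "markdown",
    },
)
```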
Endpoint Reference
POST /fetch/{domain}/{path}
Fetch structured data from a specific domain and path. The domain is extracted from the URL path, not the request body.
URL Parameters
- domain: The target website domain (e.g., news.ycombinator.com)
- path: The page path to scrape (e.g., /newest or /)

Headers
- Authorization: Bearer YOUR_API_KEY (required)
- Content-Type: application/json

Body Parameters (all optional)
AI handles extraction automatically. These parameters only control output format and crawl behavior.
- return_format: Output format: json (default), markdown, html, or text.
- limit: Number of pages to crawl (default 1). Values greater than 1 enable multi-page crawling, capped at 100.
- readability: Strip navigation, ads, and sidebars; returns main content only.

Code Examples
cURL

```bash
curl -X POST https://api.spider.cloud/fetch/news.ycombinator.com/ \
  -H "Authorization: Bearer $SPIDER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"return_format": "json"}'
```

Python

```python
import os

import requests

response = requests.post(
    "https://api.spider.cloud/fetch/news.ycombinator.com/",
    headers={
        "Authorization": "Bearer " + os.environ["SPIDER_API_KEY"],
        "Content-Type": "application/json",
    },
    json={"return_format": "json"},
)
data = response.json()
print(data)
```

JavaScript

```javascript
const SPIDER_API_KEY = process.env.SPIDER_API_KEY;

const response = await fetch(
  "https://api.spider.cloud/fetch/news.ycombinator.com/",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer " + SPIDER_API_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ return_format: "json" }),
  }
);
const data = await response.json();
console.log(data);
```

Multi-page crawl (cURL)

```bash
curl -X POST https://api.spider.cloud/fetch/example.com/products \
  -H "Authorization: Bearer $SPIDER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"limit": 10}'
```

Config Resolution
When you make a request to /fetch/{domain}/{path}, Spider resolves the scraper config through a three-layer lookup:
In-memory cache
Checks your user-specific config first, then falls back to the shared/public config. Sub-millisecond response.
Database lookup
Queries the config database for a matching domain + path. Configs from previous requests or other users are found here.
AI discovery
If no config exists, AI crawls the page, analyzes its structure, and generates an optimal scraper config. This is slower (a few seconds) but only happens once per domain/path.
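In pseudocode, the resolution order looks roughly like this. Every helper name here (memory_cache, config_db, ai_discover_config) is hypothetical and exists only to illustrate the control flow described above:

```python
def resolve_config(user_id, domain, path):
    # Layer 1: in-memory cache. The user-specific config wins over the
    # shared/public one; either way this is a sub-millisecond hit.
    config = (memory_cache.get((user_id, domain, path))
              or memory_cache.get(("public", domain, path)))
    if config:
        return config

    # Layer 2: database. Picks up configs created by earlier requests
    # (yours or other users') that aren't in this process's cache yet.
    config = config_db.find(domain=domain, path=path)
    if config:
        memory_cache.set(("public", domain, path), config)
        return config

    # Layer 3: AI discovery. Crawl the page, analyze its structure,
    # generate a config, validate it, then persist it for reuse.
    config = ai_discover_config(domain, path)  # slow: a few seconds
    config_db.save(config)
    memory_cache.set(("public", domain, path), config)
    return config
```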
What Gets Auto-Configured
CSS Selectors
AI discovers the right selectors for titles, prices, descriptions, images, and other structured fields on each page type.
Request Mode
Determines whether a page needs JavaScript rendering (chrome), stealth mode, or works with plain HTTP.
Scroll & Wait
Detects lazy-loaded content that requires scrolling or waiting for specific elements to appear before extraction.
Extraction Schema
Generates a JSON schema describing the structured data that can be extracted from the page.
Content Filtering
Sets root selectors to focus on main content and exclude selectors to skip ads, navigation, and footers.
Confidence Scoring
Each config gets a confidence score. Configs are validated over time and improved as more users access the same endpoint.
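Taken together, a discovered config might look something like the JSON below. This is an illustrative shape assembled from the fields described above, not the API's actual storage format, and the selectors are examples only:

```json
{
  "domain": "news.ycombinator.com",
  "path": "/",
  "request": "http",
  "stealth": false,
  "scroll": false,
  "root_selector": ".itemlist",
  "exclude_selectors": [".pagetop", ".yclinks"],
  "selectors": {
    "title": ".titleline > a",
    "points": ".score",
    "comments": ".subline a:last-child"
  },
  "schema": {
    "type": "array",
    "items": {
      "type": "object",
      "properties": {
        "title": { "type": "string" },
        "points": { "type": "string" },
        "comments": { "type": "string" }
      }
    }
  },
  "confidence": 0.92
}
```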
Response
Standard Response
- url: The final URL after any redirects
- content: Extracted data in your chosen format
- status: HTTP status code of the response
- metadata: Page title, description, keywords, og:image
With JSON Format
- css_extracted: Structured data from AI-discovered selectors
- links: All links found on the page
- headers: HTTP response headers (if requested)
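For instance, a trimmed response for return_format: json could look like the following; the field contents are illustrative and vary by site:

```json
{
  "url": "https://news.ycombinator.com/",
  "status": 200,
  "metadata": {
    "title": "Hacker News",
    "description": "...",
    "keywords": null,
    "og:image": null
  },
  "css_extracted": [
    {
      "title": "Example story",
      "points": "128 points",
      "comments": "45 comments"
    }
  ],
  "links": ["https://news.ycombinator.com/newest", "..."],
  "content": "..."
}
```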
Fetch Directory
Browse pre-configured scraper endpoints discovered by the community.
Every time someone uses the Fetch API on a new domain, the AI-discovered config is validated and made available in the public directory. You can browse available scrapers, see what fields they extract, and use them instantly.
Related Resources
Build scrapers without writing selectors
Point the Fetch API at any website and get structured data back. AI handles the configuration so you can focus on using the data.