AI API Reference

Spider's AI endpoints add natural language understanding to standard web data extraction. Every endpoint takes a required prompt parameter that guides AI-powered extraction and automation.

Base URL: https://api.spider.cloud
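
Because all three endpoints share the base URL, the Bearer authorization header, and the required url and prompt fields, request construction can be factored into one place. The sketch below is a minimal illustration using only the standard library; `build_ai_request` and `BASE_URL` are hypothetical names introduced here, not part of any Spider SDK, and the request is prepared without being sent.

```python
import json
import urllib.request

BASE_URL = "https://api.spider.cloud"  # base URL given in this reference


def build_ai_request(endpoint: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for /ai/<endpoint>.

    Hypothetical helper for illustration only.
    """
    # Every AI endpoint in this reference lists url and prompt as required.
    if "url" not in payload or "prompt" not in payload:
        raise ValueError("payload needs both 'url' and 'prompt'")
    return urllib.request.Request(
        url=f"{BASE_URL}/ai/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_ai_request(
    "crawl",
    "YOUR_API_KEY",
    {"url": "https://example.com", "prompt": "Find all blog posts"},
)
print(req.get_method(), req.full_url)
```

Passing the prepared object to `urllib.request.urlopen(req)` would send it; keeping construction separate from sending makes the payload easy to inspect first.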
POST /ai/crawl

AI Crawl

Crawl websites intelligently using natural language prompts. The AI analyzes your prompt to determine crawl depth, page filtering, and content extraction strategies.

Parameters

Name               Type     Required  Description
url                string   required  Starting URL to crawl
prompt             string   required  Natural language instruction for what to crawl and extract
limit              number   optional  Maximum pages to crawl
return_format      string   optional  Output format: markdown, html, text, or raw
extraction_schema  object   optional  JSON Schema for structured extraction with name, description, and schema fields
metadata           boolean  optional  Include metadata with extracted_data in response
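
The parameter list above can be checked client-side before a request is sent. The following is a rough sketch, assuming the four return_format values listed for this endpoint are exhaustive; `build_crawl_payload` is a hypothetical helper, and the positive-limit check is an assumption of this example rather than a documented rule.

```python
from typing import Any, Dict, Optional

# Formats listed for /ai/crawl in the parameter list above.
CRAWL_RETURN_FORMATS = {"markdown", "html", "text", "raw"}


def build_crawl_payload(
    url: str,
    prompt: str,
    limit: Optional[int] = None,
    return_format: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a JSON body for POST /ai/crawl (hypothetical helper)."""
    if not url or not prompt:
        raise ValueError("url and prompt are both required")
    if return_format is not None and return_format not in CRAWL_RETURN_FORMATS:
        raise ValueError(f"unsupported return_format: {return_format!r}")
    # Assumption: a page limit below 1 is never meaningful.
    if limit is not None and limit < 1:
        raise ValueError("limit must be a positive number of pages")

    payload: Dict[str, Any] = {"url": url, "prompt": prompt}
    # Only include optional fields the caller actually set.
    if limit is not None:
        payload["limit"] = limit
    if return_format is not None:
        payload["return_format"] = return_format
    return payload


payload = build_crawl_payload(
    "https://example.com",
    "Find all blog posts and extract titles and summaries",
    limit=50,
    return_format="markdown",
)
```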

Example Request

cURL

curl -X POST https://api.spider.cloud/ai/crawl \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "url": "https://example.com",
  "prompt": "Find all blog posts and extract titles and summaries",
  "limit": 50,
  "return_format": "markdown"
}'

Python

import requests

response = requests.post(
    "https://api.spider.cloud/ai/crawl",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "url": "https://example.com",
        "prompt": "Find all blog posts and extract titles and summaries",
        "limit": 50,
        "return_format": "markdown"
    }
)

print(response.json())

Example Response

{
  "status": 200,
  "data": [
    {
      "url": "https://example.com/blog/post-1",
      "title": "Getting Started with AI",
      "summary": "An introduction to artificial intelligence..."
    }
  ],
  "ai_cost": 0.002
}
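
A response shaped like the example above can be unpacked as follows. This is only a sketch: it assumes the status, data, and ai_cost fields appear as shown, and that the per-item keys (title, summary) depend on the prompt and may differ for other requests. `list_posts` is a hypothetical name.

```python
import json

# Sample body copied from the example response above.
SAMPLE = json.loads("""
{
  "status": 200,
  "data": [
    {
      "url": "https://example.com/blog/post-1",
      "title": "Getting Started with AI",
      "summary": "An introduction to artificial intelligence..."
    }
  ],
  "ai_cost": 0.002
}
""")


def list_posts(body: dict) -> list:
    """Return (url, title) pairs from a crawl response body."""
    if body.get("status") != 200:
        raise RuntimeError(f"crawl failed with status {body.get('status')}")
    # title is read defensively because its presence depends on the prompt.
    return [(item["url"], item.get("title", "")) for item in body.get("data", [])]


for url, title in list_posts(SAMPLE):
    print(f"{title}: {url}")
print("AI cost:", SAMPLE.get("ai_cost"))
```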
POST /ai/scrape

AI Scrape

Extract structured data from any webpage using plain English prompts. AI automatically identifies and extracts the data you describe. Use extraction_schema for typed JSON output.

Parameters

Name               Type     Required  Description
url                string   required  URL to scrape
prompt             string   required  Natural language description of data to extract
return_format      string   optional  Output format: json, markdown, raw, html, text
extraction_schema  object   optional  JSON Schema for structured extraction with name, description, and schema fields
metadata           boolean  optional  Include metadata with extracted_data in response
request            string   optional  Request type: http or chrome for JavaScript rendering

Example Request

cURL

curl -X POST https://api.spider.cloud/ai/scrape \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
  "prompt": "Extract book details",
  "return_format": "raw",
  "extraction_schema": {
    "name": "BookDetails",
    "description": "Product information from a book listing",
    "schema": "{\"type\":\"object\",\"properties\":{\"title\":{\"type\":\"string\"},\"price\":{\"type\":\"string\"},\"availability\":{\"type\":\"string\"},\"description\":{\"type\":\"string\"}},\"required\":[\"title\",\"price\"]}"
  },
  "metadata": true,
  "request": "chrome"
}'

Python

import requests

response = requests.post(
    "https://api.spider.cloud/ai/scrape",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
        "prompt": "Extract book details",
        "return_format": "raw",
        "extraction_schema": {
            "name": "BookDetails",
            "description": "Product information from a book listing",
            "schema": "{\"type\":\"object\",\"properties\":{\"title\":{\"type\":\"string\"},\"price\":{\"type\":\"string\"},\"availability\":{\"type\":\"string\"},\"description\":{\"type\":\"string\"}},\"required\":[\"title\",\"price\"]}"
        },
        "metadata": true,
        "request": "chrome"
    }
)

print(response.json())
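
In the requests above, the schema field inside extraction_schema is a JSON-encoded string, which is tedious to escape by hand. A small sketch of generating it from a plain dict with `json.dumps` follows; `make_extraction_schema` is a hypothetical helper, and it assumes the API expects the string form shown in the examples.

```python
import json


def make_extraction_schema(name: str, description: str, schema: dict) -> dict:
    """Build an extraction_schema object with the schema serialized to a string."""
    return {
        "name": name,
        "description": description,
        # Compact separators reproduce the whitespace-free style of the examples.
        "schema": json.dumps(schema, separators=(",", ":")),
    }


book_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "string"},
        "availability": {"type": "string"},
        "description": {"type": "string"},
    },
    "required": ["title", "price"],
}

extraction_schema = make_extraction_schema(
    "BookDetails",
    "Product information from a book listing",
    book_schema,
)
print(extraction_schema["schema"])
```

The resulting dict can be passed directly as the extraction_schema value in the json= argument of the Python request.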

Example Response

{
  "status": 200,
  "content": "...",
  "metadata": {
    "extracted_data": {
      "title": "A Light in the Attic",
      "price": "£51.77",
      "availability": "In stock (22 available)",
      "description": "It's hard to imagine a world without A Light in the Attic..."
    }
  },
  "costs": {
    "ai_cost": 0.001,
    "total_cost": 0.002
  }
}
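
With metadata enabled, the structured result sits under metadata.extracted_data, as in the example above. The sketch below pulls it out and checks that the fields the schema marked as required are present; `get_extracted_data` is a hypothetical helper, and the sample body is a trimmed copy of the example response.

```python
def get_extracted_data(body: dict, required=()) -> dict:
    """Return metadata.extracted_data, verifying that required keys exist."""
    data = body.get("metadata", {}).get("extracted_data")
    if data is None:
        raise KeyError("response has no metadata.extracted_data")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return data


# Trimmed copy of the example response above.
sample = {
    "status": 200,
    "content": "...",
    "metadata": {
        "extracted_data": {
            "title": "A Light in the Attic",
            "price": "£51.77",
        }
    },
    "costs": {"ai_cost": 0.001, "total_cost": 0.002},
}

book = get_extracted_data(sample, required=("title", "price"))
print(book["title"], book["price"])
print("total cost:", sample["costs"]["total_cost"])
```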
POST /ai/browser

AI Browser

Automate browser interactions using natural language. Describe the actions in plain English and the AI configures the automation.

Parameters

Name        Type     Required  Description
url         string   required  URL to automate
prompt      string   required  Natural language description of browser actions
wait_for    number   optional  Wait time between actions in ms
screenshot  boolean  optional  Capture screenshot after actions
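
When an automation has several steps, it can be convenient to keep them as a list and join them into a single prompt. This is one possible convention, not something the API requires: the comma-joined format simply mirrors the example prompt in this section, and `build_browser_payload` is a hypothetical helper.

```python
from typing import Any, Dict, List, Optional


def build_browser_payload(
    url: str,
    steps: List[str],
    wait_for: Optional[int] = None,
    screenshot: bool = False,
) -> Dict[str, Any]:
    """Assemble a JSON body for POST /ai/browser from a list of steps."""
    cleaned = [step.strip() for step in steps if step.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty step is required")

    payload: Dict[str, Any] = {"url": url, "prompt": ", ".join(cleaned)}
    if wait_for is not None:
        payload["wait_for"] = wait_for  # wait time between actions, in ms
    if screenshot:
        payload["screenshot"] = True
    return payload


browser_payload = build_browser_payload(
    "https://example.com/login",
    [
        "Click the sign in button",
        "wait for the form",
        "fill email with test@example.com",
    ],
    wait_for=2000,
)
print(browser_payload["prompt"])
```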

Example Request

cURL

curl -X POST https://api.spider.cloud/ai/browser \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "url": "https://example.com/login",
  "prompt": "Click the sign in button, wait for the form, fill email with test@example.com",
  "wait_for": 2000
}'

Python

import requests

response = requests.post(
    "https://api.spider.cloud/ai/browser",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "url": "https://example.com/login",
        "prompt": "Click the sign in button, wait for the form, fill email with test@example.com",
        "wait_for": 2000
    }
)

print(response.json())

Example Response

{
  "status": 200,
  "data": {
    "success": true,
    "actions_performed": [
      "clicked: sign in button",
      "waited: 1000ms",
      "filled: email field"
    ],
    "screenshot": "base64..."
  },
  "ai_cost": 0.005
}
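
The screenshot value in the example above is a placeholder. Assuming the real field carries a plain base64-encoded image with no data-URI prefix (the reference does not state the image format), it could be decoded roughly like this; `decode_screenshot` is a hypothetical helper.

```python
import base64
import binascii
from typing import Optional


def decode_screenshot(body: dict) -> Optional[bytes]:
    """Return the decoded screenshot bytes, or None if no screenshot was returned."""
    encoded = body.get("data", {}).get("screenshot")
    if not encoded:
        return None
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError("screenshot is not valid base64") from exc


# Stand-in response: the payload here is a few fake bytes, not a real image.
fake_body = {
    "status": 200,
    "data": {
        "success": True,
        "screenshot": base64.b64encode(b"\x89PNG\r\n").decode("ascii"),
    },
}

image_bytes = decode_screenshot(fake_body)
print(len(image_bytes), "bytes decoded")
```

The returned bytes can then be written to disk with `open(path, "wb")` once the image format is known.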