AI API Reference

Spider's AI endpoints add natural language understanding to standard web data extraction. Each AI endpoint accepts every parameter of its corresponding standard endpoint (for example, /ai/crawl accepts all /crawl parameters), plus AI-specific parameters such as prompt and extraction_schema.

Base URL: https://api.spider.cloud

POST /ai/scrape
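
To make the naming convention concrete, here is a minimal sketch that derives an AI endpoint URL from a standard endpoint path and merges standard parameters with the AI-specific ones. The helpers `ai_endpoint_url` and `build_ai_payload` are hypothetical names used for illustration only; they are not part of any Spider SDK.

```python
from typing import Any, Dict, Optional

# Base URL as stated in this reference.
BASE_URL = "https://api.spider.cloud"


def ai_endpoint_url(standard_path: str) -> str:
    """Map a standard endpoint path (e.g. "/crawl") to its AI counterpart."""
    return f"{BASE_URL}/ai/{standard_path.strip('/')}"


def build_ai_payload(
    standard_params: Dict[str, Any],
    prompt: str,
    extraction_schema: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge standard endpoint parameters with the AI-specific ones."""
    payload = dict(standard_params)  # copy so the caller's dict is untouched
    payload["prompt"] = prompt
    if extraction_schema is not None:
        payload["extraction_schema"] = extraction_schema
    return payload


# Hypothetical usage: /crawl parameters reused unchanged for /ai/crawl.
print(ai_endpoint_url("/crawl"))
print(build_ai_payload({"url": "https://example.com"}, "Extract book details"))
```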

AI Scrape

Extract structured data from any webpage using plain English prompts. This endpoint accepts all /scrape parameters plus the AI-specific ones listed below. The AI identifies and extracts the data you describe; pass extraction_schema to get typed JSON output.

Parameters

| Name | Type | Status | Description |
| --- | --- | --- | --- |
| url | string | required | URL to scrape |
| prompt | string | required | Natural language description of data to extract |
| return_format | string | optional | Output format: json, markdown, raw, html, text. Defaults to empty (only extracted data returned) |
| extraction_schema | object | optional | JSON Schema for structured extraction with name, description, and schema fields |
| metadata | boolean | optional | Include metadata with extracted_data in response |
| request | string | optional | Request type: http or chrome for JavaScript rendering |
| cleaning_intent | "extraction" \| "action" \| "general" | optional | Smart HTML cleaning: extraction (aggressive), action (preserves interactivity), general (balanced) |
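
The example requests on this page pass the `schema` field as a JSON-encoded string. Rather than escaping that string by hand, one option is to write the schema as a Python dict and serialize it with `json.dumps`. The sketch below does that and also checks `cleaning_intent` against the three documented values before sending anything. `make_extraction_schema` and `build_scrape_payload` are hypothetical helpers, and the client-side validation is an assumption about what is useful, not documented API behavior.

```python
import json
from typing import Any, Dict, Optional

# The three values documented for cleaning_intent.
ALLOWED_CLEANING_INTENTS = {"extraction", "action", "general"}


def make_extraction_schema(name: str, description: str, schema: Dict[str, Any]) -> Dict[str, str]:
    """Build the extraction_schema object, encoding the JSON Schema as a string."""
    return {"name": name, "description": description, "schema": json.dumps(schema)}


def build_scrape_payload(
    url: str,
    prompt: str,
    extraction_schema: Optional[Dict[str, str]] = None,
    cleaning_intent: Optional[str] = None,
    request: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a /ai/scrape request body, omitting optional fields left unset."""
    if cleaning_intent is not None and cleaning_intent not in ALLOWED_CLEANING_INTENTS:
        raise ValueError(f"cleaning_intent must be one of {sorted(ALLOWED_CLEANING_INTENTS)}")
    payload: Dict[str, Any] = {"url": url, "prompt": prompt}
    if extraction_schema is not None:
        payload["extraction_schema"] = extraction_schema
    if cleaning_intent is not None:
        payload["cleaning_intent"] = cleaning_intent
    if request is not None:
        payload["request"] = request
    return payload


book_schema = make_extraction_schema(
    "BookDetails",
    "Product information from a book listing",
    {
        "type": "object",
        "properties": {"title": {"type": "string"}, "price": {"type": "string"}},
        "required": ["title", "price"],
    },
)
payload = build_scrape_payload(
    "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
    "Extract book details",
    extraction_schema=book_schema,
    cleaning_intent="extraction",
    request="chrome",
)
print(json.dumps(payload, indent=2))
```

The resulting dict can be passed as the `json=` argument of `requests.post`.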

Example Request

cURL
curl -X POST https://api.spider.cloud/ai/scrape \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
  "prompt": "Extract book details",
  "extraction_schema": {
    "name": "BookDetails",
    "description": "Product information from a book listing",
    "schema": "{\"type\":\"object\",\"properties\":{\"title\":{\"type\":\"string\"},\"price\":{\"type\":\"string\"},\"availability\":{\"type\":\"string\"},\"description\":{\"type\":\"string\"}},\"required\":[\"title\",\"price\"]}"
  },
  "request": "chrome"
}'
Python
import requests

response = requests.post(
    "https://api.spider.cloud/ai/scrape",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
        "prompt": "Extract book details",
        "extraction_schema": {
            "name": "BookDetails",
            "description": "Product information from a book listing",
            "schema": "{\"type\":\"object\",\"properties\":{\"title\":{\"type\":\"string\"},\"price\":{\"type\":\"string\"},\"availability\":{\"type\":\"string\"},\"description\":{\"type\":\"string\"}},\"required\":[\"title\",\"price\"]}"
        },
        "request": "chrome"
    }
)

print(response.json())

Example Response

[
  {
    "content": null,
    "costs": {
      "ai_cost": 0,
      "ai_cost_formatted": "0",
      "bytes_transferred_cost": 0.000009658,
      "compute_cost": 0.000006366,
      "total_cost": 0.000017,
      "total_cost_formatted": "0.000017"
    },
    "duration_elapsed_ms": 3824,
    "error": null,
    "metadata": {
      "extracted_data": {
        "title": "A Light in the Attic",
        "price": "£51.77",
        "availability": "In stock (22 available)",
        "upc": "a897fe39b1053632",
        "product_type": "Books"
      }
    },
    "status": 200,
    "url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html"
  }
]
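
A response shaped like the one above can be unpacked with a few lines of Python. This is a minimal sketch that assumes the body is always a list of result objects with the fields shown in the example, and that a non-null `error` marks a failed result; `extracted_records` and `total_cost` are hypothetical helper names.

```python
from typing import Any, Dict, List, Tuple


def extracted_records(response_body: List[Dict[str, Any]]) -> List[Tuple[str, Dict[str, Any]]]:
    """Return (url, extracted_data) pairs for results that carry no error."""
    records = []
    for item in response_body:
        if item.get("error") is not None:
            continue  # assumption: a non-null error means the result failed
        metadata = item.get("metadata") or {}
        data = metadata.get("extracted_data")
        if data is not None:
            records.append((item.get("url"), data))
    return records


def total_cost(response_body: List[Dict[str, Any]]) -> float:
    """Sum costs.total_cost across all results, treating missing values as 0."""
    return sum((item.get("costs") or {}).get("total_cost", 0) for item in response_body)


# Trimmed copy of the example response, standing in for response.json().
sample = [
    {
        "content": None,
        "costs": {"ai_cost": 0, "total_cost": 0.000017},
        "error": None,
        "metadata": {
            "extracted_data": {"title": "A Light in the Attic", "price": "£51.77"}
        },
        "status": 200,
        "url": "https://books.toscrape.com/catalogue/a-light-in-the-attic_1000/index.html",
    }
]

for url, data in extracted_records(sample):
    print(url, data["title"], data["price"])
print(total_cost(sample))
```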

POST /ai/browser

AI Browser

Automate browser interactions using natural language. This endpoint accepts all browser automation parameters plus AI-specific ones. Describe the actions in plain English and the AI configures the automation.

Parameters

| Name | Type | Status | Description |
| --- | --- | --- | --- |
| url | string | required | URL to automate |
| prompt | string | required | Natural language description of browser actions |
| wait_for | number | optional | Wait time between actions in ms |
| screenshot | boolean | optional | Capture screenshot after actions |
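
When the actions are generated programmatically, it can be convenient to keep them as a list and join them into a single prompt. The sketch below does that and applies basic checks to the optional parameters. `build_browser_payload` is a hypothetical helper, and joining steps with commas simply mirrors the example prompt on this page; the API is not known to require that format.

```python
from typing import Any, Dict, List, Optional


def build_browser_payload(
    url: str,
    steps: List[str],
    wait_for: Optional[float] = None,
    screenshot: Optional[bool] = None,
) -> Dict[str, Any]:
    """Assemble a /ai/browser request body from a list of plain-English steps."""
    if not steps:
        raise ValueError("at least one step is required to build the prompt")
    payload: Dict[str, Any] = {"url": url, "prompt": ", ".join(steps)}
    if wait_for is not None:
        if wait_for < 0:
            raise ValueError("wait_for is a duration in ms and cannot be negative")
        payload["wait_for"] = wait_for
    if screenshot is not None:
        payload["screenshot"] = bool(screenshot)
    return payload


browser_payload = build_browser_payload(
    "https://example.com/login",
    [
        "Click the sign in button",
        "wait for the form",
        "fill email with test@example.com",
    ],
    wait_for=2000,
)
print(browser_payload)
```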

Example Request

cURL
curl -X POST https://api.spider.cloud/ai/browser \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "url": "https://example.com/login",
  "prompt": "Click the sign in button, wait for the form, fill email with test@example.com",
  "wait_for": 2000
}'
Python
import requests

response = requests.post(
    "https://api.spider.cloud/ai/browser",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json"
    },
    json={
        "url": "https://example.com/login",
        "prompt": "Click the sign in button, wait for the form, fill email with test@example.com",
        "wait_for": 2000
    }
)

print(response.json())

Example Response

[
  {
    "content": "<html>...</html>",
    "costs": {
      "ai_cost": 0.005,
      "total_cost": 0.006
    },
    "duration_elapsed_ms": 5200,
    "error": null,
    "metadata": {
      "extracted_data": {
        "steps": [
          "clicked: sign in button",
          "waited: 1000ms",
          "filled: email field"
        ]
      },
      "screenshot": "base64..."
    },
    "status": 200,
    "url": "https://example.com/login"
  }
]
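
The step log and screenshot in a response like the one above can be read as follows. This is a sketch under two assumptions drawn only from the example: each step is a string of the form "action: detail", and `screenshot` holds standard base64-encoded bytes (the image format is not stated here). `summarize_browser_result` is a hypothetical helper name, and the sample screenshot is a stand-in value, not real API output.

```python
import base64
from typing import Any, Dict, List, Optional, Tuple


def summarize_browser_result(
    result: Dict[str, Any],
) -> Tuple[List[Tuple[str, str]], Optional[bytes]]:
    """Split each step into (action, detail) and decode the screenshot if present."""
    metadata = result.get("metadata") or {}
    raw_steps = (metadata.get("extracted_data") or {}).get("steps", [])
    steps = []
    for step in raw_steps:
        action, _, detail = step.partition(": ")
        steps.append((action, detail))
    encoded = metadata.get("screenshot")
    screenshot = base64.b64decode(encoded) if encoded else None
    return steps, screenshot


# Stand-in for one element of response.json(); the screenshot bytes are fake.
sample_result = {
    "error": None,
    "metadata": {
        "extracted_data": {
            "steps": [
                "clicked: sign in button",
                "waited: 1000ms",
                "filled: email field",
            ]
        },
        "screenshot": base64.b64encode(b"fake image bytes").decode("ascii"),
    },
    "status": 200,
    "url": "https://example.com/login",
}

steps, screenshot = summarize_browser_result(sample_result)
for action, detail in steps:
    print(action, "->", detail)
print(len(screenshot), "screenshot bytes")
```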