# Spider

> The fastest web crawling and scraping API for AI agents, RAG pipelines, and LLMs.
> Crawl 100K+ pages per second. Get clean markdown, structured JSON, and AI-ready data.

- Website: https://spider.cloud
- API Base URL: https://api.spider.cloud
- Documentation: https://spider.cloud/docs/overview
- API Reference: https://spider.cloud/docs/api
- OpenAPI Spec: https://spider.cloud/openapi.yaml
- Pricing: https://spider.cloud/credits/new
- MCP Server: https://www.npmjs.com/package/spider-cloud-mcp (npx spider-cloud-mcp)
- Open Source: https://github.com/spider-rs/spider (MIT License)
- Discord: https://discord.spider.cloud
- Support: support@spider.cloud

## What Spider Does

Spider is a web crawling and scraping API for developers building AI applications. It converts any website into clean, structured data optimized for LLMs, RAG pipelines, and AI agent workflows.

Key capabilities:

- Crawl entire websites at 100K+ pages per second
- Output formats: Markdown, HTML, JSON, JSONL, CSV, XML, plain text
- JavaScript rendering with headless Chrome and Firefox
- AI-powered natural language extraction (no CSS selectors needed)
- Real-time web search API
- Screenshot capture
- Anti-bot bypass and proxy rotation
- Webhook and cron scheduling
- Pay-per-use pricing (no subscriptions required)

## Getting Started

- Quickstart: https://spider.cloud/docs/quickstart
- Get API Key: https://spider.cloud/api-keys
- Interactive Playground: https://spider.cloud/playground
- Concepts: https://spider.cloud/docs/concepts

## API Endpoints

### Core Endpoints

- POST /crawl - Crawl websites and extract content from multiple pages
- POST /scrape - Scrape a single page
- POST /search - Search the web, optionally crawl results
- POST /links - Extract all links from a page
- POST /screenshot - Capture page screenshots
- POST /unblocker - Access bot-protected content
- POST /transform - Convert HTML to markdown/text/other formats
- GET /data/credits - Check available credits

### AI Endpoints (Subscription Required)

Natural language web data extraction: describe what you need in plain English (a request sketch follows the list).

- POST /ai/crawl - AI-guided crawling with natural language prompts
- POST /ai/scrape - Extract structured data using plain English
- POST /ai/search - AI-enhanced semantic web search
- POST /ai/browser - Automate browser interactions with natural language
- POST /ai/links - Intelligent link extraction and filtering

AI pricing: https://spider.cloud/ai/pricing
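For example, a POST /ai/scrape request might look like the minimal sketch below. The endpoint path and Bearer authentication come from this document; the `url` and `prompt` field names are assumptions made for illustration, so consult the API reference (https://spider.cloud/docs/api) or the OpenAPI spec for the actual request schema.

```python
import requests

# Minimal sketch of an AI scrape request.
# ASSUMPTION: the "url" and "prompt" field names are illustrative and not
# confirmed by this document; see https://spider.cloud/docs/api.
response = requests.post(
    "https://api.spider.cloud/ai/scrape",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "url": "https://example.com/products",
        "prompt": "Extract each product's name and price as JSON",
    },
)
response.raise_for_status()
print(response.json())
```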
## Authentication

All requests require a Bearer token:

```
Authorization: Bearer YOUR_API_KEY
```

## SDKs & Libraries

- Python: https://spider.cloud/docs/libraries
- Node.js/JavaScript: https://spider.cloud/docs/libraries
- Rust: https://spider.cloud/docs/libraries
- Go: https://spider.cloud/docs/libraries

## Integrations

Spider integrates with popular AI frameworks:

- LangChain: https://spider.cloud/docs/integrations/langchain
- LlamaIndex: https://spider.cloud/docs/integrations/llamaindex
- CrewAI: https://spider.cloud/docs/integrations/crewai
- Agno: https://spider.cloud/docs/integrations/agno
- FlowiseAI: https://spider.cloud/docs/integrations/flowiseai
- Zapier: https://spider.cloud/docs/integrations/zapier
- x402 (Crypto Payments): https://spider.cloud/docs/integrations/x402

## Use Cases

- RAG Pipelines: https://spider.cloud/use-cases/rag
- AI Training Data: https://spider.cloud/use-cases/ai-training
- Price Monitoring: https://spider.cloud/use-cases/price-monitoring
- SEO Tracking: https://spider.cloud/use-cases/seo-tracking
- Content Aggregation: https://spider.cloud/use-cases/content-aggregation
- Lead Generation: https://spider.cloud/use-cases/lead-generation
- Market Research: https://spider.cloud/use-cases/market-research
- Website Archiving: https://spider.cloud/use-cases/website-archiving

## Guides

- Scraping & Crawling: https://spider.cloud/docs/core/scraping-crawling
- Real-time Search: https://spider.cloud/docs/core/realtime-search
- Efficient Scraping: https://spider.cloud/docs/core/efficient-scraping
- Concurrent Streaming: https://spider.cloud/docs/core/concurrent-streaming
- JSON Scraping: https://spider.cloud/docs/advanced/json-scraping
- Webhooks: https://spider.cloud/docs/core/webhooks
- Data Connectors: https://spider.cloud/docs/core/data-connectors
- Error Codes: https://spider.cloud/docs/core/error-codes
- Rate Limits: https://spider.cloud/docs/core/rate-limits
- Use Case Guides: https://spider.cloud/docs/guides/use-cases
- Recipes: https://spider.cloud/docs/guides/recipes

## Quick Example

```python
import requests

response = requests.post(
    "https://api.spider.cloud/crawl",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "url": "https://example.com",
        "limit": 10,
        "return_format": "markdown"
    }
)

for page in response.json():
    print(page["url"], len(page["content"]), "chars")
```

```javascript
const response = await fetch("https://api.spider.cloud/crawl", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    url: "https://example.com",
    limit: 10,
    return_format: "markdown"
  })
});
const pages = await response.json();
```

```bash
curl -X POST https://api.spider.cloud/crawl \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "limit": 10, "return_format": "markdown"}'
```
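Since pricing is pay-per-use, it can be useful to check your remaining balance before or after a large crawl. Below is a minimal sketch against the GET /data/credits endpoint listed under Core Endpoints; the shape of the JSON response is not documented here, so treat the printed output as something to inspect rather than a fixed schema.

```python
import requests

# Check remaining credits via GET /data/credits (listed under Core Endpoints).
# ASSUMPTION: the response's field names are not documented in this file,
# so the raw JSON body is printed instead of assuming a schema.
response = requests.get(
    "https://api.spider.cloud/data/credits",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
)
response.raise_for_status()
print(response.json())
```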