Search API POST /search

Search query in. Crawled pages out.

Spider runs the query against the major search engines, walks the top results, and returns extracted content. One call, one bill, one JSON shape.

Example query: "RAG best practices for LLMs"
docs.anthropic.com/claude/rag (Anthropic: RAG with Claude)
platform.openai.com/docs/guides (OpenAI: Retrieval Guide)
langchain.com/docs/rag (LangChain: RAG Tutorial)
huggingface.co/docs/rag (HuggingFace: RAG Pipeline)
Fan-out

One query, many results.

A single search request fans out across the web and converges back to structured content.

1 API call, N results discovered, N pages extracted.
How it works

Four steps from query to data.

01

You send a query

A plain-text search query — the same kind you'd type into a search engine. Set search_limit to control how many results to crawl.

02

Spider searches the web

Queries major search engines and collects the top result URLs. Use country_code for geo-targeted results.

03

Each result is crawled

Every discovered page is loaded, rendered with a real browser if needed, and its content extracted in your chosen format.

04

Clean content returned

Structured content from every result page delivered as markdown, text, HTML, or raw bytes — ready for your pipeline.
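
The four steps above map onto a single request body. A minimal sketch, assuming the query text travels in a field named `search` (an assumption; this page only names `search_limit`, `country_code`, and `return_format`). The helper `build_search_payload` is hypothetical and only builds the body; it does not send anything.

```python
from typing import Any, Dict, Optional


def build_search_payload(
    query: str,
    search_limit: int = 5,
    country_code: Optional[str] = None,
    return_format: str = "markdown",
) -> Dict[str, Any]:
    """Build a JSON-serializable body for POST /search (hypothetical helper)."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if search_limit < 1:
        raise ValueError("search_limit must be at least 1")

    payload: Dict[str, Any] = {
        "search": query,  # assumed field name for the query text
        "search_limit": search_limit,  # how many results to crawl
        "return_format": return_format,  # format of the extracted content
    }
    if country_code is not None:
        payload["country_code"] = country_code  # geo-targeted results
    return payload


payload = build_search_payload("RAG best practices for LLMs", country_code="us")
```

Leaving `country_code` out of the body when it is not set keeps the request minimal and lets the service apply its own default.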

Capabilities

What you can configure.

Configurable result count

Set search_limit to control how many search results Spider crawls — from 1 to dozens. Balance thoroughness against cost and latency.

Deep crawl results

Combine search_limit with limit to not just scrape result pages but crawl deeper into each discovered site.
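
Because the two limits multiply, it can help to estimate the page budget before sending a deep-crawl request. A rough sketch, assuming `limit` caps the pages crawled per discovered site (an interpretation of the description above, not a confirmed semantic); `max_pages` is a hypothetical helper.

```python
def max_pages(search_limit: int, limit: int) -> int:
    """Upper bound on pages fetched, assuming `limit` applies per result site."""
    if search_limit < 1 or limit < 1:
        raise ValueError("both limits must be at least 1")
    return search_limit * limit


# 5 search results, up to 10 pages crawled from each discovered site.
budget = max_pages(search_limit=5, limit=10)
print(budget)  # 50
```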

Geo-targeted search

Use country_code to get localized results. See what users in different regions find for the same query.

All output formats

Get search-sourced content as markdown, text, HTML, or raw bytes — the same formatting and cleaning as crawl and scrape.

Metadata enrichment

Enable metadata to get page titles, descriptions, and keywords alongside extracted content for building search indexes.

Streaming support

Use JSONL content type to stream results as each page is crawled and processed. Start consuming data immediately.
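
On the consuming side, a JSONL stream is one JSON object per line, so each page can be handled as soon as its line arrives. A minimal sketch using only the standard library; the sample lines and their field names are illustrative stand-ins, not the documented response shape.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed object per non-empty line of a JSONL stream."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


# Stand-in for a streamed response body; field names are assumptions.
sample_stream = [
    '{"url": "https://example.com/a", "content": "first page"}',
    "",
    '{"url": "https://example.com/b", "content": "second page"}',
]

urls = [page["url"] for page in iter_jsonl(sample_stream)]
print(urls)
```

With a real HTTP client, pass its line iterator to `iter_jsonl` in place of the sample list.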

Examples

cURL, Python, Node.

from spider import Spider

client = Spider()

# Search for a topic and get content from top results
results = client.search(
    "RAG best practices for LLMs",
    params={
        "search_limit": 5,
        "return_format": "markdown",
        "metadata": True,
    },
)

for page in results:
    print(f"{page['url']}: {page['metadata']['title']}")
Try it

Run a real search.

Type any search query. Spider searches the web, crawls the top results, and returns clean content.

No sign-in required. 25 free searches per day.

Why Spider Search

Built for production AI workloads.

10K req/min

No rate limit walls

Most search APIs cap at 100-500 concurrent requests. Spider handles 10,000 requests per minute, so your agent keeps working during traffic spikes.

N:1 batch

Batch multiple queries

Send an array of search queries in a single API call. Spider runs them all in parallel and returns grouped results. One round trip instead of N.
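
One way to picture a batched request, assuming each array element is a full search object with its own `search` field. Both the per-element shape and the `search` field name are assumptions that this page does not confirm, and `build_batch` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List


def build_batch(queries: List[str], **shared: Any) -> List[Dict[str, Any]]:
    """Pair each query with the same shared parameters (hypothetical helper)."""
    return [{"search": q, **shared} for q in queries]


batch = build_batch(
    ["RAG best practices for LLMs", "vector database benchmarks"],
    search_limit=3,
    return_format="markdown",
)
body = json.dumps(batch)  # one request body instead of one per query
```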

Auto paginate

Auto-pagination

Set search_limit to 50 and auto_pagination to true. Spider pages through search results automatically. No manual page incrementing.

Comparison

How Spider's Search compares.

Stacked against alternatives for AI and LLM use cases.

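| Feature | Spider | Firecrawl | Jina | SerpAPI |
| --- | --- | --- | --- | --- |
| Search + scrape | One call | One call | Snippets only | Search only |
| Batch queries | Yes | No | No | No |
| Auto-pagination | Yes | No | No | Manual |
| Rate limit | 10K/min | 2-150 | 500 RPM | 5K/mo |
| Geo-targeting | City-level | Limited | No | Country |
| Time filters | 5 levels | Yes | No | Yes |
| Output formats | 4+ | Markdown | Markdown | JSON |
| Avg cost/search | ~$0.003 | ~$0.01 | Free* | $0.01 |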

*Jina's free tier is rate-limited to 500 RPM with a 10M-token cap.
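
To put the cost row in concrete terms, the per-search figures can be multiplied out for a given volume. The prices are the approximate averages listed in the comparison, the 10,000-search volume is an arbitrary example, and Jina is left out because its free tier is capped instead of priced per search.

```python
from decimal import Decimal

# Approximate average cost per search in USD, copied from the comparison.
COST_PER_SEARCH = {
    "Spider": Decimal("0.003"),
    "Firecrawl": Decimal("0.01"),
    "SerpAPI": Decimal("0.01"),
}


def estimated_cost(vendor: str, searches: int) -> Decimal:
    """Estimated spend in USD for a number of searches."""
    return COST_PER_SEARCH[vendor] * searches


for vendor in COST_PER_SEARCH:
    print(f"{vendor}: ${estimated_cost(vendor, 10_000):.2f}")
```

`Decimal` is used so that the small per-search prices multiply out without floating-point rounding noise.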

AI search

Describe what you need in plain English.

Spider's AI models optimize your query, search the web, extract relevant content, and return structured results. No manual parameter tuning needed.

POST /ai/search
# Natural language search
curl -X POST https://api.spider.cloud/ai/search \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Find recent benchmarks comparing RAG frameworks",
    "search_limit": 5,
    "return_format": "markdown"
  }'
Query optimization: AI rewrites your prompt into effective search queries.
Smart extraction: Vision models pull the most relevant content from results.
Model selection: Choose Qwen, Schematron, or bring your own via OpenRouter.
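
The same call can be prepared from Python with only the standard library. This sketch mirrors the URL and fields of the cURL example, sets an explicit JSON `Content-Type` header, and stops short of sending the request. `build_ai_search_request` is a hypothetical helper and the API key is a placeholder.

```python
import json
import urllib.request

AI_SEARCH_URL = "https://api.spider.cloud/ai/search"


def build_ai_search_request(
    prompt: str, api_key: str, search_limit: int = 5
) -> urllib.request.Request:
    """Prepare (but do not send) a POST request for the AI search endpoint."""
    body = {
        "prompt": prompt,
        "search_limit": search_limit,
        "return_format": "markdown",
    }
    return urllib.request.Request(
        AI_SEARCH_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_ai_search_request(
    "Find recent benchmarks comparing RAG frameworks", api_key="YOUR_API_KEY"
)
# To send it: urllib.request.urlopen(request) (needs a valid key and network access).
```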
Use cases

Where teams reach for it.

01

RAG with live web data

Feed a user's question into the search API, retrieve relevant pages, and pass the content to an LLM for grounded, up-to-date answers. No need to maintain a static corpus.
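
A sketch of the grounding step: turning returned pages into a prompt that carries its sources. It assumes each result is a dict with `url` and `content` keys (the `content` key is an assumption), and both helpers are hypothetical. The call to the LLM itself is left out.

```python
from typing import Dict, List


def build_context(pages: List[Dict[str, str]], max_chars: int = 4000) -> str:
    """Concatenate page content with source URLs within an approximate budget."""
    parts: List[str] = []
    used = 0
    for page in pages:
        block = f"Source: {page['url']}\n{page['content']}\n"
        if used + len(block) > max_chars:
            break  # keep whole pages only; drop the rest
        parts.append(block)
        used += len(block)
    return "\n".join(parts)


def build_prompt(question: str, pages: List[Dict[str, str]]) -> str:
    """Wrap the retrieved context and the user's question in one prompt."""
    context = build_context(pages)
    return (
        "Answer using only the sources below and cite their URLs.\n\n"
        f"{context}\nQuestion: {question}"
    )


# Illustrative stand-in for search results.
pages = [
    {"url": "https://example.com/rag", "content": "Chunk documents before indexing."},
    {"url": "https://example.com/eval", "content": "Measure retrieval recall first."},
]
prompt = build_prompt("How should I start with RAG?", pages)
```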

02

Market research

Search for competitor names, product categories, or industry terms and automatically collect the latest information from multiple sources in a single request.

03

Content curation

Build automated pipelines that discover and aggregate the best articles on specific topics for newsletters, research reports, or knowledge bases.

04

SERP monitoring

Track how search results change over time for keywords that matter to your business. Compare results across regions using country_code targeting.
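
Monitoring comes down to comparing ranked URL lists between runs. A minimal sketch, assuming the result URLs from each run have already been collected into lists in rank order; `diff_results` is a hypothetical helper and the URLs are made up.

```python
from typing import Dict, List


def diff_results(previous: List[str], current: List[str]) -> Dict[str, List[str]]:
    """Compare two ranked URL lists from the same query at different times."""
    prev_set, curr_set = set(previous), set(current)
    return {
        "added": [u for u in current if u not in prev_set],
        "dropped": [u for u in previous if u not in curr_set],
        # URLs present in both snapshots whose rank changed
        "moved": [
            u
            for u in current
            if u in prev_set and previous.index(u) != current.index(u)
        ],
    }


monday = ["https://a.example", "https://b.example", "https://c.example"]
friday = ["https://b.example", "https://a.example", "https://d.example"]
changes = diff_results(monday, friday)
print(changes)
```

Running the same comparison on lists gathered with different `country_code` values shows how results differ by region.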

Get started

Search the web programmatically.

Combine search discovery with content extraction in a single API call.