Search API
Go from a search query to fully extracted web content in a single request. Spider queries the web, crawls each result page, and returns clean, structured content, combining discovery and extraction into one workflow.
How Search + Crawl Works
You Send a Query
A plain-text search query like you'd type into a search engine
Spider Searches
Queries search engines and collects the top result URLs
Results Are Crawled
Each result page is loaded, rendered, and its content extracted
Content Returned
Clean, structured content from each result delivered to you
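The four steps above collapse into one HTTP request. A minimal sketch using only the Python standard library and the endpoint and fields shown on this page (the `build_search_request` helper is ours, added for illustration):

```python
import json
import os
import urllib.request

# Minimal sketch of the single-request workflow: one POST carries the
# query, and Spider handles search, crawl, and extraction server-side.
def build_search_request(query, search_limit=5, return_format="markdown"):
    """Assemble the JSON body for POST https://api.spider.cloud/search."""
    return {
        "search": query,
        "search_limit": search_limit,
        "return_format": return_format,
    }

payload = build_search_request("best practices for web accessibility")

# Only send the request when an API key is actually configured.
api_key = os.environ.get("SPIDER_API_KEY")
if api_key:
    req = urllib.request.Request(
        "https://api.spider.cloud/search",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        pages = json.load(resp)
```

In practice most users will reach for the official client library shown in the Code Examples section rather than raw HTTP.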
The Traditional Way
- Use a SERP API to get search result URLs
- Use a separate scraping tool to fetch each URL
- Parse and clean each page's HTML independently
- Stitch together multiple API calls, handle failures
With Spider Search
- One API call does search + crawl + extraction
- Control how many results to crawl with search_limit
- Same clean output formats as crawl and scrape
- Full proxy, caching, and rendering support built in
Key Capabilities
Configurable Result Count
Set search_limit to control how many search results Spider crawls, from a single result to dozens. Balance thoroughness against cost and latency.
Deep Crawl Results
Combine search_limit with limit to not just scrape result pages but crawl deeper into each discovered website.
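A hedged sketch of how the two parameters interact, with an illustrative query; the real call would go through the client shown in the Code Examples section:

```python
# search_limit caps how many search results are fetched, while limit
# allows deeper crawling within each discovered site.
params = {
    "search_limit": 3,   # crawl content from the top 3 results...
    "limit": 10,         # ...and up to 10 pages per discovered site
    "return_format": "markdown",
}

# With the Python client:
# results = client.search("open source vector databases", params=params)
print(params["search_limit"] * params["limit"])  # at most 30 pages total
```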
Geo-Targeted Search
Use country_code to get localized search results. See what users in different regions find for the same query.
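One way to compare regions is to hold the query constant and vary only country_code. A small illustrative sketch (the `localized` helper and query are ours, not part of the API):

```python
# Vary only country_code across otherwise identical requests.
base = {"search": "best mobile banking app", "search_limit": 5}

def localized(params, country_code):
    """Return a copy of the request params targeted at one country."""
    return {**params, "country_code": country_code}

variants = {cc: localized(base, cc) for cc in ("us", "de", "jp")}

# Each variant can then be sent as its own search request, e.g.
# client.search(base["search"], params=variants["de"])
print(sorted(variants))  # ['de', 'jp', 'us']
```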
All Output Formats
Get search-sourced content as markdown, text, HTML, or raw bytes. Apply the same formatting and cleaning as the crawl and scrape endpoints.
Metadata Enrichment
Enable metadata to get page titles, descriptions, and keywords alongside extracted content. Useful for building search indexes or knowledge bases.
Streaming Support
Use JSONL content type to stream results as each search result page is crawled and processed. Start consuming data immediately.
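With JSONL, each line of the response body is one complete JSON object, so results can be processed as they arrive. A sketch of the consuming side, run here against a simulated stream (the field names are illustrative):

```python
import json

# Each JSONL line is one crawled search result, so processing can
# start before the whole batch finishes.
def iter_jsonl(lines):
    """Yield one parsed page per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

# Simulated stream; a real one would come from the API response body.
stream = [
    '{"url": "https://example.com/a", "content": "# Page A"}',
    '{"url": "https://example.com/b", "content": "# Page B"}',
]

for page in iter_jsonl(stream):
    print(page["url"])
```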
Code Examples
from spider import Spider

client = Spider()

# Search for a topic and get content from top results
results = client.search(
    "best practices for web accessibility",
    params={
        "search_limit": 5,
        "return_format": "markdown",
        "metadata": True,
    },
)
for page in results:
    print(f"{page['url']}: {page['metadata']['title']}")

curl -X POST https://api.spider.cloud/search \
  -H "Authorization: Bearer $SPIDER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "search": "cloud hosting providers",
    "search_limit": 10,
    "return_format": "markdown",
    "country_code": "de"
  }'

Popular Search API Use Cases
RAG with Live Web Data
Feed a user's question into the search API, retrieve relevant pages, and pass the content to an LLM for grounded, up-to-date answers.
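The glue between search results and the LLM is mostly prompt assembly. A sketch under the assumption that each result exposes `url` and `content` fields (as in the examples on this page); the LLM call itself is omitted:

```python
# Turn search-sourced pages into a grounded prompt for an LLM.
def build_grounded_prompt(question, pages, max_chars=4000):
    """Pack page content into a context block, up to a character budget."""
    chunks = []
    used = 0
    for page in pages:
        text = f"Source: {page['url']}\n{page['content']}\n"
        if used + len(text) > max_chars:
            break  # stay inside the model's context budget
        chunks.append(text)
        used += len(text)
    context = "\n".join(chunks)
    return (
        "Answer using only the sources below.\n\n"
        f"{context}\nQuestion: {question}"
    )

pages = [
    {"url": "https://example.com/wcag", "content": "WCAG 2.2 overview..."},
    {"url": "https://example.com/aria", "content": "ARIA landmark roles..."},
]
prompt = build_grounded_prompt("How do I make a site accessible?", pages)
```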
Market Research
Search for competitor names, product categories, or industry terms and automatically collect the latest information from multiple sources.
Content Curation
Build automated content pipelines that discover and aggregate the best articles on specific topics for newsletters or research reports.
SERP Monitoring
Track how search results change over time for keywords that matter to your business. Compare results across different regions.
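Monitoring boils down to diffing ranked URL lists between snapshots. A minimal sketch with hypothetical snapshot data (the `serp_diff` helper is ours):

```python
# Compare two snapshots of ranked result URLs to spot ranking changes.
def serp_diff(old, new):
    """Report URLs that entered, dropped out of, or moved in the ranking."""
    return {
        "entered": [u for u in new if u not in old],
        "dropped": [u for u in old if u not in new],
        "moved": [u for u in new if u in old and old.index(u) != new.index(u)],
    }

yesterday = ["a.com", "b.com", "c.com"]
today = ["b.com", "a.com", "d.com"]
changes = serp_diff(yesterday, today)
print(changes)
```

Run the same query on a schedule (optionally per country_code) and store each snapshot to build the time series.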
Related Resources
Search the web programmatically
Combine search discovery with content extraction in a single API call.