
Spider Search (SERP)

Search the web and optionally scrape results in a single API call. Built for LLM pipelines, agents, and data collection.

By Jeff Mendez · 5 min read

Searching with Spider

Spider’s search endpoint returns results in under 2 seconds. Combine it with scraping to discover and extract content in a single request. Common use cases:

  • Feeding real-time content into large language models (LLMs)
  • Building intelligent agents and data pipelines
  • Crawling and collecting fresh, targeted data

Search Endpoint Usage

POST /search

Use this endpoint to compile a list of relevant websites for crawling and resource collection.

Search Parameters

  • search (string, required): The search query to execute. Supports advanced search operators (e.g. site:, intitle:, filetype:).
  • search_limit (integer, default 0 = all): Maximum number of results to return. Acts as a filter on top of the results. Set to 0 to return all results.
  • num (integer, default 10): Number of results the search engine returns per results page.
  • page (integer, default 1): The search results page to retrieve. Use with num for manual pagination.
  • fetch_page_content (boolean, default false): When true, Spider crawls each search result URL and returns the full page content. Standard crawl parameters apply (see below).
  • location (string, optional): The geographic location the search should originate from (e.g. "San Diego, CA", "London, UK").
  • country (string, optional): Two-letter country code to prioritize results from (e.g. "us", "fr", "de").
  • language (string, optional): Two-letter language code for search results (e.g. "en", "es", "ja").
  • tbs (string, optional): Time-based search filter. Restricts results to a specific time period. See values below.
  • quick_search (boolean, default true): Enables fast search mode with parallel provider queries and automatic retries for broader coverage. Set to false for single-provider queries.
  • auto_pagination (boolean, default false): Automatically paginate through search result pages until search_limit is reached. Useful for collecting large result sets without manually incrementing page.
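As a sketch of how the pagination parameters fit together, the helper below builds a request body for a specific results page. The helper name build_search_payload is ours, purely for illustration; the field names come from the table above.

```python
def build_search_payload(query, page=1, num=10, search_limit=0):
    """Build a /search request body using the documented parameter names.

    search_limit=0 returns all results; page and num drive manual pagination.
    """
    return {
        "search": query,
        "page": page,                  # which results page to fetch
        "num": num,                    # results per page
        "search_limit": search_limit,  # 0 = no cap
    }

# Second page of 20 results, using an advanced search operator:
payload = build_search_payload("site:python.org asyncio", page=2, num=20)
```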

Time-Based Search Filters (tbs)

Value    Description
qdr:h    Past hour
qdr:d    Past 24 hours
qdr:w    Past week
qdr:m    Past month
qdr:y    Past year
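If you set tbs programmatically, a small lookup table avoids typos. The mapping below simply mirrors the table above; the dictionary name is ours.

```python
# Friendly names for the documented tbs filter values.
TBS_FILTERS = {
    "hour": "qdr:h",
    "day": "qdr:d",
    "week": "qdr:w",
    "month": "qdr:m",
    "year": "qdr:y",
}

# Restrict a query to the past week:
json_data = {"search": "AI breakthroughs", "tbs": TBS_FILTERS["week"]}
```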

Crawl Parameters (when fetch_page_content is true)

When fetch_page_content is enabled, all standard crawl parameters are available alongside search parameters. These let you control how Spider processes each result URL. Common options include:

  • return_format (string): Content format for crawled pages: "markdown", "raw", "text", "html2text", etc.
  • limit (integer): Maximum pages to crawl per result URL. Set to 1 to scrape only the landing page.
  • readability (boolean): Apply readability pre-processing to extract the main article content.
  • root_selector (string): CSS selector to extract only specific content from each page.
  • exclude_selector (string): CSS selector to exclude elements from the extracted content.
  • proxy (string): Proxy type for the crawl (e.g. "residential", "datacenter").
  • request_timeout (integer): Timeout in seconds for each page request.
  • headers (object): Custom HTTP headers to send with each crawl request.
  • locale (string): Locale for the crawl request (e.g. "en-US").
  • stealth (boolean): Enable stealth mode for the browser.
  • webhooks (object): Webhook destination to stream results to as they arrive.

For the full list of crawl parameters, see the Scraping and Crawling docs.

Example Request (Python)

import requests, os

headers = {
  'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
  'Content-Type': 'application/json',
}

json_data = {
  "search": "sports news today",
  "search_limit": 5
}

response = requests.post('https://api.spider.cloud/search', headers=headers, json=json_data)
print(response.json())

Search Results Format

The API returns structured results as an array of objects:

[
	{
		"title": "ESPN – Serving Sports Fans. Anytime. Anywhere.",
		"description": "Visit ESPN for live scores, highlights and sports news. Stream exclusive games on ESPN+ and play fantasy sports.",
		"url": "https://www.espn.com/"
	},
	{
		"title": "Sports Illustrated",
		"description": "SI.com provides sports news, expert analysis, highlights, stats and scores for the NFL, NBA, MLB, NHL, college football, soccer...",
		"url": "https://www.si.com/"
	}
]
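Because the response is a plain JSON array, post-processing is straightforward. For example, collecting the result URLs for a follow-up crawl, using the sample payload above as data:

```python
# Sample data in the documented response shape.
results = [
    {"title": "ESPN – Serving Sports Fans. Anytime. Anywhere.",
     "description": "Visit ESPN for live scores, highlights and sports news.",
     "url": "https://www.espn.com/"},
    {"title": "Sports Illustrated",
     "description": "SI.com provides sports news, expert analysis, highlights...",
     "url": "https://www.si.com/"},
]

# Extract just the URLs, e.g. to feed into a crawl queue.
urls = [result["url"] for result in results]
```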

Search and Scrape in One Request

Set fetch_page_content: true to search and scrape in one request. All standard crawl parameters work alongside search parameters, so you can control output format, depth, and proxy settings.

import requests, os

headers = {
  'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
  'Content-Type': 'application/json',
}

json_data = {
  "search": "spider web crawling",
  "return_format": "raw",
  "fetch_page_content": True,
  "search_limit": 10,
  "limit": 1
}

response = requests.post('https://api.spider.cloud/search', headers=headers, json=json_data)
print(response.json())

When fetch_page_content is enabled, the response format changes to include full crawl data:

[
	{
		"error": null,
		"status": 200,
		"duration_elasped_ms": 120,
		"costs": {
			"file_cost": 0.000363,
			"ai_cost": 0,
			"compute_cost": 7e-8,
			"transform_cost": 0,
			"total_cost": 0.00036307,
			"bytes_transferred_cost": 0
		},
		"url": "https://en.wikipedia.org/wiki/Web_crawler",
		"content": "<!DOCTYPE html><html><body>content...</body></html>"
	}
]
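Each element carries its own error and status fields, so a pipeline can filter out failed fetches before using the content. A minimal sketch over the response shape above; the failure entry ("timeout", status 0) is an assumed example, not a documented error value:

```python
# One successful fetch (documented shape) and one assumed failure entry.
pages = [
    {"error": None, "status": 200,
     "url": "https://en.wikipedia.org/wiki/Web_crawler",
     "content": "<!DOCTYPE html><html><body>content...</body></html>"},
    {"error": "timeout", "status": 0,  # hypothetical failure shape
     "url": "https://example.com/", "content": None},
]

# Keep only successfully fetched pages, keyed by URL.
ok_pages = [p for p in pages if p["error"] is None and p["status"] == 200]
documents = {p["url"]: p["content"] for p in ok_pages}
```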

This is useful for:

  • Extracting full content from top search results
  • Automated research and summarization pipelines
  • Reducing round-trips in data collection workflows

Localized Search

Use location, country, and language to get results as a user in a specific region would see them.

import requests, os

headers = {
  'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
  'Content-Type': 'application/json',
}

json_data = {
  "search": "latest sports news",
  "search_limit": 5,
  "language": "en",
  "country": "us",
  "location": "San Diego, CA"
}

response = requests.post('https://api.spider.cloud/search', headers=headers, json=json_data)
print(response.json())

Time-Based Filtering

Restrict results to a specific time period with the tbs parameter.

import requests, os

headers = {
  'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
  'Content-Type': 'application/json',
}

json_data = {
  "search": "AI breakthroughs",
  "search_limit": 10,
  "tbs": "qdr:w"
}

response = requests.post('https://api.spider.cloud/search', headers=headers, json=json_data)
print(response.json())

Auto-Pagination

Set auto_pagination: true to automatically collect results across multiple search pages until your search_limit is reached.

import requests, os

headers = {
  'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
  'Content-Type': 'application/json',
}

json_data = {
  "search": "machine learning tutorials",
  "search_limit": 50,
  "auto_pagination": True
}

response = requests.post('https://api.spider.cloud/search', headers=headers, json=json_data)
print(response.json())

Batch Multiple Queries

Send an array of query objects to execute multiple searches in a single API call. Each query runs independently and returns its own result set. Streaming is not supported for batch requests.

import requests, os

headers = {
  'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
  'Content-Type': 'application/json',
}

json_data = [
  {
    "search": "latest sports news united states",
    "search_limit": 5
  },
  {
    "search": "latest news around the globe",
    "search_limit": 5
  }
]

response = requests.post('https://api.spider.cloud/search', headers=headers, json=json_data)
print(response.json())
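A small helper can generate the batch body from a list of query strings. The helper name build_batch is ours; the body shape matches the example above.

```python
def build_batch(queries, search_limit=5):
    """Build a batch /search body: one query object per search string."""
    return [{"search": q, "search_limit": search_limit} for q in queries]

json_data = build_batch([
    "latest sports news united states",
    "latest news around the globe",
])
```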

Rate Limits

  • Up to 50,000 search requests per minute
  • Multiple search providers for redundancy
  • Distributed crawling and parsing for fetched content
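Even with generous limits, long-running pipelines should tolerate an occasional HTTP 429. A minimal exponential-backoff sketch; the retry policy is ours, not part of the API:

```python
import time

def post_with_retry(send, max_retries=3, base_delay=1.0):
    """Call send() (which returns an HTTP status code and a body) and
    retry with exponential backoff while it returns 429."""
    for attempt in range(max_retries):
        status, body = send()
        if status != 429:
            return status, body
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return send()  # final attempt, no further retries
```

In practice, send would wrap requests.post('https://api.spider.cloud/search', ...) and return (response.status_code, response.json()).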
