
Web Search API for AI Agents: Search, Scrape, and Extract in One Call

Most AI agents need live web data, but stitching together a SERP API, a scraper, and a parser is fragile and slow. Spider's Search API combines all three into a single request. Here's how it works and why it matters for agent reliability.

7 min read · Jeff Mendez

I’ve talked to hundreds of teams building AI agents over the past year. The ones that ship something useful, that actually works in production, all run into the same bottleneck early on: the agent needs to know things that happened after the model was trained.

The training cutoff isn’t a minor annoyance. It’s a wall. Your agent can reason, plan, and write perfectly good code. But ask it about anything that happened last week and it either hallucinates or admits it doesn’t know. Neither is acceptable if you’re building something people depend on.

The obvious fix is web search. Let the agent Google things. The less obvious part is how painful that actually is to implement well.

Three services duct-taped together

Most teams end up with a stack that looks something like this:

# Step 1: Hit a SERP API
serp_results = serp_client.search("latest AI regulations 2026")
urls = [r["link"] for r in serp_results["organic_results"]]

# Step 2: Scrape each result
pages = []
for url in urls:
    try:
        html = scraper.fetch(url, render_js=True)
        text = html_to_markdown(html)
        pages.append({"url": url, "content": text})
    except Exception:
        continue  # hope for the best

# Step 3: Feed to LLM
context = "\n\n".join([p["content"] for p in pages])
response = llm.chat(f"Based on this context:\n{context}\n\nAnswer: ...")

Three API keys. Three billing models. Three sets of error handling. And an N+1 request pattern where each scrape happens sequentially because you don’t get the URLs until the search finishes.

It works in a demo notebook. In production, the cracks show up fast.

The SERP call takes 1-2 seconds. Each scrape takes 2-5 seconds. Five sequential scrapes means your user is staring at a loading spinner for 11-27 seconds before the LLM even begins generating. By then they've already switched tabs.

Then there’s the failure cascade. The SERP API returns URLs. One of them is behind Cloudflare. Another uses heavy client-side rendering your basic scraper can’t handle. A third returns a 403 because you’re hammering it from the same IP as every other developer using the same scraping service. Your agent gets back three results instead of five, and two of those are half-parsed nav menus mixed with article text.

You end up writing more glue code than agent logic. Retry wrappers, HTML cleaning pipelines, fallback scraping strategies, timeout handling. All of it just to answer the question “what happened today?”
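That glue typically looks something like the wrapper below: a retry shell around the same hypothetical scraper client from the snippet above. The attempt count and backoff numbers are arbitrary placeholders, and real code would branch on status codes rather than catching everything.

```python
import time

def fetch_with_retries(scraper, url, attempts=3, timeout=10, backoff=1.0):
    """Retry transient scrape failures with exponential backoff.

    `scraper` is the same hypothetical client as in the snippet above;
    none of this logic helps answer the user's actual question.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return scraper.fetch(url, render_js=True, timeout=timeout)
        except Exception as exc:  # real code would branch on status codes
            last_error = exc
            time.sleep(backoff * (2 ** attempt))
    raise last_error
```

Multiply that by timeouts, HTML cleaning, and fallback scraping strategies, and the glue quickly outweighs the agent.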

One call instead of six

We built Spider’s Search API specifically to kill this problem. One POST request does everything: searches the web, crawls every result, handles rendering and anti-bot, cleans the content, and returns structured data.

from spider import Spider

client = Spider()

results = client.search(
    "latest AI regulations 2026",
    params={
        "search_limit": 5,
        "return_format": "markdown",
        "fetch_page_content": True,
    }
)

That’s the entire search-and-scrape pipeline. fetch_page_content is the key parameter. It tells Spider to actually visit each result URL and extract the content, not just return titles and snippets.

The response comes back with clean markdown from each page, ready to drop into an LLM context window:

[
  {
    "url": "https://example.com/ai-regulation-tracker",
    "status": 200,
    "content": "# AI Regulation Tracker\n\nThe EU AI Act entered...",
    "costs": {
      "total_cost": 0.00036
    }
  }
]

All the hard parts (proxy rotation, JavaScript rendering, CAPTCHA solving, HTML-to-markdown conversion) happen server-side. You don't manage any of it.
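Because each result carries its own costs block, per-query spend accounting is a one-liner. A minimal sketch, assuming the total_cost field shown in the response above:

```python
def total_search_cost(results: list[dict]) -> float:
    """Sum the per-page `total_cost` values from a search response."""
    return sum(r.get("costs", {}).get("total_cost", 0.0) for r in results)
```

Logging this per request makes it easy to attribute spend to individual agent runs.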

What this looks like in a real agent

Here’s the pattern we see working best. It’s simple on purpose. The fewer moving parts between the user’s question and the LLM’s answer, the more reliable the system.

from spider import Spider
from openai import OpenAI
import os

spider = Spider(api_key=os.getenv("SPIDER_API_KEY"))
openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

def answer_with_web(question: str) -> str:
    # Search and scrape in one call
    results = spider.search(
        question,
        params={
            "search_limit": 5,
            "return_format": "markdown",
            "fetch_page_content": True,
            "readability": True,
        }
    )

    # Build context from results
    context_parts = []
    for r in results:
        if r.get("content"):
            content = r["content"][:3000]
            context_parts.append(f"Source: {r['url']}\n{content}")

    context = "\n\n---\n\n".join(context_parts)

    # Generate a grounded answer
    response = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": "Answer using only the provided web sources. Cite sources by URL."
            },
            {
                "role": "user",
                "content": f"## Web Sources\n\n{context}\n\n## Question\n\n{question}"
            }
        ]
    )

    return response.choices[0].message.content

Total latency: 3-5 seconds. The Spider search call takes 1-2 seconds (it searches and scrapes all results in parallel internally), and the LLM generates the answer in another 1-3 seconds. That’s a completely grounded, cited response faster than most people can read the question.

The readability flag is worth calling out. It strips navigation, sidebars, footers, and cookie banners from the scraped content. Without it, you’re burning context window tokens on boilerplate that actively confuses the model.

Features that matter for agents specifically

A few capabilities that don't get enough attention but change how you architect things:

Batch queries. Most search APIs process one query at a time. Spider accepts an array. If your agent needs to research three topics to answer one question, you send all three as a batch and Spider runs them in parallel. One HTTP call, three searches, all results back together.

import os
import requests

headers = {"Authorization": f"Bearer {os.getenv('SPIDER_API_KEY')}"}

queries = [
    {"search": "company X earnings Q1 2026", "search_limit": 3, "fetch_page_content": True, "return_format": "markdown"},
    {"search": "company X competitor landscape", "search_limit": 3, "fetch_page_content": True, "return_format": "markdown"},
    {"search": "company X SEC filings recent", "search_limit": 3, "fetch_page_content": True, "return_format": "markdown"}
]

response = requests.post("https://api.spider.cloud/search", headers=headers, json=queries)

That's nine pages searched and scraped in a single round trip. Try doing that with three sequential SERP API calls plus nine individual scrape requests.

Time filters. When someone asks “what happened today,” you don’t want results from six months ago cluttering the context. The tbs parameter restricts results to the past hour (qdr:h), day (qdr:d), week (qdr:w), month (qdr:m), or year (qdr:y). This is the difference between a useful answer and a hallucinated one.
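A small lookup keeps those codes out of your agent logic. The helper below just mirrors the five levels listed above; the tbs parameter name is taken from the description, so verify it against the docs:

```python
# Time-filter codes, per the five levels described above
TBS = {"hour": "qdr:h", "day": "qdr:d", "week": "qdr:w", "month": "qdr:m", "year": "qdr:y"}

def fresh_search_params(window: str, limit: int = 5) -> dict:
    """Build search params restricted to a recency window ('hour' .. 'year')."""
    return {
        "search_limit": limit,
        "tbs": TBS[window],
        "fetch_page_content": True,
        "return_format": "markdown",
    }
```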

City-level geo-targeting. Pass location: "San Francisco, CA" along with country and language, and search results reflect what a user in that location would actually see. This matters for any agent handling local queries. “Best restaurants nearby” means completely different things in Tokyo and Toronto.

Auto-pagination. Set auto_pagination: true with a high search_limit, and Spider will paginate through search result pages automatically. No manual page incrementing. Useful when your agent needs deep research across 30 or 50 results instead of the usual 5.
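Combining these flags, a deep-research call might look like the sketch below. The parameter names (auto_pagination, tbs, location, country, language) are taken from the descriptions above; treat the exact schema as the docs' authority:

```python
# Hypothetical deep-research configuration combining the flags above
deep_params = {
    "search_limit": 40,
    "auto_pagination": True,   # walk result pages until the limit is reached
    "tbs": "qdr:m",            # restrict to the past month
    "location": "San Francisco, CA",
    "country": "us",
    "language": "en",
    "fetch_page_content": True,
    "return_format": "markdown",
}
# results = client.search("series B fintech funding", params=deep_params)
```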

50,000 requests per minute. If you’re building an agent that serves real traffic, rate limits matter. Firecrawl caps concurrency at 2-150 depending on your plan. Jina’s free tier is 500 RPM. Spider handles 50K. You don’t want your agent failing because twenty users asked questions at the same time.

How this compares

Honest comparison of what each search API actually gives you:

| Feature | Spider | Firecrawl | Jina | SerpAPI |
| --- | --- | --- | --- | --- |
| Search + full page scrape | One call | One call | Snippets only | Search only |
| Batch multiple queries | Yes | No | No | No |
| Auto-pagination | Yes | No | No | Manual |
| Rate limit | 50K/min | 2-150 concurrent | 500 RPM (free) | 5K/month |
| Geo-targeting | City-level | Limited | No | Country |
| Time filters | 5 levels | Yes | No | Yes |
| Output formats | Markdown, HTML, text, raw | Markdown | Markdown | JSON (no content) |
| Pricing | ~$0.003 per search+scrape | $0.83/1K base + scrape extra | Free (rate-limited) | $50/mo base |

Firecrawl is the closest feature match. The main practical differences are batch queries, the rate limit ceiling, and cost. Jina is great for quick single-page reads but doesn’t do combined search-and-scrape. SerpAPI gives you search results but you’re on your own for actually getting the content.

Streaming for faster time-to-first-token

If you're streaming the LLM response to the user, you don't need all search results before you start. Use the JSONL content type and Spider streams each result back as soon as it's scraped:

curl -X POST https://api.spider.cloud/search \
  -H "Authorization: Bearer $SPIDER_API_KEY" \
  -H "Content-Type: application/jsonl" \
  -d '{"search": "web scraping best practices", "search_limit": 10, "fetch_page_content": true, "return_format": "markdown"}'

You can start feeding the first two or three results to the LLM while results four through ten are still being fetched. This cuts perceived latency significantly.
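On the client side, consuming that stream is just line-buffered JSON parsing. A minimal sketch, independent of any particular HTTP library; feed it whatever text chunks your client yields (e.g. response.iter_content(decode_unicode=True) with requests):

```python
import json

def iter_jsonl(chunks):
    """Yield one parsed result per JSONL line as chunks arrive.

    `chunks` is any iterable of text fragments; lines may be split
    across chunk boundaries, so we buffer until a newline appears.
    """
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
    if buffer.strip():  # final line may lack a trailing newline
        yield json.loads(buffer)
```

Each yielded result can go straight into the LLM context while later pages are still in flight.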

When to use Search vs. Crawl vs. Scrape

Quick mental model for Spider’s three main endpoints:

Search is for when you don’t have URLs. You have a question or topic and need to discover relevant pages on the open web. This is the default for AI agents answering user questions.

Crawl is for when you have a starting URL and want to go deep. Follow links, index an entire site, build a knowledge base from documentation.

Scrape is for when you have specific URLs and want their content. Monitoring known pages, refreshing previously crawled data.

Most AI agents start every interaction with Search, then use Crawl or Scrape for follow-up tasks once they’ve identified the right sources.
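That decision boils down to two questions, which you can encode directly. A sketch of the mental model above, not an official SDK helper:

```python
def choose_endpoint(have_urls: bool, need_whole_site: bool) -> str:
    """Pick a Spider endpoint per the mental model above."""
    if not have_urls:
        return "search"   # discover pages for a question or topic
    if need_whole_site:
        return "crawl"    # follow links deep from a starting URL
    return "scrape"       # fetch known URLs directly
```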

Start searching

from spider import Spider

client = Spider()
results = client.search("your query", params={"search_limit": 5, "fetch_page_content": True, "return_format": "markdown"})

for r in results:
    print(r["url"], len(r.get("content", "")), "chars")

No subscription. Pay per request. Credits never expire. Average search + scrape costs less than $0.003 per query.

Create an account | Search API docs | Full search guide
