Rate Limits
The Spider API enforces rate limits to ensure fair usage and service stability. Rate limits vary by plan and are applied on a per-minute basis.
Rate Limit Tiers
Each plan has different rate limits. Upgrade your plan to increase your limits.
| Plan | Requests / Minute | Concurrent Requests |
|---|---|---|
| Free | 20 | 1 |
| Starter | 100 | 5 |
| Growth | 500 | 20 |
| Enterprise | Custom | Custom |
Rate Limit Headers
Every API response includes rate limit information in the response headers. Use these headers to monitor your usage and avoid hitting limits.
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | The maximum number of requests allowed in the current window. |
| `X-RateLimit-Remaining` | The number of requests remaining in the current window. |
| `X-RateLimit-Reset` | Unix timestamp (in seconds) when the rate limit window resets. |
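As a minimal sketch of putting these headers to use, the Python snippet below checks `X-RateLimit-Remaining` after a response and sleeps until `X-RateLimit-Reset` when the budget is exhausted. The request body is illustrative, and the snippet assumes the headers are present and parse as integers.

```python
import os
import time

import requests

headers = {
    'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
    'Content-Type': 'application/json',
}

response = requests.post(
    'https://api.spider.cloud/crawl',
    headers=headers,
    json={"url": "https://example.com", "limit": 5},
)

# Read the rate limit headers from the response (assumed present).
remaining = int(response.headers.get('X-RateLimit-Remaining', 1))
reset_at = int(response.headers.get('X-RateLimit-Reset', 0))

if remaining == 0:
    # Budget exhausted: wait until the window resets before the next request.
    wait = max(reset_at - time.time(), 0)
    print(f"Rate limit exhausted, sleeping {wait:.0f}s")
    time.sleep(wait)
```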
Handling 429 Errors
When you exceed the rate limit, the API returns a 429 Too Many Requests response. Implement exponential backoff to retry requests gracefully.
Python - Exponential Backoff
```python
import requests
import time
import os

def request_with_backoff(url, headers, json_data, max_retries=5):
    """Make an API request with exponential backoff on rate limits."""
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=json_data)
        if response.status_code == 429:
            # Wait until the window resets, or fall back to exponential
            # backoff if the header is missing.
            retry_after = int(response.headers.get('X-RateLimit-Reset', 0))
            wait_time = max(retry_after - time.time(), 2 ** attempt)
            print(f"Rate limited. Retrying in {wait_time:.1f}s...")
            time.sleep(wait_time)
            continue
        return response
    raise Exception("Max retries exceeded")

headers = {
    'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
    'Content-Type': 'application/json',
}

result = request_with_backoff(
    'https://api.spider.cloud/crawl',
    headers,
    {"url": "https://example.com", "limit": 5},
)
print(result.json())
```
Node.js - Exponential Backoff
```javascript
async function requestWithBackoff(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status === 429) {
      // Wait until the window resets, or fall back to exponential
      // backoff if the header is missing.
      const resetTime = response.headers.get('X-RateLimit-Reset');
      const waitTime = resetTime
        ? Math.max(Number(resetTime) - Date.now() / 1000, 1)
        : Math.pow(2, attempt);
      console.log(`Rate limited. Retrying in ${waitTime.toFixed(1)}s...`);
      await new Promise((r) => setTimeout(r, waitTime * 1000));
      continue;
    }
    return response;
  }
  throw new Error('Max retries exceeded');
}

const result = await requestWithBackoff('https://api.spider.cloud/crawl', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.SPIDER_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ url: 'https://example.com', limit: 5 }),
});
console.log(await result.json());
```
Pro Tip:
Use streaming for large crawls instead of making many individual requests. Streaming uses a single connection and is not subject to per-request rate limits.
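The exact streaming interface is not shown here; as a rough sketch, the Python example below assumes the crawl endpoint can deliver results as newline-delimited JSON over a single held-open connection, consumed with `stream=True` on the client. Treat the response format and request options as assumptions and check the streaming documentation for what the API actually supports.

```python
import json
import os

import requests

headers = {
    'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
    'Content-Type': 'application/json',
}

with requests.post(
    'https://api.spider.cloud/crawl',
    headers=headers,
    json={"url": "https://example.com", "limit": 50},
    stream=True,  # keep the connection open and read results as they arrive
) as response:
    response.raise_for_status()
    for line in response.iter_lines():
        if not line:
            continue
        page = json.loads(line)  # assumes one JSON object per line
        print(page.get("url"))
```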
Best Practices
Follow these guidelines to stay within rate limits and get the best performance.
- Batch requests: Use the `limit` parameter to crawl multiple pages in a single request instead of making separate requests for each page.
- Monitor headers: Check `X-RateLimit-Remaining` before making additional requests. Pause when approaching zero.
- Use webhooks: For long-running crawls, set up webhooks to receive results asynchronously instead of polling.
- Cache results: Store crawl results locally to avoid re-fetching the same pages (see the sketch after this list).
- Use streaming: For crawling many pages, use concurrent streaming to process results as they arrive.
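To illustrate the caching guideline, the sketch below stores each crawl response on disk, keyed by a hash of the request body, so repeating an identical request is served locally instead of spending rate-limit budget. The cache directory and request shape are illustrative choices, not part of the API.

```python
import hashlib
import json
import os
from pathlib import Path

import requests

CACHE_DIR = Path(".spider_cache")  # illustrative location, not part of the API
CACHE_DIR.mkdir(exist_ok=True)

def cached_crawl(params):
    """Crawl via the API, reusing a local copy if this exact request was made before."""
    key = hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()
    cache_file = CACHE_DIR / f"{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())

    response = requests.post(
        'https://api.spider.cloud/crawl',
        headers={
            'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
            'Content-Type': 'application/json',
        },
        json=params,
    )
    response.raise_for_status()
    data = response.json()
    cache_file.write_text(json.dumps(data))
    return data

pages = cached_crawl({"url": "https://example.com", "limit": 5})
```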