Point Spider at any URL and it recursively follows links to discover the pages on that domain. Pages stream back as they are found, so you can process them before the full crawl completes. Set depth limits, page caps, and output format to control exactly what you get back.
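
To illustrate processing results before the crawl finishes, here is a minimal Python sketch that consumes a JSONL stream one line at a time. It does not call Spider itself: `iter_pages` is a hypothetical helper, the in-memory `sample_stream` stands in for a streamed response body, and the `url` and `content` field names are assumptions for illustration rather than a documented schema.

```python
import io
import json
from typing import Iterable, Iterator


def iter_pages(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty JSONL line, as soon as it is read."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between records
        yield json.loads(line)


# Stand-in for a streamed response body. Field names are hypothetical.
sample_stream = io.StringIO(
    '{"url": "https://example.com/", "content": "# Home"}\n'
    "\n"
    '{"url": "https://example.com/about", "content": "# About"}\n'
)

# Each record is available as soon as its line arrives; nothing is buffered.
seen_urls = [page["url"] for page in iter_pages(sample_stream)]
print(seen_urls)
```

Because `iter_pages` is a generator, the same loop should work over any line iterator, such as a file object or a line-by-line HTTP response reader.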

Key capabilities

  • Recursive link following with configurable depth
  • Streaming JSONL output for real-time processing
  • Markdown, HTML, plain text, or raw byte output
  • Page limit controls to stay within budget
  • Automatic duplicate URL detection
  • Batch multiple seed URLs in one request
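
How depth limits, page caps, duplicate detection, and multiple seeds interact can be sketched with a small breadth-first traversal. This is an illustration of the general technique under stated assumptions, not Spider's implementation: `crawl`, `get_links`, `max_depth`, and `max_pages` are hypothetical names, and the `SITE` dictionary is a made-up link graph used in place of real HTTP fetches.

```python
from collections import deque
from typing import Callable, Iterable, Iterator


def crawl(
    seeds: Iterable[str],
    get_links: Callable[[str], Iterable[str]],
    max_depth: int,
    max_pages: int,
) -> Iterator[str]:
    """Breadth-first traversal that yields each URL at most once."""
    seen = set()
    queue = deque()
    for seed in seeds:  # several seed URLs share one queue and one seen-set
        if seed not in seen:
            seen.add(seed)
            queue.append((seed, 0))

    emitted = 0
    while queue and emitted < max_pages:  # page cap
        url, depth = queue.popleft()
        yield url  # stream the page out as soon as it is visited
        emitted += 1
        if depth >= max_depth:  # depth limit: do not expand further
            continue
        for link in get_links(url):
            if link not in seen:  # duplicate URL detection
                seen.add(link)
                queue.append((link, depth + 1))


# Hypothetical link graph standing in for a real site.
SITE = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b", "https://example.com/c"],
    "https://example.com/b": ["https://example.com/"],
    "https://example.com/c": [],
}


def links_of(url: str) -> list:
    return SITE.get(url, [])


shallow = list(crawl(["https://example.com/"], links_of, max_depth=1, max_pages=10))
capped = list(crawl(["https://example.com/"], links_of, max_depth=5, max_pages=2))
print(shallow)
print(capped)
```

A production crawler would also need to normalize URLs and restrict links to the target domain before the duplicate check. Both steps are omitted here to keep the sketch short.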