POST /links

Links API

Discover every URL on a website without the overhead of content extraction. The Links endpoint crawls a domain and returns the complete link structure, faster and cheaper than a full crawl when you only need URLs.

Links vs. Crawl: Choose the Right Tool

Use Links When

  • You only need the list of URLs, not the page content
  • You're building a sitemap or URL index
  • You want to discover pages before selectively scraping them
  • Cost efficiency matters since Links skips content processing

Use Crawl When

  • You need the actual page content alongside URLs
  • You want markdown, text, or HTML output
  • You need metadata, headers, or cookies from each page
  • You're building a dataset or knowledge base

Key Capabilities

Lower Cost per URL

By skipping content extraction and format conversion, the Links endpoint uses fewer credits per page. Ideal for large-scale discovery across thousands of pages.

Faster Response

Without the overhead of rendering and extracting content, Links returns results faster. Lower latency means quicker iteration on URL discovery workflows.

Full Crawl Parameters

Use the same depth, limit, subdomain, and TLD controls as the crawl endpoint. Fine-tune discovery scope with all the same knobs.
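
A minimal sketch using the Python client from the examples below; depth, limit, and subdomains appear elsewhere on this page, while the tld flag name is an assumption mirroring the crawl endpoint's TLD control.

from spider import Spider

client = Spider()

# Scope link discovery with the same controls as a full crawl
links = client.links(
    "https://example.com",
    params={
        "limit": 2000,       # stop after 2,000 URLs
        "depth": 3,          # follow links up to 3 hops from the start page
        "subdomains": True,  # include URLs on subdomains
        "tld": False,        # assumed flag name for the TLD control
    },
)
print(f"Discovered {len(links)} URLs")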

Subdomain Discovery

Enable subdomains to discover URLs across all subdomains of a domain. Map the full structure of organizations with a complex web presence.
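
A quick sketch of grouping results by host to see which subdomains were found, assuming the list-of-dicts response shape used in the Python example below.

from collections import Counter
from urllib.parse import urlparse

from spider import Spider

client = Spider()
links = client.links("https://example.com", params={"subdomains": True})

# Count how many discovered URLs live on each subdomain
hosts = Counter(urlparse(link["url"]).netloc for link in links)
for host, count in hosts.most_common():
    print(host, count)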

External Domain Linking

Track outbound links to external domains using external_domains. Analyze a site's link profile and relationships.
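
A sketch of collecting the distinct external hosts a site links out to. The external_domains parameter is named above; treating it as a boolean and reusing the response shape from the Python example below are assumptions.

from urllib.parse import urlparse

from spider import Spider

client = Spider()
links = client.links(
    "https://example.com",
    params={"external_domains": True},  # assumed to be a boolean toggle
)

# Keep only hosts outside the target domain
external_hosts = {
    urlparse(link["url"]).netloc
    for link in links
    if not urlparse(link["url"]).netloc.endswith("example.com")
}
print(sorted(external_hosts))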

Streaming Output

Stream discovered URLs as they're found using the JSONL content type. Build real-time pipelines that process URLs as the crawl progresses.
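
A rough sketch of consuming the stream from Python, mirroring the cURL example below; it assumes each streamed line is a standalone JSON object with a url field.

import json
import os

import requests

payload = {"url": "https://example.com", "limit": 1000, "subdomains": True}

# Request JSONL so URLs arrive line by line as the crawl progresses
resp = requests.post(
    "https://api.spider.cloud/links",
    headers={
        "Authorization": f"Bearer {os.environ['SPIDER_API_KEY']}",
        "Content-Type": "application/jsonl",
    },
    data=json.dumps(payload),
    stream=True,
)
for line in resp.iter_lines():
    if line:
        record = json.loads(line)
        print(record.get("url"))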

Code Examples

Discover all URLs on a website (Python)
from spider import Spider

client = Spider()

# Get all URLs from a website
links = client.links(
    "https://example.com",
    params={
        "limit": 0,  # No limit - discover everything
        "subdomains": True,
    }
)

for link in links:
    print(link["url"])
print(f"Total: {len(links)} URLs found")
Collect links with streaming (cURL)
curl -X POST https://api.spider.cloud/links \
  -H "Authorization: Bearer $SPIDER_API_KEY" \
  -H "Content-Type: application/jsonl" \
  -d '{
    "url": "https://example.com",
    "limit": 1000,
    "depth": 5,
    "subdomains": true
  }'
Discover then selectively scrape (JavaScript)
import Spider from "@spider-cloud/spider-client";

const client = new Spider();

// Step 1: Discover all URLs
const links = await client.links("https://example.com", {
  limit: 500,
});

// Step 2: Filter for blog posts
const blogUrls = links
  .map(l => l.url)
  .filter(url => url.includes("/blog/"));

// Step 3: Scrape only the blog posts
const content = await client.scrape(blogUrls.join(","), {
  return_format: "markdown",
});

Popular Links API Use Cases

Sitemap Generation

Build comprehensive sitemaps by discovering every URL on a website. Find pages that might be missing from the existing sitemap.xml.
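
A minimal sketch that writes discovered URLs into a sitemap.xml, reusing the Python client from the examples above.

from xml.sax.saxutils import escape

from spider import Spider

client = Spider()
links = client.links("https://example.com", params={"limit": 0})

# Emit a minimal sitemap.xml containing every discovered URL
entries = "\n".join(f"  <url><loc>{escape(link['url'])}</loc></url>" for link in links)
with open("sitemap.xml", "w") as f:
    f.write(
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )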

SEO Link Auditing

Map internal link structure to identify orphan pages, broken links, and opportunities to improve site architecture for better search ranking.
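
One way to surface orphan candidates, sketched under the assumption that the site publishes a standard sitemap.xml: pages listed there but never reached by link discovery are likely unlinked internally.

import xml.etree.ElementTree as ET
from urllib.request import urlopen

from spider import Spider

client = Spider()
discovered = {link["url"] for link in client.links("https://example.com", params={"limit": 0})}

# Parse the published sitemap and compare it to what link discovery actually reached
tree = ET.parse(urlopen("https://example.com/sitemap.xml"))
ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
sitemap_urls = {loc.text.strip() for loc in tree.findall(".//sm:loc", ns)}

for url in sorted(sitemap_urls - discovered):
    print("possible orphan:", url)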

Pre-Crawl Discovery

Discover URLs first, filter them programmatically, then scrape only the pages you need. More efficient than crawling everything when you only want specific page types.

Change Detection

Periodically collect links to detect new pages, removed pages, or URL structure changes. Monitor website growth and content strategy shifts.
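
A simple sketch of snapshot-based change detection: store the discovered URL set, then diff it against the previous run. The snapshot filename is arbitrary.

import json
from pathlib import Path

from spider import Spider

client = Spider()
current = {link["url"] for link in client.links("https://example.com", params={"limit": 0})}

snapshot = Path("links_snapshot.json")
previous = set(json.loads(snapshot.read_text())) if snapshot.exists() else set()

print("new pages:", sorted(current - previous))
print("removed pages:", sorted(previous - current))

# Save the current set for the next run
snapshot.write_text(json.dumps(sorted(current)))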
