
Extract Leads

Extract contact information such as emails and phone numbers from any website using Spider's AI-powered pipeline.

By Jeff Mendez

Extract Leads from Websites

Spider can extract contact information (emails, phone numbers, names, titles) from any website using AI. The crawler handles page access and anti-bot measures, while AI models pull structured contact data from the page content regardless of HTML layout.

Extract from the Dashboard

The fastest way to extract contacts is through the Spider dashboard:

  1. Crawl the target website from your dashboard.
  2. Open the page you want to extract contacts from.
  3. Click the dropdown menu and select Extract Contact.
  4. Wait 10-60 seconds for AI processing.

[Image: Extracting contacts with the Spider app]

Results appear in a grid showing name, email, phone, title, and source website.

[Image: The menu displaying the found contacts after extracting with the Spider app]

[Image: Grid display of all the contact information found for the web page]

Extract via API

Use the /pipeline/extract-contacts endpoint to extract contacts programmatically. All parameters are optional except url. Use prompt to customize how the AI handles extraction. Set store_data to save extracted contacts with the page in your dashboard.

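import json
import os

import requests

# Read the API key from the environment rather than hard-coding it.
headers = {
    'Authorization': f'Bearer {os.getenv("SPIDER_API_KEY")}',
    'Content-Type': 'application/json',
}

response = requests.post(
    'https://api.spider.cloud/v1/pipeline/extract-contacts',
    headers=headers,
    json={
        "url": "https://example.com/team",  # the only required parameter
        "limit": 1,
        "model": "gpt-4o",
        "prompt": "Extract all team member contact information",
    },
    stream=True,  # read the response line by line as it arrives
)
response.raise_for_status()  # stop early on HTTP errors such as 401

# Each non-empty line of the streamed body is parsed as one JSON document.
for line in response.iter_lines():
    if line:
        print(json.loads(line))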
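The paragraph above mentions store_data, which the request does not set. As a minimal sketch, the request body could be assembled with a small helper like the one below. build_extract_payload is a hypothetical name, not part of any Spider SDK, and the sketch assumes store_data is a boolean flag, which the guide does not state explicitly.

```python
import json


def build_extract_payload(url, prompt=None, store_data=False, **extra):
    """Assemble a JSON body for /pipeline/extract-contacts (hypothetical helper).

    Only `url` is required; other keys are left out unless they are set.
    """
    if not url:
        raise ValueError("url is required")
    payload = {"url": url}
    if prompt:
        payload["prompt"] = prompt
    if store_data:
        # Assumption: store_data is a boolean flag.
        payload["store_data"] = True
    # Pass through any other optional parameters, e.g. limit=1 or model="gpt-4o".
    payload.update(extra)
    return payload


payload = build_extract_payload(
    "https://example.com/team",
    prompt="Extract all team member contact information",
    store_data=True,
    limit=1,
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed as the json argument of requests.post in the example above.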

For large sites, you can save credits by filtering links before running extraction. First gather all URLs with the /links endpoint. Then narrow them down to pages likely to contain contacts (such as /team, /about, or /contact pages) with /pipeline/filter-links (now deprecated, see the note below). Finally, run /pipeline/extract-contacts only on the filtered set.


This pipeline approach cuts extraction costs on sites with hundreds of pages where only a few contain contact information, because the AI extraction step runs only on those few pages.

Note: The /pipeline/filter-links endpoint is deprecated. Use CSS selectors or crawl parameters to filter URLs instead.
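Another option, not described in this guide, is to narrow the URL list in your own code before calling /pipeline/extract-contacts. The sketch below assumes you already have a list of URLs (for example, from the /links endpoint) and keeps only those whose path looks like a team, about, or contact page. filter_contact_urls and CONTACT_HINTS are hypothetical names chosen for illustration, and the keyword match is a rough heuristic, not Spider's own filtering logic.

```python
from urllib.parse import urlparse

# Hypothetical hint words; adjust them to the sites you crawl.
CONTACT_HINTS = ("team", "about", "contact")


def filter_contact_urls(urls, hints=CONTACT_HINTS):
    """Keep URLs whose path has a segment containing one of the hint words.

    Matching is case-insensitive; duplicates are dropped and order is preserved.
    """
    kept = []
    seen = set()
    for url in urls:
        path = urlparse(url).path.lower()
        segments = [s for s in path.split("/") if s]
        matches = any(hint in seg for seg in segments for hint in hints)
        if matches and url not in seen:
            seen.add(url)
            kept.append(url)
    return kept


urls = [
    "https://example.com/",
    "https://example.com/blog/launch-notes",
    "https://example.com/about-us",
    "https://example.com/team",
    "https://example.com/Contact",
    "https://example.com/team",
]
candidates = filter_contact_urls(urls)
print(candidates)
```

Each URL in candidates can then be sent to /pipeline/extract-contacts. Because the match is a plain substring test, it can produce false positives (a path segment such as "steam" contains "team"), so review the hint list for your target sites.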
