Crawl plus AI in one request.
Contacts, Q&A pairs, category labels, AI-filtered links — pulled from a crawl in a single pipeline call. Deprecated in favor of Fetch and Scrape.
Each pulls a different shape.
Extract contacts
POST /pipeline/extract-contactsCrawl websites and use AI to identify and extract contact information including email addresses, phone numbers, social profiles, and business details. Results are stored and queryable via the contacts data API.
Questions & answers
POST /pipeline/extract-qaCrawl a website and generate structured Q&A pairs from its content. Provide an inquiry or topic and Spider produces relevant questions with answers grounded in the actual page content.
Label website
POST /pipeline/labelCrawl a website and have AI categorize it into topics, industries, or custom labels. Useful for building directories, classifying leads, or organizing large collections of URLs.
Filter links
POST /pipeline/filter-linksCrawl a website's links and use AI to filter them based on relevance, content type, or custom criteria. Keep only the URLs that match your data collection goals, eliminating noise.
Bonus: crawl from text
POST /pipeline/crawl-textPaste raw text or markdown containing URLs, and Spider will automatically extract every link and crawl them. Skip the step of parsing URLs yourself. Just send the document, email body, or notes and let Spider handle discovery. Supports up to 10 MB of input text.
cURL, Python, Node.
from spider import Spider
client = Spider()
# Extract contacts from a company website
contacts = client.extract_contacts(
"https://example.com",
params={
"limit": 50,
},
)
for contact in contacts:
print(contact)Where teams reach for it.
Sales lead generation
Crawl target company websites to extract emails, phone numbers, and team member details. Build prospect lists automatically instead of manual research.
Fine-tuning datasets
Generate Q&A pairs from documentation sites and knowledge bases to create training data for domain-specific language models and chatbots.
Website directories
Label and categorize large collections of URLs for building topical directories, industry databases, or content recommendation systems.
Smart link discovery
Filter a website's links to find only product pages, blog posts, or documentation. Skip navigation, legal pages, and irrelevant content.
More from the API.
Extract intelligence from any website.
Combine web crawling with AI understanding. Pull structured data that goes beyond raw HTML.