LangChain Integration

Use Spider as a LangChain document loader to feed crawled web content directly into your LLM chains and RAG pipelines.

Install

Install the LangChain community package, which provides SpiderLoader, and the Spider client:

Install with pip

pip install langchain-community spider_client

Set your API key as SPIDER_API_KEY in your environment.
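A minimal sketch of reading the key at startup, so that a missing variable fails early instead of at request time. The helper name get_spider_api_key is hypothetical and is not part of the Spider client or LangChain.

```python
import os
from typing import Mapping, Optional


def get_spider_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Spider API key from the environment, or raise if it is unset.

    Hypothetical helper for illustration; not part of spider_client or LangChain.
    """
    source = os.environ if env is None else env
    key = source.get("SPIDER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SPIDER_API_KEY is not set; export it before running.")
    return key
```

The returned value can then be passed as api_key when constructing the loader.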

Usage

Load documents from a URL using the SpiderLoader.

Scrape with LangChain in Python

from langchain_community.document_loaders import SpiderLoader

loader = SpiderLoader(
    api_key="YOUR_API_KEY",
    url="https://spider.cloud",
    mode="scrape",
    # params={}  # Optional parameters
)

data = loader.load()
print(data)
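The load() call returns a list of LangChain Document objects, each carrying page_content and metadata. Printing the whole list can be noisy, so the sketch below builds a one-line preview per document. The helper preview_documents is hypothetical and relies only on those two attributes.

```python
from typing import Any, Iterable, List


def preview_documents(docs: Iterable[Any], max_chars: int = 80) -> List[str]:
    """Build one short preview line per document.

    Hypothetical helper; expects objects exposing `page_content` and
    `metadata`, as LangChain Document objects do.
    """
    lines = []
    for index, doc in enumerate(docs):
        # Collapse runs of whitespace so the preview fits on one line.
        text = " ".join(doc.page_content.split())
        # List only the metadata field names, in sorted order.
        keys = ", ".join(sorted(doc.metadata))
        lines.append(f"[{index}] {text[:max_chars]} (metadata: {keys or 'none'})")
    return lines
```

For example, passing the list returned by loader.load() to preview_documents and printing each line gives a quick check of what was collected before the documents go into a chain.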

Modes

Set the mode to control how Spider collects content.

  • scrape: Default mode that scrapes a single URL
  • crawl: Crawls a website and follows links to scrape all pages
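Switching between the two modes only changes the mode argument. The sketch below validates the mode before building the loader arguments and then runs a crawl. Both build_loader_kwargs and crawl_site are hypothetical helpers; the loader import is deferred so the validation step can be used on its own.

```python
from typing import Any, Dict

VALID_MODES = ("scrape", "crawl")  # the two modes listed above


def build_loader_kwargs(url: str, api_key: str, mode: str = "scrape") -> Dict[str, Any]:
    """Validate the mode and return keyword arguments for SpiderLoader.

    Hypothetical helper for illustration only.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return {"url": url, "api_key": api_key, "mode": mode}


def crawl_site(url: str, api_key: str):
    """Crawl a site and return the loaded documents (performs network requests)."""
    # Imported here so the validation helper works without LangChain installed.
    from langchain_community.document_loaders import SpiderLoader

    loader = SpiderLoader(**build_loader_kwargs(url, api_key, mode="crawl"))
    return loader.load()
```

Because crawl mode follows links, it can return many documents where scrape returns content for a single URL.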

Custom Parameters

Pass a params dictionary to customize the request. If params is omitted, it defaults to return_format: "markdown" and metadata: True. See the full list of parameters in the API reference.
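A sketch of building a params dictionary on top of the defaults stated above. The helper merge_params is hypothetical, and the limit key is an assumed parameter name used only for illustration; confirm the accepted names in the API reference. The defaults are restated explicitly on the assumption that a caller-supplied dictionary may replace them rather than be merged with them.

```python
from typing import Any, Dict, Optional

# Defaults stated in this section.
DEFAULT_PARAMS: Dict[str, Any] = {"return_format": "markdown", "metadata": True}


def merge_params(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return a new dict holding the defaults plus any caller overrides."""
    merged = dict(DEFAULT_PARAMS)  # copy so the defaults are never mutated
    merged.update(overrides or {})
    return merged


# "limit" is an assumed parameter name; confirm it in the API reference.
params = merge_params({"limit": 5})
```

The resulting dictionary can be passed as params=params when constructing the loader.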