LangChain Integration
Use Spider as a LangChain document loader to feed crawled web content directly into your LLM chains and RAG pipelines.
Usage
Load documents from a URL using the SpiderLoader.
Scrape with LangChain in Python
from langchain_community.document_loaders import SpiderLoader
loader = SpiderLoader(
    api_key="YOUR_API_KEY",
    url="https://spider.cloud",
    mode="scrape",
    # params={},  # Optional parameters
)
data = loader.load()
print(data)
Modes
Set the mode to control how Spider collects content.
scrape: Default mode that scrapes a single URL.
crawl: Crawls a website and follows links to scrape all pages.
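As a rough sketch of switching between the two modes, the helper below checks the mode string before building a loader. It assumes the same SpiderLoader constructor arguments shown in the Usage example. The names make_loader and VALID_MODES are hypothetical and exist only for illustration.

```python
VALID_MODES = ("scrape", "crawl")  # the two modes described above


def make_loader(url, api_key, mode="scrape"):
    """Build a SpiderLoader for one of the documented modes (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    # Imported lazily so the mode check runs even without langchain_community installed.
    from langchain_community.document_loaders import SpiderLoader

    return SpiderLoader(api_key=api_key, url=url, mode=mode)


# Example usage (needs a valid API key and network access):
# loader = make_loader("https://spider.cloud", "YOUR_API_KEY", mode="crawl")
# docs = loader.load()
# print(len(docs))
```

Checking the mode up front is a design choice that surfaces a typo such as "crawler" before any request is made.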
Custom Parameters
Pass a params dictionary to customize the request. When params is omitted, the loader uses return_format: "markdown" and metadata: True. See the full list of parameters in the API reference.
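One possible way to add custom parameters while keeping the documented defaults is to merge them explicitly, as sketched below. This assumes a caller-supplied params dictionary is used as given rather than merged with the defaults. It also assumes limit is a supported request parameter, so confirm it against the API reference. The helper names build_params and load_with_params are hypothetical.

```python
# Defaults stated in the text above; repeated so overrides do not drop them.
DEFAULT_PARAMS = {"return_format": "markdown", "metadata": True}


def build_params(**overrides):
    """Return a new dict: the documented defaults with caller overrides on top."""
    return {**DEFAULT_PARAMS, **overrides}


def load_with_params(url, api_key, mode="scrape", **overrides):
    """Hypothetical wrapper that passes the merged params to SpiderLoader."""
    # Imported lazily so build_params can be used without langchain_community.
    from langchain_community.document_loaders import SpiderLoader

    loader = SpiderLoader(
        api_key=api_key,
        url=url,
        mode=mode,
        params=build_params(**overrides),
    )
    return loader.load()


# Example usage (needs a valid API key and network access).
# `limit` is an assumed parameter name, check the API reference:
# docs = load_with_params("https://spider.cloud", "YOUR_API_KEY", mode="crawl", limit=5)
```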