LangChain Integration

Use LangChain with Spider as a document loader. LangChain is a framework designed for building applications powered by large language models.

Install LangChain and the Spider client

Install LangChain first by following the LangChain installation instructions.

Then install the Spider client using pip:

pip install spider_client

Create an API key, then store it in the SPIDER_API_KEY environment variable. The key authenticates your requests to the Spider API. If no API key is passed to the loader, the loader looks for SPIDER_API_KEY in the environment.
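The lookup order described above (explicit key first, then the environment) can be sketched in a few lines. This is only an illustration of the behavior, not the loader's actual implementation; the helper name resolve_api_key is hypothetical.

```python
import os


def resolve_api_key(explicit_key=None):
    """Return the explicit key if given, otherwise fall back to SPIDER_API_KEY.

    Hypothetical helper: it mirrors the documented lookup order only.
    """
    key = explicit_key or os.environ.get("SPIDER_API_KEY")
    if not key:
        raise ValueError(
            "No API key found: pass one explicitly or set SPIDER_API_KEY."
        )
    return key
```

Checking for the key this way before constructing the loader gives a clear error early, instead of a failure on the first request.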

Usage

Get started by looking at the code example below.

Scrape example using LangChain in Python

from langchain_community.document_loaders import SpiderLoader

loader = SpiderLoader(
    api_key="YOUR_API_KEY",
    url="https://spider.cloud",
    mode="scrape",
    # params={} # Optional parameters
)

data = loader.load()
print(data)

Modes

Choose between scraping and crawling to fetch content.

scrape: Default mode that scrapes a single URL
crawl: Crawls a website and follows links to scrape all pages
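Switching modes only changes the mode argument. The sketch below assumes SPIDER_API_KEY is set in the environment so that api_key can be omitted; the helper names loader_kwargs and load_documents are hypothetical and exist only for this example.

```python
VALID_MODES = ("scrape", "crawl")


def loader_kwargs(url, mode="scrape", params=None):
    """Build keyword arguments for SpiderLoader after validating the mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    kwargs = {"url": url, "mode": mode}
    if params is not None:
        kwargs["params"] = params
    return kwargs


def load_documents(url, mode="scrape", params=None):
    """Load documents from a single page (scrape) or a whole site (crawl)."""
    # Imported lazily so the helper above works without LangChain installed.
    from langchain_community.document_loaders import SpiderLoader

    return SpiderLoader(**loader_kwargs(url, mode, params)).load()
```

Calling load_documents("https://spider.cloud", mode="crawl") would follow links across the site, while the default call scrapes only the given URL.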

Custom Parameters

The params argument is a dictionary passed to the loader. By default, return_format is set to markdown and metadata is set to True. See the Spider API documentation for additional parameters.
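One way to add extra parameters while keeping the documented defaults is to build the dictionary explicitly. This is a sketch under assumptions: the source does not say whether a custom params dictionary is merged with the defaults, so the defaults are restated here, and limit is used only as an example of an extra parameter that should be checked against the API documentation. The helper name build_params is hypothetical.

```python
# Defaults as stated in this guide.
DEFAULT_PARAMS = {"return_format": "markdown", "metadata": True}


def build_params(**overrides):
    """Return the default params with any overrides applied on top."""
    return {**DEFAULT_PARAMS, **overrides}


# "limit" is an assumed parameter name, shown for illustration only.
params = build_params(limit=5)
```

The resulting dictionary can then be passed to the loader as params=params.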