LlamaIndex Integration

Use Spider as a LlamaIndex web reader to load crawled pages directly into your index. Python only.

Install

Install LlamaIndex, its web readers package, and the Spider client:

pip install llama-index llama-index-readers-web spider_client

Set your API key as SPIDER_API_KEY in your environment.
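
As a minimal sketch, the key can be read from the environment at runtime instead of being hardcoded in source. The helper name `get_spider_api_key` is hypothetical and is not part of the Spider client or LlamaIndex.

```python
import os


def get_spider_api_key(env_var: str = "SPIDER_API_KEY") -> str:
    """Return the API key stored in the environment, or raise a clear error.

    Hypothetical helper for illustration only.
    """
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"{env_var} is not set; export it before running.")
    return api_key
```

The returned string can then be passed as `api_key=` when constructing `SpiderWebReader`, in place of the `"YOUR_API_KEY"` placeholder.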

Usage

Load web pages into LlamaIndex documents with SpiderWebReader.

Scrape a single page

from llama_index.readers.web import SpiderWebReader

spider_reader = SpiderWebReader(
    api_key="YOUR_API_KEY",
    mode="scrape",
    # params={},  # Optional parameters
)

documents = spider_reader.load_data(url="https://spider.cloud")
print(documents)

Crawl multiple pages

from llama_index.readers.web import SpiderWebReader

spider_reader = SpiderWebReader(
    api_key="YOUR_API_KEY",
    mode="crawl",
    # params={},  # Optional parameters
)

documents = spider_reader.load_data(url="https://spider.cloud")
print(documents)
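
A crawl can return many documents, and some pages may come back with no usable text. The sketch below drops those before indexing. It assumes each item exposes a `text` attribute, as LlamaIndex `Document` objects do. The helper name `non_empty_documents` is hypothetical.

```python
from typing import Any, Iterable, List


def non_empty_documents(documents: Iterable[Any]) -> List[Any]:
    """Keep only documents whose text has non-whitespace characters.

    Hypothetical helper; relies only on a `text` attribute being present.
    """
    return [
        doc
        for doc in documents
        if (getattr(doc, "text", "") or "").strip()
    ]
```

The filtered list can be passed on to an index constructor such as `VectorStoreIndex.from_documents`.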

Pass additional parameters via the params dictionary. See the API reference for all options.
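
A minimal sketch of assembling the reader arguments, including `params`, in one place. The helper `build_reader_kwargs` is hypothetical, and the `limit` key is an assumed option name used only for illustration; confirm the actual option names in the API reference.

```python
from typing import Any, Dict, Optional

VALID_MODES = ("scrape", "crawl")  # the two modes used on this page


def build_reader_kwargs(
    api_key: str,
    mode: str = "scrape",
    params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for SpiderWebReader (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    kwargs: Dict[str, Any] = {"api_key": api_key, "mode": mode}
    if params:
        # Copy so later changes to the caller's dict do not leak in.
        kwargs["params"] = dict(params)
    return kwargs


# "limit" is an assumed option name, not confirmed by this page.
reader_kwargs = build_reader_kwargs(
    "YOUR_API_KEY", mode="crawl", params={"limit": 5}
)
# spider_reader = SpiderWebReader(**reader_kwargs)
```

Validating `mode` up front surfaces a typo before any request is made.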