LlamaIndex Integration
Use Spider as a LlamaIndex web reader to load crawled pages directly into your index. Python only.
Install
Install the LlamaIndex web readers package and the Spider client:
pip install llama-index-readers-web spider_client
Set your API key as SPIDER_API_KEY in your environment.
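A missing or empty key is easier to catch before any request is made. The following is a minimal sketch of reading SPIDER_API_KEY from the environment and failing early; the helper name get_spider_api_key is hypothetical and not part of the Spider client or LlamaIndex.

```python
import os


def get_spider_api_key(env=os.environ):
    """Return the Spider API key from the environment (hypothetical helper).

    Raises RuntimeError if SPIDER_API_KEY is unset or blank, so the problem
    surfaces before a reader is constructed.
    """
    key = env.get("SPIDER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("SPIDER_API_KEY is not set in the environment")
    return key
```

The returned value can be passed as the api_key argument in the examples under Usage instead of a hard-coded string.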
Usage
Load web pages into LlamaIndex documents with SpiderWebReader.
Scrape a single page
from llama_index.readers.web import SpiderWebReader

# Scrape mode fetches only the given URL.
spider_reader = SpiderWebReader(
    api_key="YOUR_API_KEY",  # replace with your Spider API key
    mode="scrape",
    # params={},  # optional parameters
)
documents = spider_reader.load_data(url="https://spider.cloud")
print(documents)
Crawl multiple pages
from llama_index.readers.web import SpiderWebReader

# Crawl mode follows links starting from the given URL.
spider_reader = SpiderWebReader(
    api_key="YOUR_API_KEY",  # replace with your Spider API key
    mode="crawl",
    # params={},  # optional parameters
)
documents = spider_reader.load_data(url="https://spider.cloud")
print(documents)
Pass additional parameters via the params dictionary. See the API reference for all options.
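As an illustration of the params dictionary, the sketch below assembles reader arguments with a page limit and an output format. It assumes the Spider API accepts limit and return_format keys, which should be checked against the API reference. The helpers build_reader_kwargs and load_documents are hypothetical names introduced here, not part of either library.

```python
import os


def build_reader_kwargs(mode="crawl", limit=5, return_format="markdown"):
    """Assemble keyword arguments for SpiderWebReader (hypothetical helper).

    Assumption: "limit" and "return_format" are valid Spider API parameters;
    see the API reference for the authoritative list.
    """
    if mode not in ("scrape", "crawl"):
        raise ValueError(f"unsupported mode: {mode!r}")
    return {
        "api_key": os.environ.get("SPIDER_API_KEY"),
        "mode": mode,
        "params": {"limit": limit, "return_format": return_format},
    }


def load_documents(url, **options):
    """Load documents from url using SpiderWebReader with the options above."""
    # Imported here so build_reader_kwargs works without LlamaIndex installed.
    from llama_index.readers.web import SpiderWebReader

    reader = SpiderWebReader(**build_reader_kwargs(**options))
    return reader.load_data(url=url)


# Example call (requires network access and a valid API key):
# documents = load_documents("https://spider.cloud", mode="crawl", limit=5)
```

Keeping the argument assembly separate from the network call makes the parameter choices easy to inspect or log before a crawl consumes any API quota.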