LlamaIndex Integration
Use Spider as a LlamaIndex web reader to load crawled pages directly into your index. Python only.
Install
Install LlamaIndex and the Spider client:
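# Install LlamaIndex and its web readers package (provides SpiderWebReader)
pip install llama-index llama-index-readers-web
# Install the Spider client
pip install spider_client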
Set your API key as SPIDER_API_KEY in your environment.
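The examples in this guide pass the key explicitly through api_key, so one way to connect the environment variable to the reader is to read it yourself. The following is a minimal sketch; the helper name get_spider_api_key is hypothetical and not part of Spider or LlamaIndex.

```python
import os


def get_spider_api_key(env_var: str = "SPIDER_API_KEY") -> str:
    """Return the Spider API key from the environment (hypothetical helper).

    Raises RuntimeError with a readable message when the variable is
    missing or empty, so the failure shows up before any request is made.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before creating the reader"
        )
    return key
```

The returned value can then be passed as api_key=get_spider_api_key() when constructing SpiderWebReader, in place of the "YOUR_API_KEY" placeholder.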
Usage
Load web pages into LlamaIndex documents with SpiderWebReader.
Scrape a single page
from llama_index.readers.web import SpiderWebReader

spider_reader = SpiderWebReader(
    api_key="YOUR_API_KEY",
    mode="scrape",
    # params={}  # Optional parameters
)

documents = spider_reader.load_data(url="https://spider.cloud")
print(documents)
Crawl multiple pages
from llama_index.readers.web import SpiderWebReader

spider_reader = SpiderWebReader(
    api_key="YOUR_API_KEY",
    mode="crawl",
    # params={}  # Optional parameters
)

documents = spider_reader.load_data(url="https://spider.cloud")
print(documents)
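Printing the full list of documents can be noisy after a crawl. The sketch below summarizes each result instead. It assumes each returned item exposes a text string and a metadata dictionary, as LlamaIndex Document objects generally do; the helper summarize_documents is hypothetical, and the metadata keys Spider fills in are not specified here.

```python
def summarize_documents(documents, preview_chars=80):
    """Build a short summary row per document (hypothetical helper).

    Reads only the `text` and `metadata` attributes and falls back to
    empty values when either attribute is missing.
    """
    rows = []
    for doc in documents:
        text = getattr(doc, "text", "") or ""
        metadata = getattr(doc, "metadata", None) or {}
        rows.append(
            {
                "chars": len(text),                # total length of the page text
                "preview": text[:preview_chars],   # first characters only
                "metadata_keys": sorted(metadata),  # which metadata fields exist
            }
        )
    return rows
```

Calling summarize_documents(documents) on the result of load_data gives one row per crawled page, which is easier to scan than the raw printout.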
Pass additional parameters via the params dictionary. See the API reference for all options.
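As an illustration of the params dictionary, the sketch below builds one with two options. The option names limit and return_format are assumptions used for illustration and should be checked against the API reference; build_crawl_params is a hypothetical helper, not part of either library.

```python
def build_crawl_params(limit=5, return_format="markdown"):
    """Assemble a params dictionary for SpiderWebReader (hypothetical helper).

    `limit` and `return_format` are assumed option names; confirm them
    in the API reference before relying on them.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return {"limit": limit, "return_format": return_format}


# Intended use (needs the packages installed and a valid API key):
# spider_reader = SpiderWebReader(
#     api_key="YOUR_API_KEY",
#     mode="crawl",
#     params=build_crawl_params(limit=10),
# )
```

Keeping the dictionary in one place makes it easy to reuse the same options across scrape and crawl readers.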