LlamaIndex Integration
LlamaIndex is a framework that connects language models to external data sources, enabling efficient data retrieval and querying for applications like chatbots and search systems. This integration is only available for the Python SDK.
Install Spider client
Install LlamaIndex first (see the LlamaIndex installation instructions).
Then install the Spider client:
pip install spider-client
Create an API key, then store it as an environment variable. This key lets you access the Spider API securely. If no API key is passed explicitly, the reader looks for SPIDER_API_KEY in the environment.
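The lookup order described above can be sketched as follows. This is a minimal illustration, not the Spider client's actual code: the helper name resolve_spider_api_key is hypothetical, and the only name taken from this guide is the SPIDER_API_KEY variable.

```python
import os
from typing import Optional


def resolve_spider_api_key(api_key: Optional[str] = None) -> str:
    """Return the explicit key if given, otherwise fall back to SPIDER_API_KEY.

    Hypothetical helper for illustration only; it is not part of spider-client.
    """
    key = api_key or os.environ.get("SPIDER_API_KEY")
    if not key:
        # Fail early with a clear message instead of sending an unauthenticated request.
        raise ValueError(
            "No API key provided and SPIDER_API_KEY is not set in the environment."
        )
    return key
```

Calling resolve_spider_api_key() before constructing the reader surfaces a missing key at startup rather than on the first request.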
Usage
The examples below show how to scrape a single page and how to crawl multiple pages.
Scrape a single page
from llama_index.readers.web import SpiderWebReader

# "scrape" mode loads only the page at the given URL.
spider_reader = SpiderWebReader(
    api_key="YOUR_API_KEY",  # omit to fall back to SPIDER_API_KEY
    mode="scrape",
    # params={}  # Optional parameters
)

documents = spider_reader.load_data(url="https://spider.cloud")
print(documents)
Crawl multiple pages
from llama_index.readers.web import SpiderWebReader

# "crawl" mode loads multiple pages, starting from the given URL.
spider_reader = SpiderWebReader(
    api_key="YOUR_API_KEY",  # omit to fall back to SPIDER_API_KEY
    mode="crawl",
    # params={}  # Optional parameters
)

documents = spider_reader.load_data(url="https://spider.cloud")
print(documents)
Optional parameters are passed as a dictionary through the params argument. See our API documentation for the additional parameters you can use.
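As a rough sketch of passing such a dictionary, the snippet below assembles the keyword arguments for the reader before it is constructed. The helper build_reader_kwargs is hypothetical, and the "limit" key is used only as an assumed example of an API parameter; check the API documentation for the names that are actually supported.

```python
from typing import Any, Dict, Optional

# The two modes shown in this guide.
KNOWN_MODES = {"scrape", "crawl"}


def build_reader_kwargs(
    api_key: str,
    mode: str,
    params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for SpiderWebReader (hypothetical helper)."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"Unexpected mode {mode!r}; expected one of {sorted(KNOWN_MODES)}")
    kwargs: Dict[str, Any] = {"api_key": api_key, "mode": mode}
    if params:
        # Copy so later changes to the caller's dict do not leak into the reader.
        kwargs["params"] = dict(params)
    return kwargs


# "limit" is an assumed parameter name, used here only for illustration.
crawl_kwargs = build_reader_kwargs("YOUR_API_KEY", "crawl", {"limit": 5})
```

The result can then be unpacked into the reader, for example SpiderWebReader(**crawl_kwargs), which keeps the optional parameters in one place when several readers share the same settings.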