2 min read
Getting started with Proxy Mode
Contents
- What is the proxy mode?
- Spider Proxy Features
- HTTP Proxies built to scale
- Proxy Usage
- Request Cost Structure
What is the proxy mode?
Spider also offers a proxy front-end to the API, which can make integration with third-party tools easier. Proxy mode only changes the way you access Spider; the Spider API then handles requests just like any standard request.
Request cost, return codes, and default parameters are the same as for a standard, non-proxy request.
We recommend disabling JavaScript rendering in proxy mode, which is enabled by default. The following credentials and configuration are used to access proxy mode:
- HTTP address: proxy.spider.cloud:8888
- HTTPS address: proxy.spider.cloud:8889
- Username: YOUR-API-KEY
- Password: PARAMETERS
Important: Replace PARAMETERS with our supported API parameters. If you don't know what to use, you can begin with render_js=False. If you want to use multiple parameters, use & as a delimiter, for example: render_js=False&premium_proxy=True.
As an alternative, you can use a proxy URL configuration like the following:
{
  "url": "http://proxy.spider.cloud:8888",
  "username": "YOUR-API-KEY",
  "password": "render_js=False&premium_proxy=True"
}
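For instance, here is a minimal sketch using the Python requests library that maps those three fields onto a standard proxies dictionary (the target URL is just a placeholder):
import requests

# Values from the configuration above
username = "YOUR-API-KEY"                        # your Spider API key
password = "render_js=False&premium_proxy=True"  # API parameters passed as the password

proxies = {
    "http": f"http://{username}:{password}@proxy.spider.cloud:8888",
    "https": f"https://{username}:{password}@proxy.spider.cloud:8889",
}

response = requests.get("https://example.com", proxies=proxies)
print(response.status_code)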
Spider Proxy Features
- Premium proxy rotations: no more headaches dealing with IP blocks
- Cost-effective: $1 per GB.
- Full concurrency: crawl thousands of pages in seconds, yes that isn’t a typo!
- Caching: repeated page crawls are cached, giving your crawls a speed boost
- Optimal response format: Get clean and formatted markdown, HTML, or text for LLM and AI agents
- Anti-bot detection avoidance: measures that further lower the chances of crawls being blocked
- And many more
HTTP Proxies built to scale
At the time of writing, both HTTP and HTTPS proxying are available for leveraging Spider to gather data.
Proxy Usage
Getting started with the proxy is straightforward: once you have your secret key, you can access our instance directly.
import requests, os

# Proxy configuration
proxies = {
    'http': f"http://{os.getenv('SPIDER_API_KEY')}:request=Raw&premium_proxy=False@proxy.spider.cloud:8888",
    'https': f"https://{os.getenv('SPIDER_API_KEY')}:request=Raw&premium_proxy=False@proxy.spider.cloud:8889"
}

# Function to make a request through the proxy
def get_via_proxy(url):
    try:
        response = requests.get(url, proxies=proxies)
        response.raise_for_status()
        print('Response HTTP Status Code: ', response.status_code)
        print('Response HTTP Response Body: ', response.content)
        return response.text
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")
        return None

# Example usage
if __name__ == "__main__":
    get_via_proxy("https://www.choosealicense.com")
    get_via_proxy("https://www.choosealicense.com/community")
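Because the proxy supports full concurrency, you can also fan requests out in parallel. The sketch below reuses get_via_proxy from the example above with a thread pool; the URL list is purely illustrative:
from concurrent.futures import ThreadPoolExecutor

# Illustrative list of pages to fetch in parallel through the proxy
urls = [
    "https://www.choosealicense.com",
    "https://www.choosealicense.com/community",
    "https://www.choosealicense.com/licenses",
]

# Fan the requests out across a small thread pool
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(get_via_proxy, urls))

print(f"Fetched {sum(r is not None for r in results)} of {len(urls)} pages")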
Request Cost Structure
| Request Type | Cost (Credits) |
|---|---|
| Base request (HTTP) | 1 credit |
| Premium proxies | 2 credits |
The premium proxies used for Proxy Mode are a bit different from the proxies used for generic crawling: the generic crawling proxy has AI and anti-bot capabilities built in.
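As a quick, purely illustrative example of how the credits in the table add up (this is not billing logic from the API):
# Illustrative credit math based on the table above
BASE_COST = 1     # credits per base HTTP request
PREMIUM_COST = 2  # credits per request using premium proxies

pages = 10_000
print(pages * BASE_COST)     # 10000 credits for base requests
print(pages * PREMIUM_COST)  # 20000 credits with premium proxies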
Coming soon
Some of the parameters, as well as SOCKS5 support, are not available at the time of writing:
- JS rendering.
- Transforming to markdown and other formats.
- Readability.