2 min read
Getting started with Proxy Mode
Contents
- What is the proxy mode?
- Spider Proxy Features
- HTTP Proxies built to scale
- Proxy Usage
- Request Cost Structure
What is the proxy mode?
Spider also offers a proxy front-end to the API, which can make integration with third-party tools easier. Proxy mode only changes the way you access Spider; the Spider API then handles requests just like any standard request.
Request cost, return codes, and default parameters are the same as for a standard, non-proxy request.
We recommend disabling JavaScript rendering in proxy mode, which is enabled by default. The following credentials and configuration are used to access proxy mode:
- HTTP address: proxy.spider.cloud:8888
- HTTPS address: proxy.spider.cloud:8889
- Username: YOUR-API-KEY
- Password: PARAMETERS
Important: Replace PARAMETERS with our supported API parameters. If you don't know what to use, you can begin with render_js=False. If you want to use multiple parameters, use & as a delimiter, for example: render_js=False&premium_proxy=True.
As an alternative, you can use a proxy URL configuration like the following:
{
  "url": "http://proxy.spider.cloud:8888",
  "username": "YOUR-API-KEY",
  "password": "render_js=False&premium_proxy=True"
}
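For instance, here is a minimal sketch using the Python requests library that maps those three fields onto a standard proxies dictionary (the target URL is just a placeholder):
import requests

# Values from the configuration above
username = "YOUR-API-KEY"                        # your Spider API key
password = "render_js=False&premium_proxy=True"  # API parameters passed as the password

proxies = {
    "http": f"http://{username}:{password}@proxy.spider.cloud:8888",
    "https": f"https://{username}:{password}@proxy.spider.cloud:8889",
}

response = requests.get("https://example.com", proxies=proxies)
print(response.status_code)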
Spider Proxy Features
- Premium proxy rotations: no more headaches dealing with IP blocks
- Cost-effective: $1 per GB.
- Full concurrency: crawl thousands of pages in seconds, yes that isn’t a typo!
- Caching: repeated page crawls are cached, giving your crawls a speed boost
- Optimal response format: Get clean and formatted markdown, HTML, or text for LLM and AI agents
- Anti-bot detection avoidance: measures that further lower the chances of crawls being blocked
- And many more
HTTP Proxies built to scale
At the time of writing, both HTTP and HTTPS proxying are available for leveraging Spider to gather data.
Proxy Usage
Getting started with the proxy is straightforward: once you have your secret key, you can access our instance directly.
import requests, os

# Proxy configuration
proxies = {
    'http': f"http://{os.getenv('SPIDER_API_KEY')}:request=Raw&premium_proxy=False@proxy.spider.cloud:8888",
    'https': f"https://{os.getenv('SPIDER_API_KEY')}:request=Raw&premium_proxy=False@proxy.spider.cloud:8889"
}

# Function to make a request through the proxy
def get_via_proxy(url):
    try:
        response = requests.get(url, proxies=proxies)
        response.raise_for_status()
        print('Response HTTP Status Code: ', response.status_code)
        print('Response HTTP Response Body: ', response.content)
        return response.text
    except requests.exceptions.RequestException as e:
        print(f"Error: {e}")
        return None

# Example usage
if __name__ == "__main__":
    get_via_proxy("https://www.choosealicense.com")
    get_via_proxy("https://www.choosealicense.com/community")
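Because the proxy supports full concurrency, you can also fan requests out in parallel. The sketch below reuses get_via_proxy from the example above with a thread pool; the URL list is purely illustrative:
from concurrent.futures import ThreadPoolExecutor

# Illustrative list of pages to fetch in parallel through the proxy
urls = [
    "https://www.choosealicense.com",
    "https://www.choosealicense.com/community",
    "https://www.choosealicense.com/licenses",
]

# Fan the requests out across a small thread pool
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(get_via_proxy, urls))

print(f"Fetched {sum(r is not None for r in results)} of {len(urls)} pages")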
Request Cost Structure
| Request Type | Cost (Credits) |
|---|---|
| Base request (HTTP) | 1 credit |
| Premium proxies | 2 credits |
The premium proxies used for Proxy Mode are a bit different from the proxies used for generic crawling: the generic crawling proxy has AI and anti-bot capabilities built in.
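As a quick, purely illustrative example of how the credits in the table add up (this is not billing logic from the API):
# Illustrative credit math based on the table above
BASE_COST = 1     # credits per base HTTP request
PREMIUM_COST = 2  # credits per request using premium proxies

pages = 10_000
print(pages * BASE_COST)     # 10000 credits for base requests
print(pages * PREMIUM_COST)  # 20000 credits with premium proxies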
Coming soon
Some of the parameters, as well as SOCKS5 support, are not available at the time of writing:
- JS rendering.
- Transforming to markdown and other formats.
- Readability.