Data.gov Scraper
Extract open government datasets, metadata, agency publishers, and download links from the Data.gov catalog. Built on spider-browser.
- target: data.gov
- success rate: 99.9%
- latency: ~4ms
Extract data in minutes.
import { SpiderBrowser } from "spider-browser";

const spider = new SpiderBrowser({
  apiKey: process.env.SPIDER_API_KEY!,
});

await spider.connect();
const page = spider.page!;

// Search the catalog for "climate", sorted by score (descending), then by name.
await page.goto("https://catalog.data.gov/dataset?q=climate&sort=score+desc%2C+name+asc");
await page.content();

// Runs inside the page: collect title, publisher, description, and formats per result.
const data = await page.evaluate(`(() => {
  const datasets = [];
  document.querySelectorAll(".dataset-item").forEach(el => {
    const title = el.querySelector(".dataset-heading a")?.textContent?.trim();
    const org = el.querySelector(".dataset-organization")?.textContent?.trim();
    const description = el.querySelector(".dataset-notes")?.textContent?.trim();
    const formats = [];
    el.querySelectorAll(".dataset-resources .label").forEach(f => formats.push(f.textContent?.trim()));
    // Skip items without a title; truncate descriptions to 200 characters.
    if (title) datasets.push({ title, org, description: description?.slice(0, 200), formats });
  });
  // Return a JSON string with the number of results on the page and the first 10 datasets.
  return JSON.stringify({ total: datasets.length, datasets: datasets.slice(0, 10) });
})()`);

console.log(JSON.parse(data));
await spider.close();

One endpoint for data.gov.
Structured JSON from data.gov with a single POST. AI-resolved selectors, cached on the first call.
curl -X POST https://api.spider.cloud/fetch/data.gov/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"return_format": "json"}'

import requests

resp = requests.post(
    "https://api.spider.cloud/fetch/data.gov/",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={"return_format": "json"},
)
print(resp.json())

const resp = await fetch("https://api.spider.cloud/fetch/data.gov/", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ return_format: "json" }),
});
const data = await resp.json();
console.log(data);

Fields you can pull.
Public records
Extract filings, regulations, and public data from data.gov.
Document extraction
Clean extraction of legal documents, PDFs, and structured public records.
Bulk processing
Process thousands of filings and regulatory documents concurrently.
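This page does not say how concurrent processing is handled on the service side. If you issue many requests yourself, one common client-side approach is to cap how many are in flight at once. The following is a minimal sketch of that idea; `mapWithConcurrency` is a hypothetical helper written for this example and is not part of spider-browser or the API.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results come back in the same order as the input items.
// If any worker call rejects, the returned promise rejects with that error.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims one item at a time until none are left.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  const runners = Array.from({ length: runnerCount }, () => runner());
  await Promise.all(runners);
  return results;
}

// Usage sketch (`fetchOne` stands in for whatever per-item request you make):
//   const pages = await mapWithConcurrency(urls, 5, (url) => fetchOne(url));
```

A small fixed limit keeps memory use and request rate predictable. The right value depends on your plan's rate limits, which are not stated here.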
More Government & Legal scrapers.
SEC EDGAR Government Scraper
Extract SEC filings, company financial reports, insider trading data, and regulatory submissions from EDGAR.
USPTO Scraper
Extract patent applications, trademark filings, examiner data, and prosecution history from USPTO.
Congress.gov Scraper
Extract bill text, voting records, committee reports, and legislative history from Congress.gov.
Start scraping data.gov.
Grab an API key and call the endpoint above. The first request resolves the config; every request after that hits the cache.
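Calling the endpoint with a key read from the environment could be wrapped roughly as follows. This is a sketch under assumptions: the endpoint URL, headers, and request body are taken from the examples on this page, while `buildDataGovRequest` and `fetchDataGov` are hypothetical names. The response shape is not documented here, so it is left as `unknown`.

```typescript
// Endpoint as shown in the examples on this page.
const DATA_GOV_ENDPOINT = "https://api.spider.cloud/fetch/data.gov/";

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper: validates the key and assembles the POST request.
function buildDataGovRequest(apiKey: string | undefined): PreparedRequest {
  if (!apiKey || apiKey.trim() === "") {
    throw new Error("Missing API key: set SPIDER_API_KEY before calling the endpoint");
  }
  return {
    url: DATA_GOV_ENDPOINT,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey.trim()}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ return_format: "json" }),
    },
  };
}

// Sends the request and throws on non-2xx responses instead of trying to
// parse an error body as data. Assumes a runtime with a global `fetch`.
async function fetchDataGov(apiKey: string | undefined): Promise<unknown> {
  const { url, init } = buildDataGovRequest(apiKey);
  const resp = await fetch(url, init);
  if (!resp.ok) {
    throw new Error(`Request failed: ${resp.status} ${resp.statusText}`);
  }
  return resp.json();
}

// Usage sketch: const data = await fetchDataGov(process.env.SPIDER_API_KEY);
```

Keeping request construction separate from the network call makes the key check and headers easy to verify without sending a request.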