Data.gov Scraper
Extract open government datasets, metadata, agency publishers, and download links from the Data.gov catalog. Powered by spider-browser.
Extract data in minutes
import { SpiderBrowser } from "spider-browser";
// Read the API key from the environment rather than hard-coding it.
const spider = new SpiderBrowser({
  apiKey: process.env.SPIDER_API_KEY!,
});
await spider.connect();
const page = spider.page!;
// Search the catalog for "climate", sorted by relevance and then by name.
await page.goto("https://catalog.data.gov/dataset?q=climate&sort=score+desc%2C+name+asc");
// Fetch the page content before querying the DOM.
await page.content();
// Runs in the page context: collects title, publisher, description (first 200
// characters), and resource formats for each result, then returns the number of
// results found on the page plus the first 10 entries as a JSON string.
const data = await page.evaluate(`(() => {
  const datasets = [];
  document.querySelectorAll(".dataset-item").forEach(el => {
    const title = el.querySelector(".dataset-heading a")?.textContent?.trim();
    const org = el.querySelector(".dataset-organization")?.textContent?.trim();
    const description = el.querySelector(".dataset-notes")?.textContent?.trim();
    const formats = [];
    el.querySelectorAll(".dataset-resources .label").forEach(f => formats.push(f.textContent?.trim()));
    if (title) datasets.push({ title, org, description: description?.slice(0, 200), formats });
  });
  return JSON.stringify({ total: datasets.length, datasets: datasets.slice(0, 10) });
})()`);
console.log(JSON.parse(data));
await spider.close();
Structured data endpoint
Extract structured JSON from data.gov with a single POST request. AI-configured selectors, cached for fast repeat calls.
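As a rough sketch of the single POST request described above, the TypeScript below separates building the request from sending it and adds a basic HTTP status check. The endpoint URL, headers, and `return_format` payload mirror the examples on this page; the helper names (`buildStructuredRequest`, `fetchStructured`), the minimal types, and the error handling are illustrative assumptions, not part of any official client.

```typescript
// Minimal shapes so the sketch does not depend on DOM or Node typings.
interface RequestOptions {
  method: string;
  headers: Record<string, string>;
  body: string;
}
interface MinimalResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}
type FetchLike = (url: string, init: RequestOptions) => Promise<MinimalResponse>;

// Base URL taken from the examples on this page.
const SPIDER_FETCH_BASE = "https://api.spider.cloud/fetch";

// Hypothetical helper: builds the URL and request options for a domain.
function buildStructuredRequest(
  domain: string,
  apiKey: string,
): { url: string; init: RequestOptions } {
  if (!apiKey) throw new Error("Missing API key");
  return {
    url: `${SPIDER_FETCH_BASE}/${domain}/`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ return_format: "json" }),
    },
  };
}

// Hypothetical helper: sends the request and rejects on non-2xx responses.
async function fetchStructured(
  domain: string,
  apiKey: string,
  fetchImpl: FetchLike,
): Promise<unknown> {
  const { url, init } = buildStructuredRequest(domain, apiKey);
  const resp = await fetchImpl(url, init);
  if (!resp.ok) throw new Error(`Request failed with HTTP ${resp.status}`);
  return resp.json();
}
```

Calling `fetchStructured("data.gov", apiKey, fetch)` with the runtime's built-in `fetch` would issue the same request as the examples that follow. Passing the fetch function in as an argument is a design choice that keeps the helper easy to exercise offline with a stub.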
curl -X POST https://api.spider.cloud/fetch/data.gov/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"return_format": "json"}'

import requests
resp = requests.post(
    "https://api.spider.cloud/fetch/data.gov/",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={"return_format": "json"},
)
print(resp.json())

const resp = await fetch("https://api.spider.cloud/fetch/data.gov/", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ return_format: "json" }),
});
const data = await resp.json();
console.log(data);
Data you can extract
Public records
Extract filings, regulations, and public data from data.gov.
Document extraction
Clean extraction of legal documents, PDFs, and structured public records.
Bulk processing
Process thousands of filings and regulatory documents concurrently.
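On the client side, processing many documents concurrently usually means capping how many requests are in flight at once. The sketch below shows one common way to do that in TypeScript with a small worker pool. The helper name `mapWithConcurrency` and its parameters are hypothetical, and nothing here describes how the service itself schedules work.

```typescript
// Hypothetical helper: applies an async worker to every item while keeping
// at most `limit` calls in flight. Results keep the order of the input.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  const results: R[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item, shared by all runners

  // Each runner claims the next index and processes it until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  // Start up to `limit` runners and wait for all of them to finish.
  const runners = Array.from(
    { length: Math.min(limit, items.length) },
    () => runner(),
  );
  await Promise.all(runners);
  return results;
}
```

A call such as `mapWithConcurrency(urls, 5, (url) => scrapeOne(url))`, where `scrapeOne` stands in for whatever per-document request is being made, would process the whole list with no more than five requests running at a time. If any worker rejects, the returned promise rejects as well, so callers that need partial results should catch errors inside the worker.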
More Government & Legal scrapers
Extract SEC filings, company financial reports, insider trading data, and regulatory submissions from EDGAR.
Extract patent applications, trademark filings, examiner data, and prosecution history from USPTO.
Extract bill text, voting records, committee reports, and legislative history from Congress.gov.
Start scraping data.gov
Get your API key and start extracting data in minutes.