Health

PubMed Health Scraper

Extract biomedical literature citations, abstracts, author affiliations, and journal metadata from PubMed. Powered by spider-browser.

Target: pubmed.ncbi.nlm.nih.gov
Success rate: 99.5%
Latency: ~4 ms
Quick Start

Extract data in minutes

pubmed-health-scraper.ts
import { SpiderBrowser } from "spider-browser";

// Create a client that authenticates with the API key from the environment.
const spider = new SpiderBrowser({
  apiKey: process.env.SPIDER_API_KEY!,
});

// Connect, then load a PubMed search results page.
await spider.connect();
const page = spider.page!;
await page.goto("https://pubmed.ncbi.nlm.nih.gov/?term=diabetes+treatment");

// Runs inside the page: collect title, authors, journal citation, and PMID per result.
const data = await page.evaluate(`(() => {
  const articles = [];
  document.querySelectorAll(".docsum-content").forEach(el => {
    const title = el.querySelector(".docsum-title")?.textContent?.trim();
    const authors = el.querySelector(".docsum-authors")?.textContent?.trim();
    const journal = el.querySelector(".docsum-journal-citation")?.textContent?.trim();
    const pmid = el.closest("[data-docid]")?.getAttribute("data-docid");
    // Skip entries that have no title.
    if (title) articles.push({ title, authors, journal, pmid });
  });
  // total counts every match on the page; only the first 10 articles are returned.
  return JSON.stringify({ total: articles.length, articles: articles.slice(0, 10) });
})()`);

// The page returns a JSON string, so parse it before printing.
console.log(JSON.parse(data));
await spider.close();
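
The quick start hands back a JSON string, so `JSON.parse` yields an untyped value. A minimal sketch of validating that payload before using it is shown here. The `ArticleSummary` and `SearchResult` types and the `parseSearchResult` helper are illustrative assumptions that mirror the object built inside `page.evaluate`; they are not part of spider-browser.

```typescript
// Hypothetical types mirroring the object assembled inside page.evaluate.
interface ArticleSummary {
  title: string;
  authors?: string;
  journal?: string;
  pmid?: string;
}

interface SearchResult {
  total: number;
  articles: ArticleSummary[];
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// JSON.stringify drops undefined fields, so optional values may simply be missing.
function optionalString(v: unknown): string | undefined {
  return typeof v === "string" ? v : undefined;
}

// Parse the JSON string and keep only entries that carry a string title.
function parseSearchResult(json: string): SearchResult {
  const raw: unknown = JSON.parse(json);
  if (!isRecord(raw)) throw new Error("Payload is not an object");

  const total = raw["total"];
  const list = raw["articles"];
  if (typeof total !== "number" || !Array.isArray(list)) {
    throw new Error("Payload is missing 'total' or 'articles'");
  }

  const articles: ArticleSummary[] = [];
  for (const item of list as unknown[]) {
    if (!isRecord(item)) continue;
    const title = item["title"];
    if (typeof title !== "string") continue;
    articles.push({
      title,
      authors: optionalString(item["authors"]),
      journal: optionalString(item["journal"]),
      pmid: optionalString(item["pmid"]),
    });
  }
  return { total, articles };
}
```

With this in place, `parseSearchResult(data)` could stand in for the bare `JSON.parse(data)` call, so a selector change on the page surfaces as an explicit error or an empty list instead of `undefined` fields further downstream.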
Fetch API

Structured data endpoint

Extract structured JSON from pubmed.ncbi.nlm.nih.gov with a single POST request. AI-configured selectors, cached for fast repeat calls.

POST /fetch/pubmed.ncbi.nlm.nih.gov/
Article title, Authors, Abstract, Journal, DOI, PMID
curl
curl -X POST https://api.spider.cloud/fetch/pubmed.ncbi.nlm.nih.gov/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"return_format": "json"}'
Python
import requests

resp = requests.post(
    "https://api.spider.cloud/fetch/pubmed.ncbi.nlm.nih.gov/",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={"return_format": "json"},
)
print(resp.json())
Node.js
const resp = await fetch("https://api.spider.cloud/fetch/pubmed.ncbi.nlm.nih.gov/", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ return_format: "json" }),
});
const data = await resp.json();
console.log(data);
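
The three snippets above send the same request. As a rough TypeScript sketch, the request can be built in one place and the HTTP status checked before the body is parsed. The helper names `buildFetchRequest` and `fetchStructured` and the `FetchLike` type are assumptions made for illustration. The response schema is not described on this page, so the result is left as `unknown`.

```typescript
// Endpoint prefix taken from the examples above.
const API_BASE = "https://api.spider.cloud/fetch";

interface FetchRequestInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Minimal shape of the fetch function this sketch needs (hypothetical type),
// so a stub can be passed in for testing.
type FetchLike = (
  url: string,
  init: FetchRequestInit,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the URL and request options for a target domain.
function buildFetchRequest(
  domain: string,
  apiKey: string,
): { url: string; init: FetchRequestInit } {
  return {
    url: `${API_BASE}/${domain}/`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ return_format: "json" }),
    },
  };
}

// Send the request and fail loudly on a non-2xx status.
async function fetchStructured(
  domain: string,
  apiKey: string,
  fetchFn: FetchLike,
): Promise<unknown> {
  const { url, init } = buildFetchRequest(domain, apiKey);
  const resp = await fetchFn(url, init);
  if (!resp.ok) {
    throw new Error(`Request to ${url} failed with status ${resp.status}`);
  }
  return resp.json();
}
```

In Node.js 18 or later, the global `fetch` could be passed as `fetchFn`. Injecting it keeps the helper easy to exercise with a stub.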
Extraction

Data you can extract

Article title, Authors, Abstract, Journal, DOI, PMID, Publication date, Keywords
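
One way to carry these fields through a pipeline is a typed record with a few normalizers. The sketch below is an assumption about how a consumer might model the data, not a schema published by the API. The `PubMedRecord` type and helper names are hypothetical, and the author format (a comma-separated string ending in a period) is assumed from how PubMed usually prints author lists.

```typescript
// Hypothetical record type covering the fields listed above.
interface PubMedRecord {
  title: string;
  authors: string[];
  abstract?: string;
  journal?: string;
  doi?: string;
  pmid: string;
  publicationDate?: string;
  keywords: string[];
}

// PMIDs are positive integers; the 8-digit cap reflects current identifiers
// and is an assumption that may need widening later.
function isValidPmid(value: string): boolean {
  return /^[1-9]\d{0,7}$/.test(value);
}

// Strip common prefixes and lowercase the DOI (DOIs are case-insensitive).
// Returns undefined when the remainder does not start with "10.".
function normalizeDoi(raw: string): string | undefined {
  const cleaned = raw
    .trim()
    .replace(/^(?:https?:\/\/(?:dx\.)?doi\.org\/|doi:\s*)/i, "")
    .toLowerCase();
  return cleaned.startsWith("10.") ? cleaned : undefined;
}

// Split "Smith J, Doe A." into ["Smith J", "Doe A"].
function splitAuthors(raw: string): string[] {
  return raw
    .split(",")
    .map((name) => name.trim().replace(/\.$/, ""))
    .filter((name) => name.length > 0);
}
```

Normalizing DOIs and PMIDs early makes records easier to deduplicate when the same article appears in several searches.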
Content

Medical data extraction

Extract literature on drugs, conditions, and other health topics from pubmed.ncbi.nlm.nih.gov.

Parsing

Structured health data

Clean extraction of dosage, interactions, and clinical information.

Scale

Bulk research

Process thousands of medical pages for research and comparison datasets.
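
For bulk runs, it usually helps to cap how many pages are processed at once. The sketch below shows one possible pattern: a search URL builder and a small concurrency-limited mapper. Both helpers are hypothetical, and the `page` query parameter is an assumption about PubMed's search URLs that should be checked before relying on it.

```typescript
// Build a search URL in the same form as the quick start
// ("diabetes treatment" becomes "diabetes+treatment").
// The `page` parameter is an assumption about PubMed's URL scheme.
function buildSearchUrl(term: string, page = 1): string {
  const encoded = encodeURIComponent(term).replace(/%20/g, "+");
  const suffix = page > 1 ? `&page=${page}` : "";
  return `https://pubmed.ncbi.nlm.nih.gov/?term=${encoded}${suffix}`;
}

// Run `worker` over `items` with at most `limit` calls in flight,
// returning results in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index.
  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i] as T);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}
```

A caller might generate URLs with `buildSearchUrl(term, page)` for each term and page, then pass them to `mapWithConcurrency(urls, 5, scrapeOne)`, where `scrapeOne` is whatever per-page routine is in use (for example, the quick start logic wrapped in a function).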

Start scraping pubmed.ncbi.nlm.nih.gov

Get your API key and start extracting data in minutes.