gottem: one API for every scraper.
Media
Verified

IMDb Scraper

Extract movie ratings, cast info, box office data, and reviews from IMDb. Built on spider-browser.

target: imdb.com
success rate: 99.9%
latency: ~4ms
Quick start

Extract data in minutes.

imdb-scraper.ts
import { SpiderBrowser } from "spider-browser";

const spider = new SpiderBrowser({
  apiKey: process.env.SPIDER_API_KEY!,
});

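// Connect, then load the IMDb Top 250 chart.
await spider.connect();
const page = spider.page!;
await page.goto("https://www.imdb.com/chart/top/");
// The returned HTML is not used; this call presumably just waits for the page to render.
await page.content();

// Run the extraction inside the page and return the first 20 movies as a JSON string.
const data = await page.evaluate(`(() => {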
  const movies = [];
  const headings = [...document.querySelectorAll("h3")].filter(h =>
    h.closest("a[href*='/title/']") || h.parentElement?.querySelector("a[href*='/title/']")
  );
  headings.forEach(h3 => {
    const title = h3.textContent?.trim();
    const link = h3.closest("a[href*='/title/']") || h3.parentElement?.querySelector("a[href*='/title/']");
    const href = link?.getAttribute("href");
    const container = h3.closest("li");
    let year = "";
    if (container) {
      container.querySelectorAll("span").forEach(s => {
        if (/^\\d{4}$/.test(s.textContent?.trim() || "")) year = s.textContent.trim();
      });
    }
    const rating = container?.querySelector("[aria-label*='rating' i]")?.getAttribute("aria-label") || "";
    if (title) movies.push({ title, href, year, rating });
  });
  return JSON.stringify({ total: movies.length, movies: movies.slice(0, 20) });
})()`);

console.log(JSON.parse(data));
await spider.close();
ready to run · spider-browser · TypeScript
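
The script above returns raw strings: `rating` is an aria-label, `href` is a relative path, and `year` is text. A minimal sketch of a normalization step follows. It assumes the aria-label contains the score as a decimal number (for example "IMDb rating: 9.3") and that chart headings may carry a leading rank such as "1. "; both are assumptions about IMDb's markup. `normalizeMovie` is a hypothetical helper, not part of spider-browser.

```typescript
// Raw record shape produced by the quick-start script.
interface RawMovie {
  title: string;
  href: string | null;
  year: string;
  rating: string;
}

// Typed record after normalization (hypothetical shape).
interface Movie {
  title: string;
  imdbId: string | null;
  url: string | null;
  year: number | null;
  rating: number | null;
}

// Hypothetical helper: converts scraped strings into typed values.
function normalizeMovie(raw: RawMovie): Movie {
  // Assumption: chart headings may look like "1. The Godfather".
  const title = raw.title.replace(/^\d+\.\s+/, "");
  // IMDb title ids have the form "tt" followed by digits.
  const imdbId = raw.href?.match(/\/title\/(tt\d+)/)?.[1] ?? null;
  // Resolve the relative href against the site origin.
  const url = raw.href ? new URL(raw.href, "https://www.imdb.com").href : null;
  const year = /^\d{4}$/.test(raw.year) ? Number(raw.year) : null;
  // Assumption: the first number in the aria-label is the score.
  const score = raw.rating.match(/\d+(?:\.\d+)?/)?.[0];
  const rating = score !== undefined ? Number(score) : null;
  return { title, imdbId, url, year, rating };
}
```

Applied to the quick-start output, this would be `JSON.parse(data).movies.map(normalizeMovie)`. Fields that cannot be parsed become `null` rather than empty strings, which makes gaps easier to spot downstream.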
Extraction

Fields you can pull.

Title, Rating, Year, Director, Cast, Genre, Runtime, Box office
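
The fields listed above map naturally onto a typed record. The sketch below is illustrative only: `ImdbTitle`, its property names, the units implied by `runtimeMinutes` and `boxOfficeUsd`, and the `missingFields` helper are all hypothetical, not a schema published by gottem.

```typescript
// Hypothetical record type covering the fields listed above.
interface ImdbTitle {
  title: string;
  rating?: number;
  year?: number;
  director?: string;
  cast?: string[];
  genre?: string[];
  runtimeMinutes?: number; // assumed unit: minutes
  boxOfficeUsd?: number; // assumed unit: US dollars
}

const OPTIONAL_FIELDS = [
  "rating",
  "year",
  "director",
  "cast",
  "genre",
  "runtimeMinutes",
  "boxOfficeUsd",
] as const;

// Returns the names of optional fields that are absent or empty arrays.
function missingFields(record: ImdbTitle): string[] {
  return OPTIONAL_FIELDS.filter((key) => {
    const value = record[key];
    return value === undefined || (Array.isArray(value) && value.length === 0);
  });
}
```

A check like this is useful after a scrape, since a page layout change tends to show up first as fields that silently come back empty.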
Metadata

Rich data extraction

Extract titles, ratings, cast details, and box office figures from imdb.com.

Rendering

Dynamic content

Handle lazy-loaded reviews, recommendations, and infinite scroll.
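
One common way to deal with lazy-loaded content is to scroll until the number of matching elements stops growing. The sketch below assumes only that the page object exposes `evaluate` taking a script string, as in the quick-start script. `EvaluatingPage` and `scrollUntilStable` are hypothetical names, and this is not necessarily how spider-browser handles dynamic content internally.

```typescript
// Minimal view of the page object. Assumption: `evaluate` accepts a script
// string and resolves with its result, as in the quick-start script.
interface EvaluatingPage {
  evaluate(script: string): Promise<unknown>;
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: scrolls to the bottom until the number of elements
// matching itemSelector stops growing, or maxRounds is reached.
// Resolves with the final element count.
async function scrollUntilStable(
  page: EvaluatingPage,
  itemSelector: string,
  maxRounds = 10,
  pauseMs = 500,
): Promise<number> {
  const countScript =
    `(() => document.querySelectorAll(${JSON.stringify(itemSelector)}).length)()`;
  const scrollScript =
    "(() => { window.scrollTo(0, document.body.scrollHeight); })()";

  let previous = -1;
  let current = Number(await page.evaluate(countScript));
  for (let round = 0; round < maxRounds && current > previous; round++) {
    previous = current;
    await page.evaluate(scrollScript);
    await sleep(pauseMs); // give lazy-loaded content time to render
    current = Number(await page.evaluate(countScript));
  }
  return current;
}
```

Calling `await scrollUntilStable(page, "li")` before the extraction step would let more items render first. The `maxRounds` cap keeps a truly infinite feed from looping forever.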

Scale

List-level scraping

Process entire charts and lists with automatic pagination.
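
Done by hand, following pagination amounts to fetching a page, collecting its items, and moving to the next link until none is left. A minimal sketch of that loop is below. `collectAllPages` and `PageResult` are hypothetical names, and `fetchPage` is supplied by the caller; it illustrates the idea and is not the service's own pagination logic.

```typescript
// Result of scraping one page: its items plus the next page URL, if any.
interface PageResult<T> {
  items: T[];
  nextUrl: string | null;
}

// Hypothetical helper: follows nextUrl links from startUrl and gathers all
// items. Stops when there is no next page, when a URL repeats (cycle guard),
// or after maxPages pages.
async function collectAllPages<T>(
  startUrl: string,
  fetchPage: (url: string) => Promise<PageResult<T>>,
  maxPages = 50,
): Promise<T[]> {
  const all: T[] = [];
  const seen = new Set<string>();
  let url: string | null = startUrl;
  while (url !== null && !seen.has(url) && seen.size < maxPages) {
    seen.add(url);
    const result: PageResult<T> = await fetchPage(url);
    all.push(...result.items);
    url = result.nextUrl;
  }
  return all;
}
```

The `seen` set guards against sites whose last page links back to an earlier one, and `maxPages` bounds the total work for very long lists.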

Start

Start scraping imdb.com.

Grab an API key and run the script above. The first request resolves the config; every later request hits the cache.