Overview

ghost-scraper is a Model Context Protocol (MCP) server for scraping data from sites built on the Ghost publishing platform. Ghost is an open-source headless CMS used for blogs and newsletters. The server gives programmatic access to site content and converts web pages into structured JSON without browser automation.

Key Capabilities

Post and page extraction: Retrieves titles, bodies, excerpts, feature images, and publish dates from individual or listed content.
Author and tag data: Pulls user profiles, bios, and categorization metadata.
Site navigation: Follows sitemaps, RSS feeds, or paginated endpoints to crawl full collections.

The full tool list is not enumerated here, but the server exposes scraping endpoints tailored to Ghost's API-like structure and its rendered frontend pages.
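To make the idea of "structured JSON" concrete, here is a minimal sketch of turning an RSS feed into post records using only the Python standard library. It is not the server's implementation: the function name rss_to_records, the record field names, and the sample feed are all assumptions made for illustration. Ghost sites typically serve a feed at /rss/, but the sketch parses an embedded sample so it runs offline.

```python
import json
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample feed standing in for the response from a site's /rss/ URL.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item>
      <title>First post</title>
      <link>https://example.com/first-post/</link>
      <description>Short excerpt.</description>
      <pubDate>Mon, 01 Jan 2024 10:00:00 GMT</pubDate>
      <category>News</category>
    </item>
  </channel>
</rss>"""


def rss_to_records(xml_text):
    """Convert RSS <item> elements into plain dicts (field names are assumed)."""
    root = ET.fromstring(xml_text)
    records = []
    for item in root.iter("item"):
        pub_date = item.findtext("pubDate")
        records.append({
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
            "excerpt": item.findtext("description", default=""),
            # RSS dates use the RFC 822 format; store them as ISO 8601 strings.
            "published_at": parsedate_to_datetime(pub_date).isoformat() if pub_date else None,
            "tags": [c.text for c in item.findall("category") if c.text],
        })
    return records


if __name__ == "__main__":
    print(json.dumps(rss_to_records(SAMPLE_RSS), indent=2))
```

A feed only covers recent posts, so a full crawl would still need the sitemap or paginated endpoints mentioned above.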

Use Cases

Content migration: Scrape all posts from a Ghost site (extract_posts) to import into another CMS like WordPress.
Dataset creation: Collect articles, tags, and authors from multiple Ghost blogs for training NLP models on publishing content.
Archiving discontinued sites: Pull full histories including metadata to preserve blogs before shutdown.
Competitive analysis: Extract publish frequency and topics from competitor Ghost newsletters.
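As a rough illustration of the competitive analysis case, the sketch below counts posts per month and the most frequent tags from a list of already-scraped records. The record shape (published_at as an ISO 8601 string, tags as a list) and the function names are assumptions for this example, not the server's actual output format.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records, shaped the way a scraper might return them.
SAMPLE_POSTS = [
    {"title": "Launch notes", "published_at": "2024-01-05T09:00:00+00:00", "tags": ["Product"]},
    {"title": "Pricing update", "published_at": "2024-01-20T09:00:00+00:00", "tags": ["Product", "Pricing"]},
    {"title": "Hiring", "published_at": "2024-02-02T09:00:00+00:00", "tags": ["Company"]},
]


def posts_per_month(records):
    """Return a dict mapping 'YYYY-MM' to the number of posts published that month."""
    counts = Counter()
    for record in records:
        published = datetime.fromisoformat(record["published_at"])
        counts[published.strftime("%Y-%m")] += 1
    return dict(sorted(counts.items()))


def top_tags(records, n=3):
    """Return the n most common tags as (tag, count) pairs."""
    counts = Counter(tag for record in records for tag in record.get("tags", []))
    return counts.most_common(n)


if __name__ == "__main__":
    print(posts_per_month(SAMPLE_POSTS))
    print(top_tags(SAMPLE_POSTS))
```

The same records could feed the migration or dataset use cases; only the final aggregation step would change.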

Who This Is For

Data analysts building corpora from online publications, developers automating content pipelines, researchers studying blog trends, and site admins migrating Ghost instances. Basic API integration knowledge is required.
