substack2md

Convert Substack posts to clean, Obsidian-friendly Markdown using your authenticated browser session.

Why This Exists

Substack doesn't let you bulk-export your reading list or subscriptions in a useful format. This tool:

Uses your logged-in browser via Chrome DevTools Protocol (CDP)
Preserves frontmatter metadata
Converts images/embeds to links (Obsidian-friendly)
Rewrites cross-references as wikilinks [[YYYY-MM-DD-slug]]
Organizes by publication into folders

Features

No password management - Uses your live browser session
Batch processing - Single URLs or text files with multiple URLs
Sequential with delays - Configurable sleep between requests to be polite
Obsidian wikilinks - Auto-converts internal links to existing notes
Configurable naming - Map publication slugs to custom directory names
Transcript cleaning - Strips timestamps and speaker labels from podcast transcripts
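
To make the transcript-cleaning feature concrete, here is a minimal sketch of what stripping timestamps and speaker labels could look like. The patterns and the function name `clean_transcript_line` are illustrative assumptions, not the tool's actual code: it assumes timestamps such as `[00:01:23]` or `1:23` at the start of a line, followed by a `Name:` label.

```python
import re

# Assumption: a timestamp sits at the start of the line, optionally wrapped
# in brackets or parentheses, e.g. "[00:01:23]", "(1:23)" or "01:23".
TIMESTAMP = re.compile(r"^\s*[\[(]?\d{1,2}:\d{2}(?::\d{2})?[\])]?\s*")

# Assumption: a speaker label is a capitalised name of up to ~40 characters
# followed by a colon, e.g. "Host:" or "Dr. Smith:".
SPEAKER = re.compile(r"^[A-Z][\w .'-]{0,40}:\s+")


def clean_transcript_line(line: str) -> str:
    """Remove a leading timestamp and speaker label from one transcript line."""
    line = TIMESTAMP.sub("", line)
    line = SPEAKER.sub("", line)
    return line.strip()
```

A heuristic like this will also strip ordinary prose that begins with a short capitalised phrase and a colon (for example "Note: ..."), so it is best applied only to sections known to be transcripts.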

Installation

# Clone the repo
git clone https://github.com/snapsynapse/substack2md.git
cd substack2md

# Install dependencies
pip install -r requirements.txt

Quick Start

1. Launch Your Browser with Remote Debugging

Brave (Recommended):

open -na "Brave Browser" --args \
  --remote-debugging-port=9222 \
  --remote-allow-origins=http://127.0.0.1:9222 \
  --user-data-dir="$HOME/.brave-cdp-profile"

Chrome (Apple Silicon):

arch -arm64 /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
  --remote-debugging-port=9222 \
  --remote-allow-origins=http://127.0.0.1:9222 \
  --user-data-dir="$HOME/.chrome-cdp-profile"

Chrome (Intel):

/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
  --remote-debugging-port=9222 \
  --remote-allow-origins=http://127.0.0.1:9222 \
  --user-data-dir="$HOME/.chrome-cdp-profile"

2. Log Into Substack

In the browser window that just opened, navigate to Substack and log in normally.

3. Convert Posts

Single URL:

python substack2md.py https://natesnewsletter.substack.com/p/latest-post

Multiple URLs from file:

python substack2md.py --urls-file my-reading-list.txt

Specify output directory:

python substack2md.py https://daveshap.substack.com/p/post-slug --base-dir ~/my-notes

Configuration

Environment Variables

# Set default base directory
export SUBSTACK2MD_BASE_DIR=~/Documents/substack-notes

# Set config file location
export SUBSTACK2MD_CONFIG=~/.config/substack2md/config.yaml
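
The README does not state how the environment variable interacts with `--base-dir` and `base_dir` in the config file. The sketch below shows one common convention (command line first, then environment, then config file), purely as an assumption. `resolve_base_dir` and the fallback path are hypothetical.

```python
import os
from pathlib import Path

# Assumption: fallback used when nothing else is set (taken from the
# example paths in this README, not from the tool's source).
FALLBACK_BASE_DIR = "~/Documents/substack-notes"


def resolve_base_dir(cli_value=None, config_value=None, env=None):
    """Pick the output directory.

    Assumed precedence: --base-dir, then SUBSTACK2MD_BASE_DIR,
    then base_dir from config.yaml, then the fallback.
    """
    env = os.environ if env is None else env
    candidates = [cli_value, env.get("SUBSTACK2MD_BASE_DIR"), config_value]
    chosen = next((c for c in candidates if c), FALLBACK_BASE_DIR)
    return Path(chosen).expanduser()
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.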

Config File

Create config.yaml in the script directory, or point to a different file with --config or the SUBSTACK2MD_CONFIG environment variable:

# Base directory for markdown output
base_dir: ~/Documents/substack-notes

# Map publication slugs to custom directory names
publication_mappings:
  signalsandsubtractions: Signals_And_Subtractions
  natesnewsletter: Nates_Notes
  daveshap: David_Shapiro

See config.yaml.example for a template.
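
As an illustration of how `publication_mappings` might be applied, the sketch below takes the publication slug from the URL's subdomain and looks it up in the mapping. The function names, and the fallback to the raw slug when no mapping exists, are assumptions. Publications on custom domains are not handled here.

```python
from urllib.parse import urlparse

# Mirrors the publication_mappings block from the example config above.
PUBLICATION_MAPPINGS = {
    "signalsandsubtractions": "Signals_And_Subtractions",
    "natesnewsletter": "Nates_Notes",
    "daveshap": "David_Shapiro",
}


def publication_slug(url: str) -> str:
    """Return the first label of the hostname, e.g. 'daveshap'."""
    host = urlparse(url).hostname or ""
    return host.split(".")[0]


def publication_dir(url: str, mappings=PUBLICATION_MAPPINGS) -> str:
    """Map a post URL to its output folder name.

    Assumption: unmapped publications fall back to the raw slug.
    """
    slug = publication_slug(url)
    return mappings.get(slug, slug)
```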

Usage Examples

# Single post with custom output directory
python substack2md.py https://pub.substack.com/p/slug --base-dir ~/vault

# Batch processing with a longer delay between requests (be nice to servers)
python substack2md.py --urls-file urls.txt --sleep-ms 500

# Save HTML alongside markdown (for debugging)
python substack2md.py URL --also-save-html

# Overwrite existing files
python substack2md.py URL --overwrite

# Process from existing markdown export (cleanup only)
python substack2md.py --from-md export.md --url https://pub.substack.com/p/slug

URL File Format

Create a text file with one URL per line:

https://signalsandsubtractions.substack.com/p/the-trust-gap
https://natesnewsletter.substack.com/p/i-surveyed-100-ai-tools-that-launched
# Comments start with #
https://daveshap.substack.com/p/the-merits-of-doing-things-the-hard
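
The format above is simple enough to parse in a few lines. This sketch (with a hypothetical helper name, `parse_url_lines`) shows the rule as described: one URL per line, with lines starting with `#` ignored. Skipping blank lines is an assumption.

```python
def parse_url_lines(text: str) -> list[str]:
    """Return the URLs in a URL-file body, skipping comments and blank lines."""
    urls = []
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines are ignored along with '#' comments.
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls
```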

Output Structure

~/Documents/substack-notes/
├── Signals_And_Subtractions/
│   └── 2025-09-29-the-trust-gap.md
├── Nates_Notes/
│   ├── 2025-10-20-i-surveyed-100-ai-tools-that-launched.md
│   └── 2025-10-18-i-read-17-hours-of-ai-news-this-week.md
└── David_Shapiro/
    └── 2025-10-18-the-merits-of-doing-things-the-hard.md
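
The layout above follows the pattern `<base_dir>/<publication folder>/<published date>-<slug>.md`. A small sketch of building such a path is shown below. `post_slug` and `note_path` are illustrative names, and taking the slug from the last URL path segment is an assumption.

```python
from pathlib import Path
from urllib.parse import urlparse


def post_slug(url: str) -> str:
    """Assumption: the slug is the last path segment of a /p/<slug> URL."""
    return urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]


def note_path(base_dir, publication_dir: str, published: str, slug: str) -> Path:
    """Build <base_dir>/<publication_dir>/<YYYY-MM-DD>-<slug>.md."""
    return Path(base_dir).expanduser() / publication_dir / f"{published}-{slug}.md"
```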

Markdown Frontmatter

Each file includes YAML frontmatter:

---
title: "Post Title"
subtitle: "Optional subtitle"
author: "David Shapiro"
publication: "daveshap"
published: "2025-10-18"
updated: "2025-10-18"
retrieved: "2025-10-20T15:30:00Z"
url: "https://daveshap.substack.com/p/post-slug"
canonical: "https://daveshap.substack.com/p/post-slug"
slug: "post-slug"
tags: [substack, ai, automation]
image: "https://substackcdn.com/image.jpg"
links_internal: 3
links_external: 12
source: "substack2md v1.1.0"
---

Content starts here...
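
For readers who want to generate compatible notes themselves, the sketch below renders a metadata dict in the same shape as the frontmatter above. It is not the tool's implementation. It leans on the fact that a JSON string literal is also a valid YAML double-quoted string, and it assumes list items (such as tags) are plain words that need no quoting.

```python
import json


def render_frontmatter(meta: dict) -> str:
    """Render a flat dict as a YAML frontmatter block."""
    lines = ["---"]
    for key, value in meta.items():
        if isinstance(value, str):
            # JSON string syntax doubles as a YAML double-quoted scalar.
            lines.append(f"{key}: {json.dumps(value, ensure_ascii=False)}")
        elif isinstance(value, list):
            # Assumption: list items are simple words that need no quoting.
            lines.append(f"{key}: [{', '.join(str(v) for v in value)}]")
        else:
            lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines)
```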

Troubleshooting

"No CDP connection"

Make sure your browser was launched with --remote-debugging-port=9222
Check that no other process is using port 9222
Try closing all Chrome/Brave windows and launching again
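
A quick way to confirm that the browser is actually listening is to query the DevTools HTTP endpoint `/json/version`, which Chromium-based browsers expose on the remote debugging port. The helper names below are hypothetical and this check is not part of substack2md.

```python
import json
import urllib.error
import urllib.request


def cdp_version_url(host: str = "127.0.0.1", port: int = 9222) -> str:
    """URL of the DevTools version endpoint for a given host and port."""
    return f"http://{host}:{port}/json/version"


def probe_cdp(host: str = "127.0.0.1", port: int = 9222, timeout: float = 3.0):
    """Return the browser's version info as a dict, or None if unreachable."""
    try:
        with urllib.request.urlopen(cdp_version_url(host, port), timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply all count as "no CDP".
        return None
```

If `probe_cdp()` returns `None`, nothing is answering on that port. Otherwise the returned dict normally includes a `Browser` entry naming the running browser.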

"Missing modules" error

pip install -r requirements.txt

URLs not being converted to wikilinks

The tool only converts links to posts you've already downloaded
Run a second pass to catch cross-references
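
The behaviour described above can be sketched as follows: index the notes already on disk by slug, then rewrite only those Markdown links whose target slug appears in the index. The regex, the function names, and the use of Obsidian's `[[note|alias]]` form to keep the link text are assumptions for illustration. Matching by slug alone also ignores the publication, so identical slugs across publications would collide.

```python
import re
from pathlib import Path

# Assumption: post links look like [text](https://<host>/p/<slug>...).
POST_LINK = re.compile(r"\[([^\]]+)\]\((https?://[\w.-]+/p/([\w-]+))[^)]*\)")


def index_notes(base_dir) -> dict:
    """Map slug -> note name for every YYYY-MM-DD-slug.md under base_dir."""
    index = {}
    for path in Path(base_dir).expanduser().rglob("*.md"):
        match = re.match(r"\d{4}-\d{2}-\d{2}-(.+)", path.stem)
        if match:
            index[match.group(1)] = path.stem
    return index


def rewrite_links(markdown: str, index: dict) -> str:
    """Turn links to already-downloaded posts into wikilinks; leave others alone."""
    def repl(match):
        text, slug = match.group(1), match.group(3)
        note = index.get(slug)
        return f"[[{note}|{text}]]" if note else match.group(0)

    return POST_LINK.sub(repl, markdown)
```

This also shows why a second pass helps: a link can only be rewritten once its target note exists in the index.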

Rate limiting / bot detection

Increase --sleep-ms (default: 150ms)
Use smaller batches
Substack shouldn't rate-limit authenticated sessions, but YMMV

Advanced Options

python substack2md.py --help

options:
  --urls-file FILE         File with URLs, one per line
  --from-md FILE           Clean existing markdown export
  --url URL                URL for --from-md mode
  --base-dir DIR           Output directory
  --config FILE            Path to config.yaml
  --also-save-html         Save HTML sidecar files
  --overwrite              Replace existing files
  --cdp-host HOST          CDP hostname (default: 127.0.0.1)
  --cdp-port PORT          CDP port (default: 9222)
  --timeout SECONDS        Page load timeout (default: 45)
  --retries N              Retry failed URLs N times (default: 2)
  --sleep-ms MS            Delay between requests (default: 150)
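
To illustrate how `--retries` and `--sleep-ms` could combine in a sequential batch, here is a minimal sketch. It assumes `--retries N` means one initial attempt plus up to N retries, and that the same delay is used between retries and between URLs. `fetch` stands in for whatever function downloads one post; all names are hypothetical.

```python
import time


def fetch_with_retries(fetch, url, retries=2, sleep_ms=150, sleep=time.sleep):
    """Call fetch(url), retrying up to `retries` times after a failure."""
    last_error = None
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except Exception as exc:  # a real tool would catch narrower errors
            last_error = exc
            if attempt < retries:
                sleep(sleep_ms / 1000)
    raise last_error


def process_all(urls, fetch, retries=2, sleep_ms=150, sleep=time.sleep):
    """Process URLs one at a time, pausing between requests."""
    results = {}
    for i, url in enumerate(urls):
        if i:
            sleep(sleep_ms / 1000)
        try:
            results[url] = fetch_with_retries(fetch, url, retries, sleep_ms, sleep)
        except Exception as exc:
            # Record the failure and move on to the next URL.
            results[url] = exc
    return results
```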

Contributing

Pull requests welcome! Areas for improvement:

Support for other platforms (Medium, Ghost, etc.)
Better error handling
Progress bars for batch processing
Parallel processing option
Export to other formats

License

MIT License - see LICENSE file for details.

Credits

Built with:

Disclaimer

This tool is for personal archival purposes. Respect content creators' rights and Substack's terms of service. DON'T STEAL! Getting better utility from Substacks you already support is not stealing. Sharing their content without permission is, so don't cross that line.
