Convert Substack posts to clean, Obsidian-friendly Markdown using your authenticated browser session.
Substack doesn't let you bulk-export your reading list or subscriptions in a useful format. This tool:
- Uses your logged-in browser via Chrome DevTools Protocol (CDP)
- Preserves frontmatter metadata
- Converts images/embeds to links (Obsidian-friendly)
- Rewrites cross-references as wikilinks [[YYYY-MM-DD-slug]]
- Organizes by publication into folders
- No password management - Uses your live browser session
- Batch processing - Single URLs or text files with multiple URLs
- Sequential with delays - Configurable sleep between requests to be polite
- Obsidian wikilinks - Auto-converts internal links to existing notes
- Configurable naming - Map publication slugs to custom directory names
- Transcript cleaning - Strips timestamps and speaker labels from podcast transcripts
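The transcript-cleaning feature could look roughly like the sketch below. The timestamp and speaker-label patterns are assumptions about what a podcast transcript looks like (`[00:01:23]` or `1:05` at the start of a line, followed by a short capitalized label such as `Host:`), and `clean_transcript` is a hypothetical name; the rules in substack2md.py may differ.

```python
import re

# Assumed timestamp formats: "[00:01:23]", "(1:23)", or bare "01:23" at the start of a line.
TIMESTAMP = re.compile(r"^\s*[\[(]?\d{1,2}:\d{2}(?::\d{2})?[\])]?\s*")
# Assumed speaker format: one to three capitalized words followed by a colon, e.g. "Jane Doe:".
SPEAKER = re.compile(r"^[A-Z][\w.'-]*(?: [A-Z][\w.'-]*){0,2}:\s+")

def clean_transcript(text: str) -> str:
    """Strip a leading timestamp and speaker label from each line (hypothetical helper)."""
    cleaned = []
    for line in text.splitlines():
        line = TIMESTAMP.sub("", line, count=1)
        line = SPEAKER.sub("", line, count=1)
        cleaned.append(line)
    return "\n".join(cleaned)
```

A pattern this loose would also strip ordinary prefixes such as `Note:`, so a real implementation would probably apply it only to sections identified as transcripts.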
# Clone the repo
git clone https://github.com/yourusername/substack2md.git
cd substack2md
# Install dependencies
pip install -r requirements.txt
Brave (Recommended):
open -na "Brave Browser" --args \
--remote-debugging-port=9222 \
--remote-allow-origins=http://127.0.0.1:9222 \
--user-data-dir="$HOME/.brave-cdp-profile"
Chrome (Apple Silicon):
arch -arm64 /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
--remote-debugging-port=9222 \
--remote-allow-origins=http://127.0.0.1:9222 \
--user-data-dir="$HOME/.chrome-cdp-profile"
Chrome (Intel):
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome \
--remote-debugging-port=9222 \
--remote-allow-origins=http://127.0.0.1:9222 \
--user-data-dir="$HOME/.chrome-cdp-profile"
In the browser window that just opened, navigate to Substack and log in normally.
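Before running the tool, it can help to confirm that the browser is actually listening for CDP connections. The sketch below queries the `/json/version` HTTP endpoint that Chromium-based browsers expose on the debugging port. `check_cdp` and `cdp_version_url` are hypothetical helper names, not part of substack2md.

```python
import http.client
import json
import urllib.request

def cdp_version_url(host: str = "127.0.0.1", port: int = 9222) -> str:
    """Build the URL of the browser's DevTools version endpoint."""
    return f"http://{host}:{port}/json/version"

def check_cdp(host: str = "127.0.0.1", port: int = 9222, timeout: float = 3.0):
    """Return the browser's version info as a dict, or None if CDP is unreachable."""
    try:
        with urllib.request.urlopen(cdp_version_url(host, port), timeout=timeout) as resp:
            return json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers refused connections and timeouts; ValueError covers invalid JSON.
        return None
```

If `check_cdp()` returns `None` after the browser has started, the debugging port is probably not open, or another process is holding it.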
Single URL:
python substack2md.py https://natesnewsletter.substack.com/p/latest-post
Multiple URLs from file:
python substack2md.py --urls-file my-reading-list.txt
Specify output directory:
python substack2md.py https://daveshap.substack.com/p/post-slug --base-dir ~/my-notes
# Set default base directory
export SUBSTACK2MD_BASE_DIR=~/Documents/substack-notes
# Set config file location
export SUBSTACK2MD_CONFIG=~/.config/substack2md/config.yaml
Create config.yaml in the script directory or specify with --config:
# Base directory for markdown output
base_dir: ~/Documents/substack-notes
# Map publication slugs to custom directory names
publication_mappings:
  signalsandsubtractions: Signals_And_Subtractions
  natesnewsletter: Nates_Notes
  daveshap: David_Shapiro
See config.yaml.example for a template.
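As a rough illustration of how `publication_mappings` might be applied, the sketch below takes the publication slug from the first label of the post's hostname and looks it up in the mapping. The function names are hypothetical, and falling back to the raw slug for unmapped publications is an assumption. A plain dict stands in for the parsed config.yaml.

```python
from urllib.parse import urlparse

# Stand-in for the publication_mappings section of config.yaml.
PUBLICATION_MAPPINGS = {
    "signalsandsubtractions": "Signals_And_Subtractions",
    "natesnewsletter": "Nates_Notes",
    "daveshap": "David_Shapiro",
}

def publication_slug(url: str) -> str:
    """Return the first hostname label, e.g. 'daveshap' for daveshap.substack.com."""
    host = urlparse(url).hostname or ""
    return host.split(".")[0]

def publication_dir(url: str, mappings: dict[str, str]) -> str:
    """Map a post URL to its output directory name; unmapped slugs are used as-is (assumed)."""
    slug = publication_slug(url)
    return mappings.get(slug, slug)
```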
# Single post with custom output directory
python substack2md.py https://pub.substack.com/p/slug --base-dir ~/vault
# Batch processing with slower delays (be nice to servers)
python substack2md.py --urls-file urls.txt --sleep-ms 500
# Save HTML alongside markdown (for debugging)
python substack2md.py URL --also-save-html
# Overwrite existing files
python substack2md.py URL --overwrite
# Process from existing markdown export (cleanup only)
python substack2md.py --from-md export.md --url https://pub.substack.com/p/slug
Create a text file with one URL per line:
https://signalsandsubtractions.substack.com/p/the-trust-gap
https://natesnewsletter.substack.com/p/i-surveyed-100-ai-tools-that-launched
# Comments start with #
https://daveshap.substack.com/p/the-merits-of-doing-things-the-hard
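A URL file in this format can be read with a few lines of Python. This is a minimal sketch of the format described above (one URL per line, `#` starting a comment). `parse_urls` is a hypothetical name, and skipping blank lines is an assumption.

```python
def parse_urls(text: str) -> list[str]:
    """Return the URLs in a URL file, skipping blank lines and # comments."""
    urls = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls
```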
~/Documents/substack-notes/
├── Signals_And_Subtractions/
│   └── 2025-09-29-the-trust-gap.md
├── Nates_Notes/
│   ├── 2025-10-20-i-surveyed-100-ai-tools-that-launched.md
│   └── 2025-10-18-i-read-17-hours-of-ai-news-this-week.md
└── David_Shapiro/
    └── 2025-10-18-the-merits-of-doing-things-the-hard.md
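The layout above follows the pattern `<base-dir>/<publication>/<YYYY-MM-DD>-<slug>.md`. A minimal sketch of building such a path is shown below. `note_path` and `should_write` are hypothetical helpers, and skipping existing files unless `--overwrite` is given is an assumption inferred from that flag's description.

```python
from pathlib import Path

def note_path(base_dir: str, publication_dir: str, published: str, slug: str) -> Path:
    """Build <base_dir>/<publication_dir>/<published>-<slug>.md, expanding a leading ~."""
    return Path(base_dir).expanduser() / publication_dir / f"{published}-{slug}.md"

def should_write(path: Path, overwrite: bool = False) -> bool:
    """Write when the file is new, or when overwriting was requested (assumed behavior)."""
    return overwrite or not path.exists()
```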
Each file includes YAML frontmatter:
---
title: "Post Title"
subtitle: "Optional subtitle"
author: "David Shapiro"
publication: "daveshap"
published: "2025-10-18"
updated: "2025-10-18"
retrieved: "2025-10-20T15:30:00Z"
url: "https://daveshap.substack.com/p/post-slug"
canonical: "https://daveshap.substack.com/p/post-slug"
slug: "post-slug"
tags: [substack, ai, automation]
image: "https://substackcdn.com/image.jpg"
links_internal: 3
links_external: 12
source: "substack2md v1.1.0"
---
Content starts here...
- Make sure your browser launched with --remote-debugging-port=9222
- Check that no other process is using port 9222
- Try closing all Chrome/Brave windows and launching again
pip install -r requirements.txt
- The tool only converts links to posts you've already downloaded
- Run a second pass to catch cross-references
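The reason a second pass helps can be seen in a sketch of the link rewriting. The code below is an assumption about how the conversion might work, not the tool's actual implementation: it replaces a Markdown link to a Substack post with a wikilink only when a note for that slug already exists. `to_wikilinks` and `notes_by_slug` are hypothetical names. The sketch ignores custom domains and slug collisions across publications.

```python
import re

# Matches Markdown links to Substack posts: [text](https://<pub>.substack.com/p/<slug>)
POST_LINK = re.compile(r"\[([^\]]+)\]\(https?://[\w-]+\.substack\.com/p/([\w-]+)[^)]*\)")

def to_wikilinks(markdown: str, notes_by_slug: dict[str, str]) -> str:
    """Rewrite links to already-downloaded posts as [[note|text]]; leave other links alone."""
    def replace(match: re.Match) -> str:
        text, slug = match.group(1), match.group(2)
        note = notes_by_slug.get(slug)
        if note is None:
            return match.group(0)  # post not downloaded yet: keep the original link
        return f"[[{note}|{text}]]"
    return POST_LINK.sub(replace, markdown)
```

On the first pass, posts later in the batch are not yet in `notes_by_slug`, so links to them stay as plain URLs. Re-running after the whole batch has been saved lets those links resolve.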
- Increase --sleep-ms (default: 150ms)
- Use smaller batches
- Substack shouldn't rate-limit authenticated sessions, but YMMV
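The interaction between `--sleep-ms` and `--retries` can be sketched as a simple sequential loop. This is an illustration under assumptions, not the tool's actual code: `process_all` and `fetch` are hypothetical names. Pausing after every attempt, including failed ones, is a design choice made for this example.

```python
import time
from typing import Callable, Iterable

def process_all(
    urls: Iterable[str],
    fetch: Callable[[str], None],
    sleep_ms: int = 150,
    retries: int = 2,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Process URLs one at a time; return those that still failed after all retries."""
    failed = []
    for url in urls:
        for _attempt in range(retries + 1):  # one initial try plus `retries` retries
            try:
                fetch(url)
                ok = True
            except Exception:
                ok = False
            sleep(sleep_ms / 1000)  # pause after every request, successful or not
            if ok:
                break
        else:
            failed.append(url)  # loop finished without a successful attempt
    return failed
```

Raising `sleep_ms` slows the whole batch proportionally, which is the trade-off behind the advice to increase the delay or split the batch.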
python substack2md.py --help
options:
  --urls-file FILE     File with URLs, one per line
  --from-md FILE       Clean existing markdown export
  --url URL            URL for --from-md mode
  --base-dir DIR       Output directory
  --config FILE        Path to config.yaml
  --also-save-html     Save HTML sidecar files
  --overwrite          Replace existing files
  --cdp-host HOST      CDP hostname (default: 127.0.0.1)
  --cdp-port PORT      CDP port (default: 9222)
  --timeout SECONDS    Page load timeout (default: 45)
  --retries N          Retry failed URLs N times (default: 2)
  --sleep-ms MS        Delay between requests (default: 150)
Pull requests welcome! Areas for improvement:
- Support for other platforms (Medium, Ghost, etc.)
- Better error handling
- Progress bars for batch processing
- Parallel processing option
- Export to other formats
MIT License - see LICENSE file for details.
Built with:
This tool is for personal archival purposes. Respect content creators' rights and Substack's terms of service. DON'T STEAL! STEALING IS BAD BAD BAD!!! Getting better utility from Substacks you already support is not stealing. Sharing without permission is the line; don't cross it.