This script splits a PDF file based on its bookmarks, creating separate PDF files and organizing them into directories.
- Top-level bookmark splitting: Creates separate PDFs for each top-level bookmark with filename prefix "_"
- Directory creation: Creates folders with the same name as top-level bookmarks with prefix "_"
- Sub-level bookmark splitting: Creates separate PDFs for each sub-level bookmark with filename prefix "_<page_number>"
- Sub-level directory creation: Creates directories for each sub-level bookmark within their top-level directories
- File organization: Moves sub-level PDFs to their respective sub-level directories
- Page splitting: Splits each sub-level PDF into individual pages and stores them in "_pages" directories
- Python 3.6+
- PyMuPDF (fitz)
- tkinter (usually included with Python)
- Install the required dependencies:
pip install -r requirements.txt
python pdf_bookmark_splitter.py
- A file dialog will open - select your PDF file
- The script will process the PDF and create directories in the same location as the PDF
python pdf_bookmark_splitter.py "path/to/your/file.pdf"
- The script will process the PDF and create directories in the same location as the PDF
python pdf_bookmark_splitter.py --help
# or
python pdf_bookmark_splitter.py -h
- Shows detailed usage information and examples
The script will create the following structure:
├── _TopLevelBookmark1/
│ ├── _TopLevelBookmark1.pdf
│ ├── _SubBookmark1/
│ │ ├── _1_SubBookmark1.pdf
│ │ └── _pages/
│ │ ├── page_001.pdf
│ │ ├── page_002.pdf
│ │ └── ...
│ ├── _SubBookmark2/
│ │ ├── _4_SubBookmark2.pdf
│ │ └── _pages/
│ │ ├── page_001.pdf
│ │ ├── page_002.pdf
│ │ └── ...
│ └── ...
├── _TopLevelBookmark2/
│ ├── _TopLevelBookmark2.pdf
│ ├── _SubBookmark1/
│ │ ├── _43_SubBookmark1.pdf
│ │ └── _pages/
│ │ ├── page_001.pdf
│ │ ├── page_002.pdf
│ │ └── ...
│ └── ...
└── ...
- Bookmark Analysis: The script reads the PDF's table of contents and separates top-level (level 1) and sub-level (level 2) bookmarks
- Top-level Splitting: Creates separate PDFs for each top-level bookmark, spanning from the bookmark's page to the next top-level bookmark
- Directory Creation: Creates directories named after each top-level bookmark
- Sub-level Splitting: Creates separate PDFs for each sub-level bookmark with page number prefix
- File Organization: Moves sub-level PDFs to the appropriate top-level directory based on page ranges
- The script automatically cleans filenames by removing invalid characters
- All created files and directories have the "_" prefix as requested
- Sub-level PDFs include the page number in their filename for easy identification
- The script handles edge cases like the last bookmark extending to the end of the document
- Important: All output files and directories are created in the same directory as the input PDF file
- The script includes error handling and user-friendly message boxes for feedback