Add split command to CLI for splitting PDF files by page ranges #2

Draft: Copilot wants to merge 2 commits into master from copilot/add-split-command-cli

Copilot AI commented Oct 12, 2025

This PR implements a new split command for the PrivatePdfConverter CLI tool that allows users to split PDF files into smaller PDFs based on page ranges or individual pages.

Features

The new split command supports flexible page specification formats:

# Split by page ranges
ppc split --path "input.pdf" --pages "1-5,10-15" --output "part"
# Creates: part_1-5.pdf, part_10-15.pdf

# Split individual pages  
ppc split --path "input.pdf" --pages "1,3,5" --output "page"
# Creates: page_1.pdf, page_3.pdf, page_5.pdf

# Split all pages individually
ppc split --path "input.pdf" --pages "all" --output "single"
# Creates: single_1.pdf, single_2.pdf, etc.
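
The output names in the examples above follow the pattern prefix, underscore, page or range. As a rough illustration only (written in Python rather than the project's C#, with a hypothetical helper name `output_name`), the naming could be derived as below. Dropping a trailing `.pdf` from the prefix is an assumption, not something this PR description confirms.

```python
def output_name(prefix: str, start: int, end: int) -> str:
    """Build an output file name such as 'part_1-5.pdf' or 'page_3.pdf'."""
    # Assumption: a prefix that already ends in ".pdf" has that extension dropped,
    # so the page label is placed before a single ".pdf" suffix.
    if prefix.lower().endswith(".pdf"):
        prefix = prefix[:-4]
    # A single page is labelled "3"; a multi-page range is labelled "1-5".
    label = str(start) if start == end else f"{start}-{end}"
    return f"{prefix}_{label}.pdf"
```

With this sketch, `output_name("part", 1, 5)` gives `part_1-5.pdf` and `output_name("page", 3, 3)` gives `page_3.pdf`, matching the file names listed above.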

Implementation Details

  • Command Class: Added SplitPdf.cs following existing codebase patterns
  • Page Range Parser: Parses ranges ("1-5"), individual pages ("1,3,5"), mixed formats ("1-3,5,8-10"), and the special "all" keyword
  • Error Handling: Validates page ranges against the actual PDF page count, checks that the input file exists, and verifies that it is a PDF
  • Output Naming: Supports custom output patterns and appends the PDF extension automatically
  • Logging: Serilog integration following existing patterns
  • CLI Integration: Command registered with help text
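
The parser itself is not shown in this description. The following is a minimal sketch of the parsing and validation rules listed above, written in Python rather than the project's C#. The function name `parse_page_spec`, the 1-based inclusive `(start, end)` representation, and the error messages are all assumptions made for illustration.

```python
def parse_page_spec(spec: str, page_count: int) -> list[tuple[int, int]]:
    """Parse a spec such as "1-3,5,8-10" or "all" into 1-based inclusive (start, end) pairs."""
    spec = spec.strip()
    if spec.lower() == "all":
        # One single-page range per page of the document.
        return [(page, page) for page in range(1, page_count + 1)]

    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"Empty entry in page specification: {spec!r}")
        # "1-5" splits into ("1", "-", "5"); "3" splits into ("3", "", "").
        start_text, sep, end_text = part.partition("-")
        try:
            start = int(start_text)
            end = int(end_text) if sep else start
        except ValueError:
            raise ValueError(f"Not a page number or range: {part!r}") from None
        if start < 1 or end < start:
            raise ValueError(f"Invalid range {part!r}: pages start at 1 and must ascend")
        if end > page_count:
            raise ValueError(f"Range {part!r} exceeds the document's {page_count} pages")
        ranges.append((start, end))
    return ranges
```

Validating each entry against `page_count` while parsing means a bad specification is rejected before any output file is written.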

Testing

Added an integration test suite (SplitPdfIntegrationTests.cs) with 8 test cases covering:

  • Valid scenarios: page ranges, individual pages, mixed formats, "all" keyword
  • Error scenarios: invalid ranges, missing files, non-PDF files
  • Output validation: correct file creation and page counts

All existing tests continue to pass (17/17 tests in total).

Technical Approach

The implementation leverages the existing iText PDF library used throughout the codebase and follows established patterns for command structure, error handling, and logging. The page range parsing is designed to be intuitive and handles edge cases gracefully with clear error messages.

Original prompt

Implement a new split command for the CLI tool that allows users to split PDF files into smaller PDFs based on page ranges or individual pages.

Requirements:

Command Interface

  • Add a new split command to the CLI
  • Support splitting by page ranges (e.g., "1-5", "10-15")
  • Support splitting individual pages (e.g., "1", "3", "7")
  • Support comma-separated combinations (e.g., "1-3,5,8-10")
  • Allow custom output naming pattern

Command Syntax

ppc split --path "input.pdf" --pages "1-5,10-15" --output "part"
# Should create: part_1-5.pdf, part_10-15.pdf

ppc split --path "input.pdf" --pages "1,3,5" --output "page"  
# Should create: page_1.pdf, page_3.pdf, page_5.pdf

ppc split --path "input.pdf" --pages "all" --output "single"
# Should create individual PDFs for each page: single_1.pdf, single_2.pdf, etc.

Implementation Details

  1. Create a new SplitPdf.cs command class in the Commands folder
  2. Add the command registration in Program.cs
  3. Use existing PDF library (likely iTextSharp or similar) for PDF manipulation
  4. Follow the existing code patterns and logging style
  5. Add proper error handling for:
    • Invalid page ranges
    • Non-existent pages
    • File access issues
    • Invalid PDF files

Error Handling

  • Validate page ranges against actual PDF page count
  • Handle invalid page numbers gracefully
  • Provide clear error messages with suggestions
  • Log operations with Serilog following existing patterns

Output Naming

  • Default output pattern: {filename}_pages_{range}.pdf
  • Support custom naming with --output parameter
  • Ensure no file overwrites without confirmation

The implementation should follow the existing codebase patterns, use the same logging approach, and integrate seamlessly with the current CLI structure.

This pull request was created as a result of the prompt above, from Copilot chat.

Co-authored-by: jurczewski <24575781+jurczewski@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add split command to CLI tool for PDFs" to "Add split command to CLI for splitting PDF files by page ranges" on Oct 12, 2025
Copilot AI requested a review from jurczewski on October 12, 2025 17:57
jurczewski force-pushed the master branch 2 times, most recently from dc03c75 to b7b324f, on January 17, 2026 00:40