dotnetpdf is a .NET-based tool for PDF processing, inspired by PDFtk Server.
dotnetpdf has been refactored into a modular architecture:
- CLI Application (dotnet.pdf) - Command-line interface for PDF operations
- Core Library (DotNet.Pdf.Core) - Reusable PDF processing services
- Service-oriented architecture with dedicated classes for each PDF operation
- Full dependency injection support
- Comprehensive logging and error handling
- Thread-safe operations
- PdfTextExtractionService - Extract text from PDF documents
- PdfBookmarkService - Process PDF bookmarks and outlines
- PdfInformationService - Extract document metadata
- PdfAttachmentService - Handle PDF attachments
- PdfPageObjectService - Analyze page objects
- PdfFormFieldService - Inspect form fields
- PdfWatermarkService - Add watermarks to documents
dotnetpdf commands
- split: Split a single PDF into multiple files.
- merge: Combine multiple PDFs into one file.
- convert: Convert PDF pages to images.
- imagetopdf: Convert images to a PDF file.
- text: Extract text from a PDF.
- bookmarks: Extract PDF bookmarks (outlines).
- info: Retrieve PDF metadata.
- rotate: Rotate PDF pages by 90, 180, or 270 degrees.
- remove: Remove specific pages from a PDF.
- insert: Insert blank pages into a PDF at specified positions.
- reorder: Reorder PDF pages according to a specified sequence.
- list-attachments: List PDF attachments with metadata information.
- extract-attachments: Extract PDF attachments to disk.
- list-objects: List all graphical objects on a given page.
- list-forms: List all interactive form fields in a document.
- watermark: Add a text or image watermark to all pages of a document.
Install dotnetpdf as a .NET global tool:
dotnet tool install --global Emm.DotnetPdf
# Split a PDF using autogenerated names
dotnetpdf split --input <input.pdf>
# Split a PDF specifying output name
dotnetpdf split --input <input.pdf> -names '{page}_{original}_pdf'
# Split a PDF using bookmarks as output names, specifying range
dotnetpdf split --input <input.pdf> --use-bookmarks --range 1-5
# Split a PDF using a text file that lists the output filenames, one filename per PDF page
dotnetpdf split --input <input.pdf> --output-script <script.txt>
# Merge PDFs
dotnetpdf merge --output <output> --input <input1> --input <input2>
# Merge all PDFs in a directory
dotnetpdf merge --output <output> --input-directory <directory> --recursive false
# Convert PDF to images
dotnetpdf convert --input <input.pdf> --output <directory> --range 1-5 --encoder .png --dpi 100
# Convert image to PDF
dotnetpdf imagetopdf --input <input.png> --output <output.pdf>
# Print PDF text to stdout
dotnetpdf text --input <input.pdf> --format text
# Print PDF text to stdout as JSON
dotnetpdf text --input <input.pdf> --format json
# Print PDF Bookmarks to stdout
dotnetpdf bookmarks --input <input.pdf>
# Extract PDF Information
dotnetpdf info --input <input.pdf> --format json
# Rotate PDF pages (90, 180, or 270 degrees)
dotnetpdf rotate --input <input.pdf> --output <rotated.pdf> --rotation 180
# Rotate specific pages only
dotnetpdf rotate --input <input.pdf> --output <rotated.pdf> --range 1-3 --rotation 90
# Remove specific pages from PDF
dotnetpdf remove --input <input.pdf> --output <cleaned.pdf> --pages 2,4,6
# Insert blank pages at specified positions
dotnetpdf insert --input <input.pdf> --output <expanded.pdf> --positions 1:2,5:1
# Insert blank pages with custom dimensions (in points)
dotnetpdf insert --input <input.pdf> --output <expanded.pdf> --positions 3:1 --width 595 --height 842
# Reorder PDF pages
dotnetpdf reorder --input <input.pdf> --output <reordered.pdf> --order 3,1,2,4
# List PDF attachments
dotnetpdf list-attachments --input <input.pdf>
# List PDF attachments in JSON format
dotnetpdf list-attachments --input <input.pdf> --format json
# Extract all PDF attachments
dotnetpdf extract-attachments --input <input.pdf> --output <output-directory>
# Extract specific attachment by index
dotnetpdf extract-attachments --input <input.pdf> --output <output-directory> --index 0
# List all objects on page 1
dotnetpdf list-objects --input <input.pdf> --page 1
# List all form fields in a document as JSON
dotnetpdf list-forms --input <input.pdf> --format json
# Add a text watermark
dotnetpdf watermark --input <input.pdf> --output <watermarked.pdf> --text "CONFIDENTIAL"
# Add an image watermark with custom options
dotnetpdf watermark --input <input.pdf> --output <watermarked.pdf> --image <logo.png> --scale 0.5 --opacity 128
# Print Help
dotnetpdf --help
The core functionality is available as a reusable library:
using System;
using DotNet.Pdf.Core;
using Microsoft.Extensions.Logging;
// Setup logging
var loggerFactory = LoggerFactory.Create(builder => builder.AddConsole());
// Create PDF processor
var pdfProcessor = new PdfProcessor(loggerFactory);
// Extract text
var texts = pdfProcessor.GetPdfText("document.pdf", pageRange: null, password: "");
foreach (var pageText in texts)
{
    Console.WriteLine($"Page {pageText.Page}: {pageText.Text}");
}
// Get document information
var info = pdfProcessor.GetPdfInformation("document.pdf", password: "");
Console.WriteLine($"Title: {info.Title}, Pages: {info.Pages}");
// Extract bookmarks
var bookmarks = pdfProcessor.GetPdfBookmarks("document.pdf", password: "");
foreach (var bookmark in bookmarks)
{
    Console.WriteLine($"Level {bookmark.Level}: {bookmark.Title}");
}
// In ASP.NET Core or Generic Host
services.AddSingleton<PdfProcessor>(provider =>
{
    var loggerFactory = provider.GetRequiredService<ILoggerFactory>();
    return new PdfProcessor(loggerFactory);
});
See MIGRATION.md for a detailed migration guide from the old static API to the new service-oriented architecture.
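The CLI commands listed earlier can be chained in a script. The following is a minimal sketch, not part of dotnetpdf itself, that exports metadata and text for every PDF in a directory. It uses only the `info` and `text` invocations shown above. The names `DOTNETPDF`, `export_pdf`, and `export_dir` are hypothetical helpers for this example. It also assumes that `info` writes its result to stdout, which the usage above documents only for `text`.

```shell
# Sketch: export metadata and text for each PDF in a directory.
# DOTNETPDF, export_pdf and export_dir are hypothetical names for this example.

# Allow the executable to be overridden, e.g. DOTNETPDF=echo for a dry run.
DOTNETPDF="${DOTNETPDF:-dotnetpdf}"

# export_pdf <input.pdf> <output-directory>
# Writes <name>.info.json and <name>.txt into the output directory.
# Assumption: both commands print their result to stdout.
export_pdf() {
  input="$1"
  outdir="$2"
  name="$(basename "$input" .pdf)"
  mkdir -p "$outdir" &&
    "$DOTNETPDF" info --input "$input" --format json > "$outdir/$name.info.json" &&
    "$DOTNETPDF" text --input "$input" --format text > "$outdir/$name.txt"
}

# export_dir <input-directory> <output-directory>
# Runs export_pdf for every *.pdf file directly inside the input directory.
export_dir() {
  indir="$1"
  outdir="$2"
  for pdf in "$indir"/*.pdf; do
    [ -e "$pdf" ] || continue  # the glob matched nothing
    export_pdf "$pdf" "$outdir" || return 1
  done
}
```

For example, `export_dir ./incoming ./exported` would process each PDF in `./incoming`. Setting `DOTNETPDF=echo` first writes the commands that would run into the output files instead of executing them, which is a cheap way to check the script before running it on real documents.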