A Camunda Outbound Connector for PDF document manipulation, providing split and merge operations for workflow automation.
This connector simplifies document management by allowing users to easily split and merge PDF files directly within their Camunda workflows. It provides an out-of-the-box solution that saves time, reduces manual errors, and enhances productivity in document processing scenarios.
In many IDP (Intelligent Document Processing) scenarios, users deal with scanned documents containing multiple forms or files within a single PDF. These documents need to be split by page to isolate individual forms or sections for accurate classification. Once classified, users need to merge the classified documents back together by type to create organized, easily accessible files. This connector automates these workflows, eliminating manual processing bottlenecks.
- ✅ Split PDFs by page (individual pages or N pages per file)
- ✅ Split PDFs by page ranges (e.g., "1-3,5-7,10-15")
- ✅ Split PDFs by bookmarks (top-level bookmarks only)
- ✅ Split PDFs by file size (iteratively adds pages until size limit)
- ✅ Merge multiple PDF files into a single document
This connector uses the Operations API approach with OutboundConnectorProvider, allowing multiple PDF operations within a single connector without manual routing.
Key Technologies:
- Apache PDFBox 3.0.3 for PDF manipulation
- Camunda Connector SDK 8.8.3
- Java 21
Operations Implementation: io.camunda.example.operations.PdfConnectorProvider
The PDF Merge & Split Connector integrated into Camunda Web Modeler, showing the element template with operation configuration.
User task displaying the split PDF documents with Camunda's built-in document preview capability.
You can package the Connector by running the following command:

    mvn clean package

This will create the following artifacts:
- A thin JAR without dependencies.
- A fat JAR containing all dependencies, potentially shaded to avoid classpath conflicts. This does not include the SDK artifacts, since those are in scope `provided` and will be brought along by the respective Connector Runtime executing the Connector.
You can use the maven-shade-plugin defined in the Maven configuration to relocate common dependencies
that are used in other Connectors and
the Connector Runtime.
This helps to avoid classpath conflicts when the Connector is executed.
For example, without shading, you might encounter errors like:
java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.setSerializationInclusion(Lcom/fasterxml/jackson/annotation/JsonInclude$Include;)Lcom/fasterxml/jackson/databind/ObjectMapper;
This occurs when your connector and the runtime use different versions of the same library (e.g., Jackson).
Use the relocations configuration in the Maven Shade plugin to define the dependencies that should be shaded.
The Maven Shade documentation
provides more details on relocations.
The PDF Merge & Split Connector provides five powerful operations for PDF document manipulation:
Combines multiple PDF files into a single document.
Input:
- `documents`: List of PDF documents to merge (in order)
- `outputFilename`: Name for the merged PDF (default: "merged.pdf")
Output:
{
"mergedDocument": "<Document>",
"totalPages": 50,
"sourceDocumentCount": 3,
"fileSizeBytes": 1048576
}

Splits a PDF into multiple files based on pages per file.
Input:
- `document`: The PDF document to split
- `pagesPerFile`: Number of pages per output file (1 = one page per file)
- `outputPattern`: Pattern for output filenames (use `{index}` for file number, e.g., "page-{index}.pdf")
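The effect of `pagesPerFile` can be sketched as simple page arithmetic. The following is a minimal illustration, not the connector's actual code: the class `PageChunkSketch`, the record `Span`, and the method `chunk` are hypothetical names, and the sketch assumes that a trailing group with fewer pages still becomes its own file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the connector's code base.
class PageChunkSketch {

    /** Inclusive, 1-indexed page span covered by one output file. */
    record Span(int start, int end) {}

    /** Splits pages 1..totalPages into consecutive spans of at most pagesPerFile pages. */
    static List<Span> chunk(int totalPages, int pagesPerFile) {
        if (totalPages < 1 || pagesPerFile < 1) {
            throw new IllegalArgumentException("totalPages and pagesPerFile must be >= 1");
        }
        List<Span> spans = new ArrayList<>();
        int start = 1;
        while (start <= totalPages) {
            // Assumption: the last span may be shorter when totalPages
            // is not a multiple of pagesPerFile.
            int end = (int) Math.min((long) start + pagesPerFile - 1, totalPages);
            spans.add(new Span(start, end));
            if (end == totalPages) {
                break;
            }
            start = end + 1;
        }
        return spans;
    }

    public static void main(String[] args) {
        System.out.println(chunk(50, 5).size()); // 10 output files
        System.out.println(chunk(7, 3));         // spans 1-3, 4-6 and 7-7
    }
}
```

Under these assumptions, a 50-page document with `pagesPerFile` set to 5 yields 10 files, and `pagesPerFile` set to 1 yields one file per page.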
Output:
{
"splitDocuments": ["<Document>", "<Document>", "..."],
"totalFiles": 10,
"originalPages": 50,
"splitMethod": "BY_PAGE"
}

Splits a PDF into multiple files based on specified page ranges.
Input:
- `document`: The PDF document to split
- `pageRanges`: Comma-separated page ranges (e.g., "1-3,5-7,10-15"). Pages are 1-indexed.
- `outputPattern`: Pattern for output filenames (use `{index}`, `{start}`, `{end}`)
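To make the `pageRanges` format concrete, here is a small parsing sketch. It is an illustration under stated assumptions rather than the connector's parser: `PageRangeSketch`, `Range`, and `parse` are hypothetical names, a single page such as "4" is assumed to be accepted as a one-page range, and invalid input is reported with a plain `IllegalArgumentException` instead of the connector's own error codes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for illustration; not the connector's implementation.
class PageRangeSketch {

    /** Inclusive, 1-indexed page range. */
    record Range(int start, int end) {}

    /** Parses a spec such as "1-3,5-7,10-15" and checks it against the page count. */
    static List<Range> parse(String spec, int totalPages) {
        if (spec == null || spec.isBlank()) {
            throw new IllegalArgumentException("Page range specification is empty");
        }
        List<Range> ranges = new ArrayList<>();
        for (String part : spec.split(",")) {
            String token = part.trim();
            String[] bounds = token.split("-", -1);
            if (bounds.length > 2) {
                throw new IllegalArgumentException("Invalid page range '" + token + "'");
            }
            int start = parsePage(bounds[0], token);
            // Assumption: a token without a dash denotes a single page.
            int end = bounds.length == 2 ? parsePage(bounds[1], token) : start;
            if (start < 1 || end < start || end > totalPages) {
                throw new IllegalArgumentException(
                    "Invalid page range '" + token + "' for a document with "
                        + totalPages + " pages");
            }
            ranges.add(new Range(start, end));
        }
        return ranges;
    }

    private static int parsePage(String text, String token) {
        try {
            return Integer.parseInt(text.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid page range '" + token + "'", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("1-3,5-7,10-15", 50).size()); // 3 ranges, one output file each
    }
}
```

With the example specification and a 50-page document, the sketch produces three ranges, and a range such as "10-60" is rejected because it exceeds the page count.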
Output:
{
"splitDocuments": ["<Document>", "<Document>", "..."],
"totalFiles": 3,
"originalPages": 50,
"splitMethod": "BY_RANGE"
}

Splits a PDF into separate files based on document bookmarks.
Input:
- `document`: The PDF document to split
- `topLevelOnly`: Split only by top-level bookmarks, ignoring nested bookmarks (default: true)
- `outputPattern`: Pattern for output filenames (use `{bookmark}` for bookmark title, `{index}` for number)
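The placeholder substitution behind `outputPattern` might look roughly like the sketch below. The names `OutputPatternSketch`, `render`, and `sanitize` are hypothetical, and the replacement of filename-unsafe characters in bookmark titles is an assumption made for the example; the connector may treat titles differently.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the connector's implementation.
class OutputPatternSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    /** Replaces {name} placeholders in a single pass; unknown placeholders stay as-is. */
    static String render(String pattern, Map<String, String> values) {
        return PLACEHOLDER.matcher(pattern).replaceAll(match -> {
            String value = values.get(match.group(1));
            String replacement = value == null ? match.group() : sanitize(value);
            return Matcher.quoteReplacement(replacement);
        });
    }

    /** Assumption: runs of whitespace and filename-unsafe characters become one underscore. */
    static String sanitize(String value) {
        return value.trim().replaceAll("[\\\\/:*?\"<>|\\s]+", "_");
    }

    public static void main(String[] args) {
        System.out.println(render("{index}-{bookmark}.pdf",
            Map.of("index", "2", "bookmark", "Chapter 1: Intro"))); // 2-Chapter_1_Intro.pdf
    }
}
```

Substituting in a single pass avoids re-expanding placeholder-like text that happens to appear inside a bookmark title.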
Output:
{
"splitDocuments": ["<Document>", "<Document>", "..."],
"totalFiles": 5,
"originalPages": 50,
"splitMethod": "BY_BOOKMARK"
}

Splits a PDF into multiple files based on target file size. Pages are added iteratively until the size limit is approached.
Input:
- `document`: The PDF document to split
- `maxFileSizeMb`: Target maximum file size in megabytes (1-100 MB)
- `outputPattern`: Pattern for output filenames (use `{index}` for file number)
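The "add pages until the limit is approached" idea can be sketched as a greedy grouping. This is a simplified model, not the connector's algorithm: `SizeGroupingSketch`, `toBytes`, and `group` are hypothetical names, per-page sizes are taken as given inputs (a real implementation would more likely measure the serialized output), 1 MB is assumed to mean 1024 × 1024 bytes, and a single page larger than the limit is assumed to get a file of its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical greedy grouping for illustration; not the connector's implementation.
class SizeGroupingSketch {

    /** Converts the megabyte setting to bytes, enforcing the documented 1-100 MB range. */
    static long toBytes(int maxFileSizeMb) {
        if (maxFileSizeMb < 1 || maxFileSizeMb > 100) {
            throw new IllegalArgumentException("maxFileSizeMb must be between 1 and 100");
        }
        return maxFileSizeMb * 1024L * 1024L; // assumption: 1 MB = 1024 * 1024 bytes
    }

    /** Groups 1-indexed page numbers so each group's estimated size stays within maxBytes. */
    static List<List<Integer>> group(long[] pageSizesBytes, long maxBytes) {
        List<List<Integer>> files = new ArrayList<>();
        List<Integer> current = new ArrayList<>();
        long currentBytes = 0;
        for (int i = 0; i < pageSizesBytes.length; i++) {
            long size = pageSizesBytes[i];
            // Start a new file when adding this page would exceed the limit.
            if (!current.isEmpty() && currentBytes + size > maxBytes) {
                files.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(i + 1);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            files.add(current);
        }
        return files;
    }

    public static void main(String[] args) {
        long[] sizes = {400, 400, 400, 900, 100};
        System.out.println(group(sizes, 1000)); // [[1, 2], [3], [4, 5]]
    }
}
```

Because real PDF pages share fonts and other resources, summed per-page estimates only approximate the final file size, which is consistent with the limit being described as a target.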
Output:
{
"splitDocuments": ["<Document>", "<Document>", "..."],
"totalFiles": 8,
"originalPages": 50,
"splitMethod": "BY_SIZE"
}

| Code | Description |
|---|---|
| PDF_MERGE_ERROR | Failed to merge PDF documents |
| PDF_SPLIT_ERROR | Failed to split PDF document |
| NO_BOOKMARKS | PDF document does not contain bookmarks |
| INVALID_PAGE_RANGE | Invalid page range specification |
- Cause: The uploaded file is not a valid PDF or is corrupted
- Solution: Verify the file is a valid PDF format. Try opening it with a PDF reader to confirm it's not corrupted.
- Cause: Processing very large PDF files (>500 pages or >100MB)
- Solution:
  - Increase JVM heap size: `-Xmx2g` or higher
  - Consider splitting large operations into smaller batches
  - Use split-by-size operation to handle large documents in chunks
WARNING: Class file version 69 does not match expected version
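- Cause: JaCoCo 0.8.12 does not recognize the class file version it is instrumenting. Major version 69 corresponds to Java 25 (Java 21 class files are version 65), so the warning points to classes compiled by, or shipped with, a JDK newer than the project's Java 21 target.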
- Impact: Non-blocking - tests run successfully, coverage is calculated correctly
- Solution: These warnings can be safely ignored
- Cause: Element template not properly installed
- Solution:
  - Verify `element-templates/pdf-connector.json` exists
  - Copy to Modeler's element templates directory
  - Restart Camunda Modeler
  - Check Modeler logs for template loading errors
- Cause: JAR not properly deployed or SPI configuration missing
- Solution:
  - Verify JAR is in connector runtime classpath
  - Check `META-INF/services/io.camunda.connector.api.outbound.OutboundConnectorProvider` exists
  - Review connector runtime logs for loading errors
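The SPI check above can be automated with the JDK's `java.util.jar` API. The following sketch is one possible way to do it; the class `SpiRegistrationCheck` and its method `registeredProviders` are hypothetical names, and the JAR path passed on the command line is whatever artifact your build produced.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical diagnostic helper; not part of the connector's code base.
class SpiRegistrationCheck {

    static final String SERVICE_ENTRY =
        "META-INF/services/io.camunda.connector.api.outbound.OutboundConnectorProvider";

    /** Returns the provider class names registered in the JAR, or an empty list if none. */
    static List<String> registeredProviders(Path jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            JarEntry entry = jar.getJarEntry(SERVICE_ENTRY);
            if (entry == null) {
                return List.of();
            }
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(jar.getInputStream(entry), StandardCharsets.UTF_8))) {
                // Service files allow '#' comments and blank lines; drop both.
                return reader.lines()
                    .map(line -> line.replaceAll("#.*", "").trim())
                    .filter(line -> !line.isEmpty())
                    .toList();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("Usage: java SpiRegistrationCheck <path-to-connector-jar>");
            return;
        }
        System.out.println(registeredProviders(Path.of(args[0])));
    }
}
```

An empty result suggests the service file is missing from the packaged JAR; a non-empty result should list the provider class, for example `io.camunda.example.operations.PdfConnectorProvider`.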
- Merge Operations: Merging 50+ PDFs may take 10-30 seconds depending on file sizes
- Split Operations: Splitting 100+ pages typically completes in under 30 seconds
- Memory Usage: Each operation loads PDFs into memory; plan for ~2-3x PDF size in heap
- Concurrent Operations: Connector is stateless and supports concurrent execution
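The memory guidance above translates into a rough sizing calculation. This is back-of-the-envelope arithmetic only: `HeapEstimateSketch` and `estimateHeapBytes` are hypothetical names, and multiplying by the number of concurrent operations is an assumption that each running operation holds its own documents in memory at the same time.

```java
// Hypothetical sizing helper for illustration; not part of the connector.
class HeapEstimateSketch {

    /**
     * Estimates heap demand as pdfBytes * multiplier per operation,
     * scaled by the number of operations assumed to run at once.
     */
    static long estimateHeapBytes(long pdfBytes, double multiplier, int concurrentOperations) {
        if (pdfBytes < 0 || multiplier <= 0 || concurrentOperations < 1) {
            throw new IllegalArgumentException("Arguments must be positive");
        }
        long perOperation = Math.round(pdfBytes * multiplier);
        return Math.multiplyExact(perOperation, (long) concurrentOperations);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        // 100 MiB of PDF input, the upper 3x factor, four operations in parallel.
        long heap = estimateHeapBytes(100 * mib, 3.0, 4);
        System.out.println(heap / mib + " MiB"); // 1200 MiB
    }
}
```

The result covers PDF data only; the runtime itself needs additional headroom, so treat the figure as a lower bound when choosing `-Xmx`.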
- File Size Limits: Split-by-size operation enforces 1-100 MB limit per file
- Page Count Validation: Operations validate page numbers are within document bounds
- Content Validation: PDFBox performs format validation during loading
- No External Dependencies: All PDF operations are self-contained with no external API calls
To help you get started quickly, we've included ready-to-use BPMN process examples in the examples/ directory:
- PDF Merge.bpmn - Complete workflow demonstrating how to merge multiple PDFs
- PDF Split by Page.bpmn - Process showing how to split a PDF into individual pages
- PDF Upload.form - User form for uploading PDF files
- PDF Viewer.form - Form for viewing split PDF results
- PDF Merge Viewer.form - Form for previewing merged PDF documents
These examples can be imported directly into Camunda Web Modeler or Desktop Modeler and deployed to your Camunda cluster.
You can run the unit and integration tests by executing the following Maven command:

    mvn clean verify

You will need the following tools installed on your machine:
- Camunda Modeler, which is available in two variants:
  - Desktop Modeler for a local installation.
  - Web Modeler for an online experience.
- Docker, which is required to run the Camunda platform.
The Connectors Runtime requires a running Camunda platform to interact with. To set up a local Camunda environment, follow these steps:
- Clone the Camunda distributions repository from GitHub and navigate to the Camunda 8.8 docker-compose directory:

      git clone git@github.com:camunda/camunda-distributions.git
      cd camunda-distributions/docker-compose/versions/camunda-8.8

Note: This template is compatible with Camunda 8.8. Using other versions may lead to compatibility issues.
Either comment out the connectors service, or use the `--scale` flag to exclude it:

    docker compose -f docker-compose-core.yaml up --scale connectors=0

Add the `element-templates/pdf-connector.json` to your Modeler configuration as per
the Element Templates documentation.
- Run `io.camunda.example.classic.LocalConnectorRuntime` to start your connector.
- Open the Camunda Desktop Modeler and create a new BPMN diagram.
- Design a process that incorporates the PDF connector.
- Deploy the process to your local Camunda platform.
- Verify that the process is running smoothly by accessing Camunda Operate at localhost:8088/operate. Username and password are both `demo`.
To keep this repository safe when testing locally and contributing:
- `src/main/resources/application.properties` is ignored by git; use `application.properties.template` as a starting point and never commit real credentials.
- A pre-commit hook is provided to block accidental secret commits.

Enable the hook once per clone:

    git config core.hooksPath .githooks

You can temporarily bypass the hook (not recommended) with `git commit --no-verify` or by setting `SKIP_SECRET_CHECK=1` in your environment.
Additionally, a Gitleaks workflow runs on push/PR to scan for secrets using `.gitleaks.toml`. To run Gitleaks locally:

    # Install (one-time); see https://github.com/gitleaks/gitleaks/releases
    gitleaks detect --config .gitleaks.toml --redact --verbose

- In GitHub Actions, run `Secret Scan (Full History)` to scan the entire Git history.
- The run uploads a redacted JSON report artifact and opens/updates an issue summarizing findings.
- See `SECURITY.md` for reporting guidance.
- Docker and Docker Compose installed
- Camunda SaaS cluster with API credentials
- Built connector JAR (`mvn clean package`)
- Create configuration file from template:

      cp docker-compose.yml.template docker-compose.yml

- Update credentials in `docker-compose.yml`:

      environment:
        CAMUNDA_CLIENT_AUTH_CLIENT_ID: <your-client-id>
        CAMUNDA_CLIENT_AUTH_CLIENT_SECRET: <your-client-secret>
        CAMUNDA_CLIENT_CLOUD_CLUSTER_ID: <your-cluster-id>
        CAMUNDA_CLIENT_CLOUD_REGION: <your-region>
        # ... additional Zeebe configuration

- Build and run the connector:

      docker-compose up --build -d

- Monitor logs:

      docker-compose logs -f pdf-connector

- Stop the connector:

      docker-compose down

- Container fails with "Operation ID is missing": Ensure your BPMN service task has the element template applied with an operation selected
- Connection refused: Verify Camunda SaaS credentials and cluster region
- Out of memory: Large PDF processing may require increased Docker memory limits
To connect the connector to your Camunda SaaS cluster, you'll need API credentials:
- Navigate to Camunda SaaS Console.
- Create a cluster using Camunda 8.8 or later.
- Select your cluster, go to the `API` tab, and click `Create new Client`.
- Ensure the `Zeebe` scope is selected, then click `Create`.
- Copy the generated credentials (Client ID, Client Secret, Cluster ID, Region).
- Create configuration file from template:

      cp src/main/resources/application.properties.template src/main/resources/application.properties

- Update credentials in `application.properties`:

      camunda.client.mode=saas
      camunda.client.auth.client-id=<your-client-id>
      camunda.client.auth.client-secret=<your-client-secret>
      camunda.client.cloud.cluster-id=<your-cluster-id>
      camunda.client.cloud.region=<your-region>

- Run the connector:

      mvn spring-boot:run -Dspring-boot.run.main-class=io.camunda.example.classic.LocalConnectorRuntime

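Before starting the runtime, it can be useful to confirm that no template placeholder was left in `application.properties`. The sketch below is an optional pre-flight check, not something the connector ships: `SaasConfigCheck` and `unresolvedKeys` are hypothetical names, and treating any value of the form `<...>` as an unreplaced placeholder is an assumption based on the template shown above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical pre-flight check; not part of the connector's code base.
class SaasConfigCheck {

    static final List<String> REQUIRED_KEYS = List.of(
        "camunda.client.auth.client-id",
        "camunda.client.auth.client-secret",
        "camunda.client.cloud.cluster-id",
        "camunda.client.cloud.region");

    /** Returns required keys that are missing, blank, or still hold a <placeholder>. */
    static List<String> unresolvedKeys(Properties props) {
        List<String> problems = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key, "").trim();
            boolean placeholder = value.startsWith("<") && value.endsWith(">");
            if (value.isEmpty() || placeholder) {
                problems.add(key);
            }
        }
        return problems;
    }

    public static void main(String[] args) throws IOException {
        // Demo input; in practice, load the real file with Files.newBufferedReader.
        Properties props = new Properties();
        props.load(new StringReader("""
            camunda.client.mode=saas
            camunda.client.auth.client-id=abc123
            camunda.client.auth.client-secret=<your-client-secret>
            """));
        System.out.println(unresolvedKeys(props));
    }
}
```

For the demo input, the check reports the client secret (still a placeholder) plus the cluster ID and region (both missing).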
- Access Camunda Web Modeler.
- Create a new project or open an existing one.
- Click `Create new` → `Upload files`, and upload `element-templates/pdf-connector.json`.
- Publish the element template by clicking the Publish button.
- Create a new BPMN diagram in the same folder.
- Add a Service Task and apply the "PDF Merge & Split Connector" template.
- Configure the desired operation and parameters.
- Deploy and start your process.
Never commit `application.properties` or `docker-compose.yml` with credentials to version control. Use the provided `.template` files as a starting point.
The element template for this connector is generated automatically based on the connector input class using the Element Template Generator.
The generation is embedded in the Maven build and can be triggered by running mvn clean package.
The generated element template can be found in element-templates/pdf-connector.json.
Version 1.3.1
Changes:
- ✅ CI hardening - Fixed Gitleaks workflow inputs, corrected `.gitleaks.toml` quoting
- ✅ Coverage - Upload JaCoCo HTML report; enforcement moved to optional profile (`-Pcoverage-enforced`)
- ✅ Docker - base image aligned to `camunda/connectors:8.8.3`
- ✅ Docs - README polishing and workflow cleanup
Version 1.3.0
Changes:
- ✅ Comprehensive test coverage expansion - Test suite increased from 14 to 33 tests (135% increase)
- ✅ Error handling tests - Added 11 tests covering corrupted PDFs, invalid ranges, boundary conditions
- ✅ Performance validation - Added 8 load tests validating large file handling (up to 500 pages)
- ✅ Code coverage enforcement - JaCoCo plugin configured with 80% instruction and 75% branch coverage thresholds
- ✅ Integration test framework - Full Camunda runtime integration test created (disabled by default for CI speed)
- ✅ CI/CD improvements - GitHub Actions workflow added for automated build and test verification
- ✅ Community contribution guide - CONTRIBUTING.md added with development standards and guidelines
Version 1.2.0
Changes:
- ✅ Simplified merge operation (removed page size standardization and bookmark preservation)
- ✅ Reduced API complexity for better reliability
- ✅ Comprehensive test coverage added (14 unit tests)
- ✅ Version synchronized with GitHub releases
Version 1.1.0
New Features:
- ✅ Split PDFs by page, range, or bookmark
- ✅ Split PDFs by file size
- ✅ Merge multiple PDF files into a single document
Bug Fixes:
- ✅ Fixed content corruption when merging PDFs (now uses PDFMergerUtility)