Add --source_parallelism flag to run multiple input sources concurrently #1568
yuiseki wants to merge 2 commits into
Full logs: https://github.com/onthegomap/planetiler/actions/runs/26627315329
- BLOCKER (java:S2095): use try-with-resources for ExecutorService
- CODE_SMELL (java:S6885): replace Math.max(1, Math.min(...)) with explicit if-dispatch + plain Math.min, so the sequential vs parallel branch is easier to read

No behavior change.
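To make the two cleanups in this commit message concrete, here is a minimal sketch of what such a refactor might look like. It is not the PR's actual diff: the class and method names (`SourceDispatchSketch`, `stagesInFlight`, `runAll`) are hypothetical, and it assumes Java 19 or later, where `ExecutorService` implements `AutoCloseable`.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SourceDispatchSketch {

  /** Hypothetical helper: how many source stages to run at once (1 means sequential). */
  public static int stagesInFlight(int requested, int sourceCount) {
    if (requested <= 1 || sourceCount <= 1) {
      return 1; // sequential branch, spelled out instead of hidden inside Math.max(1, ...)
    }
    return Math.min(requested, sourceCount); // parallel branch: never more workers than sources
  }

  /** Runs every stage and returns the number of stages that were allowed in flight. */
  public static int runAll(List<Runnable> stages, int requested) {
    int inFlight = stagesInFlight(requested, stages.size());
    if (inFlight == 1) {
      stages.forEach(Runnable::run); // same thread, same order as before
      return inFlight;
    }
    // try-with-resources: close() shuts the pool down and waits for submitted tasks,
    // so the executor cannot leak if something throws in between.
    try (ExecutorService pool = Executors.newFixedThreadPool(inFlight)) {
      stages.forEach(pool::execute);
    }
    return inFlight;
  }
}
```

For every pair of inputs the explicit branches return the same value as `Math.max(1, Math.min(requested, sourceCount))`, which is consistent with the "No behavior change" note.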
Thanks for taking the time to make this change! Would it be possible to share full logs from the planetiler run before and after this change with your input data? In general, planetiler tries to keep parallelism minimal so that the configured number of threads spreads work across all available cores instead of just loading the machine up with threads, so I want to see exactly what is limiting the parallelism before deviating from that goal.
Thanks for the thoughtful question, and the point about keeping parallelism minimal makes sense to me. I put together a fully reproducible benchmark on open data so you can see exactly what limits the parallelism without needing my private input.

What limits the parallelism

Each source stage is fed by a single shapefile reader thread (shapefiles are not splittable, so reading a source is inherently single-threaded). When a schema has many small sources whose per-feature work is light, that single reader cannot produce features fast enough to keep […] In the run below, the 140 source stages average 3.5 active threads out of 31, and the source-reading phase as a whole runs at about 4.7 threads (15% of 31 cores). The total CPU time is essentially identical across […]

Reproducible open-data benchmark

7 Geofabrik free shapefile extracts, 20 layers each = 140 shapefile sources, 57,206,689 features, z11-14. 32-core host, JDK, NVMe, warm page cache. Regions: ireland-and-northern-ireland, connecticut, iceland, new-hampshire, rhode-island, luxembourg, vermont (all from https://download.geofabrik.de). Geometries normalized with […]

Output is byte-identical across all three ([…]

On the cosmetic caveat

As noted in the PR description, per-stage CPU numbers in the logs get mixed once stages overlap, because […]

Full before/after logs, schema, and the download script are all in the gist: https://gist.github.com/yuiseki/8679a16d3dc946a13d13408687cec900
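The cosmetic caveat can be illustrated with a toy model. The sketch below is not Planetiler's `Timers` class. `LifoStageTimerSketch` and all of its methods are hypothetical, and it assumes only what the thread states: CPU time is credited to a single "current" stage, tracked with LIFO nesting. Under that assumption, work done for an earlier stage is credited to whichever stage started last.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

public class LifoStageTimerSketch {
  private final Deque<String> open = new ArrayDeque<>(); // most recently started stage on top
  private final Map<String, Long> cpuMillis = new HashMap<>();

  public void start(String stage) {
    open.push(stage);
  }

  public void finish() {
    open.pop(); // LIFO assumption: the stage that finishes is the one started last
  }

  /** Credits CPU time to whichever stage is currently on top of the stack. */
  public void recordCpu(long millis) {
    String current = open.peek();
    if (current != null) {
      cpuMillis.merge(current, millis, Long::sum);
    }
  }

  public long cpuFor(String stage) {
    return cpuMillis.getOrDefault(stage, 0L);
  }

  /** Stages run one after another: attribution is correct. */
  public static LifoStageTimerSketch sequential() {
    LifoStageTimerSketch t = new LifoStageTimerSketch();
    t.start("roads");
    t.recordCpu(100);
    t.finish();
    t.start("water");
    t.recordCpu(40);
    t.finish();
    return t;
  }

  /** Stages overlap: the 100 ms of "roads" work is credited to "water". */
  public static LifoStageTimerSketch overlapping() {
    LifoStageTimerSketch t = new LifoStageTimerSketch();
    t.start("roads");
    t.start("water");
    t.recordCpu(100); // really roads work, but "water" is on top
    t.recordCpu(40);  // water work
    t.finish();
    t.finish();
    return t;
  }
}
```

In this toy model the total stays at 140 ms either way and only the split between stages changes, which matches the description of the issue as cosmetic.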



Summary
When a custom schema declares many small input sources whose per-feature work is light, Planetiler.run processes them sequentially and cannot saturate process_threads. On one such build, the pool runs at an average of 2.6 of 31 threads per source (~8% CPU utilization).

This PR adds an opt-in --source_parallelism=N flag (default 1). With N>1, up to N source stages run concurrently against a shared executor.

Impact
Purely additive. The default preserves the existing sequential behavior bit-for-bit. The output .pmtiles is MD5-identical across N=1, N=4, and N=8 on a real multi-source build I use locally.

Performance
NVMe, 32-core host, JDK 21:

[Table of results by --source_parallelism value; contents not preserved]

HDD, same workload: N=4 is 2.04x faster (14m 04s vs 6m 54s); N=8 regresses to 9m 15s under disk contention.

Notes
The per-stage thread CPU breakdown in stats.json gets mixed when stages overlap, since Timers.currentStage assumes LIFO nesting. Wall-clock time per stage, the output archive, and progress logs are unaffected. Happy to follow up with a fix in a separate PR if maintainers want it bundled.

Defaulting to N>1 and auto-tuning are out of scope here; open to either as follow-ups.

AI assistance
Per CONTRIBUTING.md#ai-assisted-contributions: drafted with Claude Code. I reviewed every line, ran spotless:check and the planetiler-core tests on JDK 21, and confirmed byte-identical output on a real build.
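As a rough illustration of the behavior described in the Summary ("up to N source stages run concurrently against a shared executor"), the sketch below bounds the number of in-flight stages with a semaphore while all stages share one executor. It is not the PR's implementation. `BoundedStagesSketch` and `runStages` are hypothetical names, and the choice of a semaphore is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedStagesSketch {

  /**
   * Runs all stages on the shared executor with at most sourceParallelism in flight.
   * Returns the peak number of stages observed running at the same time.
   */
  public static int runStages(List<Runnable> stages, int sourceParallelism, Executor shared) {
    Semaphore permits = new Semaphore(Math.max(1, sourceParallelism));
    AtomicInteger active = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    List<CompletableFuture<Void>> inFlight = new ArrayList<>();

    for (Runnable stage : stages) {
      permits.acquireUninterruptibly(); // blocks while N stages are already running
      inFlight.add(CompletableFuture.runAsync(() -> {
        try {
          peak.accumulateAndGet(active.incrementAndGet(), Math::max);
          stage.run();
        } finally {
          active.decrementAndGet(); // decrement before releasing so peak never exceeds N
          permits.release();
        }
      }, shared));
    }
    inFlight.forEach(CompletableFuture::join); // wait for the tail of the queue
    return peak.get();
  }
}
```

With `sourceParallelism` set to 1, each stage finishes before the next is submitted, so stages run one at a time in their original order. The shared executor needs at least N threads for N stages to actually overlap.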