Add --source_parallelism flag to run multiple input sources concurrently #1568

Open
yuiseki wants to merge 2 commits into onthegomap:main from yuiseki:add-source-parallelism-flag

yuiseki commented May 29, 2026

Summary

When a custom schema declares many small input sources whose per-feature work is light, Planetiler.run processes them sequentially and cannot saturate process_threads. On one such build, the pool averages 2.6 active threads out of 31 per source stage (~8% CPU utilization).

This PR adds an opt-in --source_parallelism=N flag (default 1). With N>1, up to N source stages run concurrently against a shared executor.
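
As a rough illustration of the opt-in behavior described above, the sketch below runs a list of stages one at a time when N is 1 and on a pool of at most N workers otherwise. It is not the code in this PR: `SourceParallelismSketch`, `runStages`, and the choice of `Executors.newFixedThreadPool` are assumptions made for the example, and failures inside a stage are not propagated here.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; class and method names are not taken from the PR.
class SourceParallelismSketch {

  // Runs every stage and returns the highest number of stages observed running at once.
  static int runStages(List<Runnable> stages, int parallelism) {
    AtomicInteger active = new AtomicInteger();
    AtomicInteger peak = new AtomicInteger();
    if (parallelism <= 1 || stages.size() <= 1) {
      // Default path: one stage at a time, in declaration order.
      for (Runnable stage : stages) {
        track(stage, active, peak);
      }
      return peak.get();
    }
    int workers = Math.min(parallelism, stages.size());
    // ExecutorService is AutoCloseable since JDK 19; close() waits for submitted tasks.
    // A real implementation would also keep the Futures and rethrow stage failures.
    try (ExecutorService pool = Executors.newFixedThreadPool(workers)) {
      for (Runnable stage : stages) {
        pool.submit(() -> track(stage, active, peak));
      }
    }
    return peak.get();
  }

  // Wraps a stage so the number of concurrently running stages can be observed.
  private static void track(Runnable stage, AtomicInteger active, AtomicInteger peak) {
    int running = active.incrementAndGet();
    peak.accumulateAndGet(running, Math::max);
    try {
      stage.run();
    } finally {
      active.decrementAndGet();
    }
  }
}
```

With `parallelism` set to 1 the returned peak is always 1, which mirrors the claim that the default keeps the sequential behavior.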

Impact

Purely additive. Default preserves the existing sequential behavior bit-for-bit. Output .pmtiles is MD5-identical across N=1, N=4, and N=8 on a real multi-source build I use locally.

Performance

NVMe, 32-core host, JDK 21:

| --source_parallelism | wall-clock | speedup |
|---|---|---|
| 1 (default) | 10m 43s | 1.00x |
| 4 | 4m 01s | 2.67x |
| 8 | 3m 52s | 2.77x |

HDD, same workload: N=4 is 2.04x (14m 04s vs 6m 54s); N=8 regresses to 9m 15s under disk contention.
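
For readers who want to recheck the ratios, the speedup is simply baseline wall-clock divided by the wall-clock at a given N. The helper below is a minimal sketch of that arithmetic; `SpeedupSketch` and its methods are names invented for this example, and the inputs are the times quoted above.

```java
import java.time.Duration;

// Hypothetical helper for rechecking the reported speedups.
class SpeedupSketch {

  // Converts "10m 43s" style text into a Duration via the ISO-8601 form "PT10M43S".
  static Duration parse(String text) {
    return Duration.parse("PT" + text.replace(" ", "").toUpperCase());
  }

  // Speedup = baseline wall-clock / candidate wall-clock.
  static double speedup(String baseline, String candidate) {
    return (double) parse(baseline).toSeconds() / parse(candidate).toSeconds();
  }

  public static void main(String[] args) {
    System.out.println("NVMe N=4: " + speedup("10m 43s", "4m 01s")); // reported as 2.67x
    System.out.println("NVMe N=8: " + speedup("10m 43s", "3m 52s")); // reported as 2.77x
    System.out.println("HDD  N=4: " + speedup("14m 04s", "6m 54s")); // reported as 2.04x
    System.out.println("HDD  N=8: " + speedup("14m 04s", "9m 15s")); // slower than N=4
  }
}
```

By the same arithmetic, the HDD run at N=8 (9m 15s) is still about 1.5x faster than the 14m 04s baseline; the regression is relative to N=4.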

Notes

Per-stage thread CPU breakdown in stats.json gets mixed when stages overlap, since Timers.currentStage assumes LIFO nesting. Wall-clock per stage, output archive, and progress logs are unaffected. Happy to follow up with a fix in a separate PR if maintainers want it bundled.
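
To make the LIFO caveat concrete, here is a toy model of "current stage" bookkeeping that charges CPU time to whichever stage was started most recently. It is a hypothetical illustration only: `LifoStageSketch` and its methods are invented for this example and do not reproduce how Planetiler's `Timers` class is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of LIFO stage tracking; not Planetiler's Timers class.
class LifoStageSketch {
  private final Deque<String> open = new ArrayDeque<>();
  private final Map<String, Long> cpuMillis = new HashMap<>();

  // Marks a stage as started; it becomes the "current" stage.
  void start(String stage) {
    open.push(stage);
    cpuMillis.putIfAbsent(stage, 0L);
  }

  // Charges CPU time to the most recently started stage, whoever did the work.
  void charge(long millis) {
    String current = open.peek();
    if (current != null) {
      cpuMillis.merge(current, millis, Long::sum);
    }
  }

  // Assumes the most recently started stage is the one finishing.
  void finish() {
    open.pop();
  }

  long charged(String stage) {
    return cpuMillis.getOrDefault(stage, 0L);
  }
}
```

If a stage named `roads` starts, then `buildings` starts while `roads` is still running, any CPU time charged afterwards lands on `buildings` even when the `roads` workers did the work. That is the kind of mixing described above, and it does not occur when stages strictly nest or run one after another.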

Defaulting N>1 and auto-tuning are out of scope here; open to either as follow-ups.

AI assistance

Per CONTRIBUTING.md#ai-assisted-contributions: drafted with Claude Code. I reviewed every line, ran spotless:check and planetiler-core tests on JDK 21, and confirmed byte-identical output on a real build.

github-actions Bot commented May 29, 2026

Logs: This Branch (aa8c283) first, then Base (f91cc19)
0:01:11 DEB [archive] - Tile stats:
0:01:11 DEB [archive] - Biggest tiles (gzipped)
1. 14/4942/6092 (162k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.40015 (poi:88k)
2. 9/154/190 (149k) https://onthegomap.github.io/planetiler-demo/#9.5/41.77078/-71.36719 (landcover:86k)
3. 10/308/381 (138k) https://onthegomap.github.io/planetiler-demo/#10.5/41.63994/-71.54297 (landcover:72k)
4. 10/308/380 (137k) https://onthegomap.github.io/planetiler-demo/#10.5/41.90214/-71.54297 (landcover:66k)
5. 14/4941/6092 (121k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.42212 (poi:69k)
6. 14/4941/6093 (118k) https://onthegomap.github.io/planetiler-demo/#14.5/41.81227/-71.42212 (poi:62k)
7. 14/4946/6113 (112k) https://onthegomap.github.io/planetiler-demo/#14.5/41.48389/-71.31226 (building:59k)
8. 14/4946/6112 (111k) https://onthegomap.github.io/planetiler-demo/#14.5/41.50035/-71.31226 (building:67k)
9. 14/4940/6092 (102k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.44409 (building:92k)
10. 14/4942/6091 (101k) https://onthegomap.github.io/planetiler-demo/#14.5/41.84501/-71.40015 (building:79k)
0:01:11 DEB [archive] - Max tile sizes
                      z0    z1    z2    z3    z4    z5    z6    z7    z8    z9   z10   z11   z12   z13   z14   all
           boundary  151   336   409   544   802   287   396   490   670  1.6k    2k  6.9k  6.2k  5.6k  4.4k  6.9k
              water 7.7k  3.7k  8.6k  5.5k  2.6k  5.1k   15k   18k   16k   26k   15k   13k   17k   15k   12k   26k
              place    0     0   487   487   487   773   862  1.1k  1.8k  3.3k  6.2k  3.9k    2k   966    1k  6.2k
            landuse    0     0     0     0   549   695  1.6k  6.9k   18k   44k   58k   49k   38k   19k   12k   58k
     transportation    0     0     0     0   355    1k  1.5k  4.6k  6.4k   21k   15k   17k   67k   38k   38k   67k
           waterway    0     0     0     0   112   119     0     0     0  3.3k  2.4k  2.1k  2.1k  4.9k  2.4k  4.9k
               park    0     0     0     0     0     0  1.1k    4k  9.7k   19k   13k  8.2k  3.7k  3.4k  4.4k   19k
transportation_name    0     0     0     0     0     0   293   360  1.1k  1.9k  5.8k  4.8k    4k  3.5k   18k   18k
          landcover    0     0     0     0     0     0     0  9.6k   29k   86k   72k   82k   53k   30k   26k   86k
      mountain_peak    0     0     0     0     0     0     0  1.1k  1.8k  3.4k  4.4k  2.8k  1.4k  1.4k   869  4.4k
         water_name    0     0     0     0     0     0     0     0     0   528   503   475   494  1.2k  1.5k  1.5k
    aerodrome_label    0     0     0     0     0     0     0     0     0     0   666   289   273   221   221   666
            aeroway    0     0     0     0     0     0     0     0     0     0  1.6k    2k    3k  3.3k  2.8k  3.3k
                poi    0     0     0     0     0     0     0     0     0     0     0     0   589   586   88k   88k
           building    0     0     0     0     0     0     0     0     0     0     0     0     0   59k   92k   92k
        housenumber    0     0     0     0     0     0     0     0     0     0     0     0     0     0   35k   35k
          full tile 7.9k    4k  9.5k  6.5k  3.7k  6.4k   21k   41k   85k  203k  185k  135k  114k  120k  255k  255k
            gzipped 6.2k  3.5k  7.1k  5.2k  3.1k    5k   14k   29k   61k  149k  138k   99k   84k   85k  162k  162k
0:01:11 DEB [archive] -    Max tile: 255k (gzipped: 162k)
0:01:11 DEB [archive] -    Avg tile: 5.5k (gzipped: 4.1k) using weighted average based on OSM traffic
0:01:11 DEB [archive] -     # tiles: 4,115,030
0:01:11 DEB [archive] -  # features: 5,779,817
0:01:11 INF [archive] - Finished in 20s cpu:1m13s avg:3.7
0:01:11 INF [archive] -   read    1x(3% 0.6s wait:18s done:1s)
0:01:11 INF [archive] -   encode  4x(54% 11s wait:2s done:1s)
0:01:11 INF [archive] -   write   1x(18% 4s wait:14s done:1s)
0:01:11 INF [archive] - Finished in 1m12s cpu:3m37s gc:1s avg:3
0:01:11 INF [archive] - FINISHED!
0:01:11 INF [archive] - 
0:01:11 INF [archive] - ----------------------------------------
0:01:11 INF [archive] - data errors:
0:01:11 INF [archive] - 	render_snap_fix_input	16,800
0:01:11 INF [archive] - 	osm_multipolygon_missing_way	377
0:01:11 INF [archive] - 	osm_boundary_missing_way	55
0:01:11 INF [archive] - 	merge_snap_fix_input	9
0:01:11 INF [archive] - 	osm_multipolygon_duplicate_member	4
0:01:11 INF [archive] - 	omt_fix_water_before_ne_intersect	2
0:01:11 INF [archive] - 	feature_polygon_osm_invalid_multipolygon_empty_after_fix	2
0:01:11 INF [archive] - 	render_snap_fix_input2	1
0:01:11 INF [archive] - 	omt_park_area_osm_invalid_multipolygon_empty_after_fix	1
0:01:11 INF [archive] - ----------------------------------------
0:01:11 INF [archive] - 	overall          1m12s cpu:3m37s gc:1s avg:3
0:01:11 INF [archive] - 	lake_centerlines 3s cpu:6s avg:2
0:01:11 INF [archive] - 	  read     1x(17% 0.5s done:3s)
0:01:11 INF [archive] - 	  process  4x(0% 0s done:2s)
0:01:11 INF [archive] - 	  write    1x(0% 0s done:2s)
0:01:11 INF [archive] - 	water_polygons   16s cpu:40s avg:2.5
0:01:11 INF [archive] - 	  read     1x(43% 7s done:8s)
0:01:11 INF [archive] - 	  process  4x(21% 3s wait:5s done:6s)
0:01:11 INF [archive] - 	  write    1x(3% 0.5s wait:10s done:6s)
0:01:11 INF [archive] - 	natural_earth    11s cpu:19s avg:1.6
0:01:11 INF [archive] - 	  read     1x(55% 6s done:5s)
0:01:11 INF [archive] - 	  process  4x(7% 0.8s wait:6s done:5s)
0:01:11 INF [archive] - 	  write    1x(0% 0s wait:6s done:5s)
0:01:11 INF [archive] - 	osm_pass1        2s cpu:7s avg:3.3
0:01:11 INF [archive] - 	  read     1x(2% 0s wait:2s)
0:01:11 INF [archive] - 	  parse    4x(37% 0.8s)
0:01:11 INF [archive] - 	  process  1x(65% 1s)
0:01:11 INF [archive] - 	osm_pass2        17s cpu:1m6s avg:3.9
0:01:11 INF [archive] - 	  read     1x(0% 0s wait:10s done:7s)
0:01:11 INF [archive] - 	  process  4x(69% 11s)
0:01:11 INF [archive] - 	  write    1x(3% 0.6s wait:16s)
0:01:11 INF [archive] - 	ne_lakes         0s cpu:0s avg:0
0:01:11 INF [archive] - 	boundaries       0s cpu:0s avg:0
0:01:11 INF [archive] - 	agg_stop         0s cpu:0s avg:0
0:01:11 INF [archive] - 	sort             1s cpu:4s avg:2.4
0:01:11 INF [archive] - 	  worker  1x(53% 0.8s)
0:01:11 INF [archive] - 	archive          20s cpu:1m13s avg:3.7
0:01:11 INF [archive] - 	  read    1x(3% 0.6s wait:18s done:1s)
0:01:11 INF [archive] - 	  encode  4x(54% 11s wait:2s done:1s)
0:01:11 INF [archive] - 	  write   1x(18% 4s wait:14s done:1s)
0:01:11 INF [archive] - ----------------------------------------
0:01:11 INF [archive] - 	archive	109MB
0:01:11 INF [archive] - 	features	298MB
-rw-r--r-- 1 runner runner 87M May 29 08:41 run.jar
0:01:04 DEB [archive] - Tile stats:
0:01:04 DEB [archive] - Biggest tiles (gzipped)
1. 14/4942/6092 (162k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.40015 (poi:88k)
2. 9/154/190 (149k) https://onthegomap.github.io/planetiler-demo/#9.5/41.77078/-71.36719 (landcover:86k)
3. 10/308/381 (138k) https://onthegomap.github.io/planetiler-demo/#10.5/41.63994/-71.54297 (landcover:72k)
4. 10/308/380 (137k) https://onthegomap.github.io/planetiler-demo/#10.5/41.90214/-71.54297 (landcover:66k)
5. 14/4941/6092 (121k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.42212 (poi:69k)
6. 14/4941/6093 (118k) https://onthegomap.github.io/planetiler-demo/#14.5/41.81227/-71.42212 (poi:62k)
7. 14/4946/6113 (112k) https://onthegomap.github.io/planetiler-demo/#14.5/41.48389/-71.31226 (building:59k)
8. 14/4946/6112 (111k) https://onthegomap.github.io/planetiler-demo/#14.5/41.50035/-71.31226 (building:67k)
9. 14/4940/6092 (102k) https://onthegomap.github.io/planetiler-demo/#14.5/41.82864/-71.44409 (building:92k)
10. 14/4942/6091 (101k) https://onthegomap.github.io/planetiler-demo/#14.5/41.84501/-71.40015 (building:79k)
0:01:04 DEB [archive] - Max tile sizes
                      z0    z1    z2    z3    z4    z5    z6    z7    z8    z9   z10   z11   z12   z13   z14   all
           boundary  151   336   409   544   802   287   396   490   670  1.6k    2k  6.9k  6.2k  5.6k  4.4k  6.9k
              water 7.7k  3.7k  8.6k  5.5k  2.6k  5.1k   15k   18k   16k   26k   15k   13k   17k   15k   12k   26k
              place    0     0   487   487   487   773   862  1.1k  1.8k  3.3k  6.2k  3.9k    2k   966    1k  6.2k
            landuse    0     0     0     0   549   695  1.6k  6.9k   18k   44k   58k   49k   38k   19k   12k   58k
     transportation    0     0     0     0   355    1k  1.5k  4.6k  6.4k   21k   15k   17k   67k   38k   38k   67k
           waterway    0     0     0     0   112   119     0     0     0  3.3k  2.4k  2.1k  2.1k  4.9k  2.4k  4.9k
               park    0     0     0     0     0     0  1.1k    4k  9.7k   19k   13k  8.2k  3.7k  3.4k  4.4k   19k
transportation_name    0     0     0     0     0     0   293   360  1.1k  1.9k  5.8k  4.8k    4k  3.5k   18k   18k
          landcover    0     0     0     0     0     0     0  9.6k   29k   86k   72k   82k   53k   30k   26k   86k
      mountain_peak    0     0     0     0     0     0     0  1.1k  1.8k  3.4k  4.4k  2.8k  1.4k  1.4k   869  4.4k
         water_name    0     0     0     0     0     0     0     0     0   528   503   475   494  1.2k  1.5k  1.5k
    aerodrome_label    0     0     0     0     0     0     0     0     0     0   666   289   273   221   221   666
            aeroway    0     0     0     0     0     0     0     0     0     0  1.6k    2k    3k  3.3k  2.8k  3.3k
                poi    0     0     0     0     0     0     0     0     0     0     0     0   589   586   88k   88k
           building    0     0     0     0     0     0     0     0     0     0     0     0     0   59k   92k   92k
        housenumber    0     0     0     0     0     0     0     0     0     0     0     0     0     0   35k   35k
          full tile 7.9k    4k  9.5k  6.5k  3.7k  6.4k   21k   41k   85k  203k  185k  135k  114k  120k  255k  255k
            gzipped 6.2k  3.5k  7.1k  5.2k  3.1k    5k   14k   29k   61k  149k  138k   99k   84k   85k  162k  162k
0:01:04 DEB [archive] -    Max tile: 255k (gzipped: 162k)
0:01:04 DEB [archive] -    Avg tile: 5.5k (gzipped: 4.1k) using weighted average based on OSM traffic
0:01:04 DEB [archive] -     # tiles: 4,115,030
0:01:04 DEB [archive] -  # features: 5,779,817
0:01:04 INF [archive] - Finished in 19s cpu:1m11s avg:3.7
0:01:04 INF [archive] -   read    1x(3% 0.6s wait:18s done:1s)
0:01:04 INF [archive] -   encode  4x(55% 10s wait:2s done:1s)
0:01:04 INF [archive] -   write   1x(19% 4s wait:14s)
0:01:04 INF [archive] - Finished in 1m5s cpu:3m25s gc:1s avg:3.2
0:01:04 INF [archive] - FINISHED!
0:01:04 INF [archive] - 
0:01:04 INF [archive] - ----------------------------------------
0:01:04 INF [archive] - data errors:
0:01:04 INF [archive] - 	render_snap_fix_input	16,800
0:01:04 INF [archive] - 	osm_multipolygon_missing_way	377
0:01:04 INF [archive] - 	osm_boundary_missing_way	55
0:01:04 INF [archive] - 	merge_snap_fix_input	9
0:01:04 INF [archive] - 	osm_multipolygon_duplicate_member	4
0:01:04 INF [archive] - 	omt_fix_water_before_ne_intersect	2
0:01:04 INF [archive] - 	feature_polygon_osm_invalid_multipolygon_empty_after_fix	2
0:01:04 INF [archive] - 	render_snap_fix_input2	1
0:01:04 INF [archive] - 	omt_park_area_osm_invalid_multipolygon_empty_after_fix	1
0:01:04 INF [archive] - ----------------------------------------
0:01:04 INF [archive] - 	overall          1m5s cpu:3m25s gc:1s avg:3.2
0:01:04 INF [archive] - 	lake_centerlines 2s cpu:5s avg:2.3
0:01:04 INF [archive] - 	  read     1x(23% 0.5s done:2s)
0:01:04 INF [archive] - 	  process  4x(0% 0s done:2s)
0:01:04 INF [archive] - 	  write    1x(0% 0s done:2s)
0:01:04 INF [archive] - 	water_polygons   16s cpu:40s avg:2.5
0:01:04 INF [archive] - 	  read     1x(43% 7s done:8s)
0:01:04 INF [archive] - 	  process  4x(21% 3s wait:5s done:6s)
0:01:04 INF [archive] - 	  write    1x(3% 0.4s wait:10s done:6s)
0:01:04 INF [archive] - 	natural_earth    6s cpu:13s avg:2
0:01:04 INF [archive] - 	  read     1x(95% 6s)
0:01:04 INF [archive] - 	  process  4x(13% 0.8s wait:6s)
0:01:04 INF [archive] - 	  write    1x(0% 0s wait:6s)
0:01:04 INF [archive] - 	osm_pass1        2s cpu:8s avg:3.3
0:01:04 INF [archive] - 	  read     1x(2% 0s wait:2s)
0:01:04 INF [archive] - 	  parse    4x(35% 0.8s wait:1s)
0:01:04 INF [archive] - 	  process  1x(68% 2s)
0:01:04 INF [archive] - 	osm_pass2        16s cpu:1m3s avg:4
0:01:04 INF [archive] - 	  read     1x(0% 0s wait:10s done:6s)
0:01:04 INF [archive] - 	  process  4x(68% 11s)
0:01:04 INF [archive] - 	  write    1x(3% 0.5s wait:15s)
0:01:04 INF [archive] - 	ne_lakes         0s cpu:0s avg:12.4
0:01:04 INF [archive] - 	boundaries       0s cpu:0s avg:0
0:01:04 INF [archive] - 	agg_stop         0s cpu:0s avg:0
0:01:04 INF [archive] - 	sort             1s cpu:4s avg:2.5
0:01:04 INF [archive] - 	  worker  1x(52% 0.8s)
0:01:04 INF [archive] - 	archive          19s cpu:1m11s avg:3.7
0:01:04 INF [archive] - 	  read    1x(3% 0.6s wait:18s done:1s)
0:01:04 INF [archive] - 	  encode  4x(55% 10s wait:2s done:1s)
0:01:04 INF [archive] - 	  write   1x(19% 4s wait:14s)
0:01:04 INF [archive] - ----------------------------------------
0:01:04 INF [archive] - 	archive	109MB
0:01:04 INF [archive] - 	features	298MB
-rw-r--r-- 1 runner runner 87M May 29 08:42 run.jar

Full logs: https://github.com/onthegomap/planetiler/actions/runs/26627315329

- BLOCKER (java:S2095): use try-with-resources for ExecutorService
- CODE_SMELL (java:S6885): replace Math.max(1, Math.min(...)) with
  explicit if-dispatch + plain Math.min, so the sequential vs parallel
  branch is easier to read

No behavior change.
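
The second bullet in the commit message can be pictured as follows. The arguments to `Math.min(...)` are elided in the message, so this sketch assumes they are the requested parallelism and the number of sources; `WorkerCountSketch`, `clamped`, and `dispatched` are names invented for the example.

```java
// Hypothetical before/after shapes for the worker-count computation.
class WorkerCountSketch {

  // Nested form: one expression covers both the sequential and parallel cases.
  static int clamped(int requested, int sourceCount) {
    return Math.max(1, Math.min(requested, sourceCount));
  }

  // Explicit dispatch: the sequential case is visible as its own branch.
  static int dispatched(int requested, int sourceCount) {
    if (requested <= 1 || sourceCount <= 1) {
      return 1; // sequential path
    }
    return Math.min(requested, sourceCount); // parallel path
  }
}
```

Under those assumptions the two forms return the same value for every input, which is consistent with the "No behavior change" note.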

msbarry (Contributor) commented Jun 3, 2026

Thanks for taking the time to make this change! Would it be possible to share full logs from the planetiler run before and after this change with your input data? In general, planetiler tries to keep parallelism minimal so that the configured thread count spreads work across all available cores instead of just loading the machine up with threads, so I want to see exactly what is limiting the parallelism before deviating from that goal.

yuiseki (Author) commented Jun 3, 2026

Thanks for the thoughtful question, and the point about keeping parallelism minimal makes sense to me. I put together a fully reproducible benchmark on open data so you can see exactly what limits the parallelism without needing my private input.

What limits the parallelism

Each source stage is fed by a single shapefile reader thread (shapefiles are not splittable, so reading a source is inherently single-threaded). When a schema has many small sources whose per-feature work is light, that single reader cannot produce features fast enough to keep process_threads busy, so most of the pool sits idle for that stage. Because sources run one after another, the entire source-reading phase stays at low utilization no matter how high process_threads is.

In the run below, the 140 source stages average 3.5 active threads out of 31, and the source-reading phase as a whole runs at about 4.7 threads (15% of the 31 available). The total CPU time is essentially identical across N=1/4/8 (~12m), so this is not extra work; it is the same work spread over otherwise idle cores. --source_parallelism overlaps several single-threaded readers so the pool fills up.
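
The thread figures quoted here can be rechecked with simple arithmetic, assuming the average thread count is CPU time divided by wall-clock time. That reading is an assumption, though it agrees with the bot log above (for example `20s cpu:1m13s avg:3.7`). `ThreadUtilizationSketch` is a name invented for this example.

```java
// Hypothetical helper; assumes avg threads = CPU time / wall-clock time.
class ThreadUtilizationSketch {

  static double averageThreads(double cpuSeconds, double wallSeconds) {
    return cpuSeconds / wallSeconds;
  }

  static double utilizationPercent(double averageThreads, int availableThreads) {
    return 100.0 * averageThreads / availableThreads;
  }

  public static void main(String[] args) {
    // ~12m of CPU over a 1m45s run at N=1.
    System.out.println(averageThreads(12 * 60, 105));
    // 4.7 busy threads out of 31 during the source-reading phase.
    System.out.println(utilizationPercent(4.7, 31));
  }
}
```

With roughly 12 minutes of CPU time, a 1m45s run works out to about 6.9 threads and a 34s run to about 21, which is close to the overall averages in the benchmark table; the small gap is expected because the CPU total is only approximate.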

Reproducible open-data benchmark

7 Geofabrik free shapefile extracts, 20 layers each = 140 shapefile sources, 57,206,689 features, z11-14. 32-core host, JDK, NVMe, warm page cache.

Regions: ireland-and-northern-ireland, connecticut, iceland, new-hampshire, rhode-island, luxembourg, vermont (all from https://download.geofabrik.de). Geometries normalized with ogr2ogr so geotools accepts a handful of degenerate OSM polygons. Schema, download script, and the full before/after logs are in this gist: https://gist.github.com/yuiseki/8679a16d3dc946a13d13408687cec900

java -jar planetiler.jar generate-custom --schema=schema.yml --output=out.pmtiles [--source_parallelism=N]

| --source_parallelism | total wall | source-read phase | overall avg threads (of 31) | speedup |
|---|---|---|---|---|
| 1 (default) | 1m45s | 94s | 6.9 | 1.00x |
| 4 | 34s | 23s | 21.4 | 3.06x |
| 8 | 42s | 31s | 18.1 | 2.49x |

sort (~4s) and archive (~6s) are unchanged between runs since they are already parallel, so the whole difference is in the source-reading phase. N=4 is the sweet spot here and N=8 regresses, which is why the flag is opt-in with a default of 1 rather than auto-tuned.

Output is byte-identical across all three (md5 27c5acd626bcd30e90676c0838fe298e, same 57,206,689 features), so nothing about the result changes, only how long it takes to produce.
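
For anyone repeating the byte-identity check without an `md5sum` binary, a minimal Java sketch is below. `ArchiveDigestSketch` and its methods are invented for this example, and the file names in the usage note are placeholders.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper for comparing output archives by MD5.
class ArchiveDigestSketch {

  // Streams the input through MD5 and returns the lowercase hex digest.
  static String md5Hex(InputStream in) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      byte[] buffer = new byte[1 << 16];
      int read;
      while ((read = in.read(buffer)) != -1) {
        md5.update(buffer, 0, read);
      }
      return HexFormat.of().formatHex(md5.digest());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 should be available on every Java platform", e);
    }
  }

  static String md5Hex(Path file) {
    try (InputStream in = Files.newInputStream(file)) {
      return md5Hex(in);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // True when both files hash to the same MD5 digest.
  static boolean sameBytes(Path a, Path b) {
    return md5Hex(a).equals(md5Hex(b));
  }
}
```

A call such as `ArchiveDigestSketch.sameBytes(Path.of("out-n1.pmtiles"), Path.of("out-n4.pmtiles"))` would then report whether two runs produced the same archive.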

On the cosmetic caveat

As noted in the PR description, per-stage CPU numbers in the logs get mixed once stages overlap, because Timers.currentStage assumes LIFO nesting. The per-stage cpu:/avg: fields under N>1 are therefore inflated and should be ignored; total wall, the overall summary line, output archive, and progress logs are all correct. Happy to fix that in a separate PR if you want it bundled.

Full before/after logs, schema, and the download script are all in the gist: https://gist.github.com/yuiseki/8679a16d3dc946a13d13408687cec900
