feat(engine/sink)!: require relative paths in sink output#2123
Open
bnimit wants to merge 22 commits into
Open
Conversation
Pre-implementation TDD: write the test suite for the new from_path semantics (relative-only input joined against ctx.sandbox_root). All 16 new tests fail against the current chokepoint that still accepts absolute URIs. Task 3 will rewrite from_path to make them pass.
BREAKING: SinkOutput::from_path now accepts only relative paths and joins them internally against ctx.sandbox_root. Absolute URIs (any 'scheme://...' form), leading '/' / '~', empty strings, whitespace padding, literal '.' / '..', and paths that resolve to the artifact directory itself are rejected with explicit error messages. The URI-scheme rejection mentions workerArtifactPath by name so customers upgrading can find the migration path from logs. FeatureWriter is also updated to use SinkOutput::from_path so it inherits the new validation chain (it previously used Uri::from_str directly which would silently join relative paths against CWD). gs:// and ram:// root tests use a lenient match: if the default StorageResolver lacks those backends the test accepts a "failed to resolve storage" error (storage step), but any earlier validation or join failure is still a test failure. join_with_dotdot_now_rejected_by_sandbox is updated to use '../..' (two levels) so the traversal genuinely escapes sandbox_root. This commit alone fails integration tests because existing fixtures produce absolute URIs. The next commit migrates the 86 fixtures. Part of #2117
The .local-specs/ directory holds local-only design + planning artifacts that should not be tracked in git.
Strip the Url(env["workerArtifactPath"]) / prefix from sink output
expressions across all affected fixture files. Also strip the equivalent
file::join_path(env.get("workerArtifactPath"), x) wrapper, including
single-line complex expressions, multi-line let-variable patterns, and
YAML single-quoted value: forms. Literal-string outputs are promoted
from flowExpr type to string type where applicable. Empty
'with: { workerArtifactPath: null }' and bare 'workerArtifactPath:'
workflow declarations are removed.
Also update test harnesses to use Runner::run_with_sandbox_root so
that relative sink paths resolve against the tempdir rather than
the read-only filesystem root (file:///):
- engine/runtime/tests/helper.rs
- engine/runtime/tests/logging_test/logging_helper.rs
- engine/runtime/tests/testcase/geometry/neighbor_finder.rs
- engine/testing/workflow-tests/src/main.rs
Fix the logging/08 fixture's expected line number after the
workerArtifactPath: null removal shifted the YAML line count by one.
Fix tests/fixture/workflow/file/writer/geopackage.yaml: hardcoded
ram:///output_test.gpkg absolute URI → relative output_test.gpkg.
Functionally equivalent — the engine joins the relative path against
ctx.sandbox_root, which the worker/CLI already populates from the same
workerArtifactPath value.
Part of #2117.
- relative_output_writes_under_sandbox_root: confirms a workflow with
output:{ type:string, value:'foo' } writes the file at
<sandbox_root>/foo end-to-end.
- absolute_output_fails_with_migration_hint: confirms a workflow using
the legacy Url(env["workerArtifactPath"])/x pattern fails at the
chokepoint with an error that names workerArtifactPath (the search
anchor customers will hit in logs).
Part of #2117.
Reflect the new strict-relative API: workflow authors provide a relative path; the engine joins against ctx.sandbox_root. The same YAML is portable across storage backends (file, gs, ram, s3) because the sandbox_root scheme is decided by the worker/CLI, not the workflow. Part of #2117.
Documents the breaking change in sink output parameters and the
migration path from the legacy Url(env["workerArtifactPath"])/x and
file::join_path(env.get("workerArtifactPath"), x) forms.
Part of #2117.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR introduces a breaking change to sink output handling: sink outputs must now be strict-relative paths that the engine joins against the per-job sandbox_root, improving workflow ergonomics and keeping sandbox enforcement centralized in SinkOutput::from_path.
Changes:
- Reworked
SinkOutput::from_pathto validate strict-relative paths, join againstctx.sandbox_root, and sandbox-check viaensure_under. - Updated
FeatureWriterand multiple test harnesses/fixtures to use relative outputs and to run with an explicitsandbox_root. - Added/updated integration fixtures and changelog/docs to reflect the migration away from
workerArtifactPath-prefixed absolute URIs.
Reviewed changes
Copilot reviewed 100 out of 105 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| engine/runtime/action-sink/src/output.rs | Implements strict-relative sink output validation + join against sandbox_root, with expanded unit tests. |
| engine/runtime/action-processor/src/feature/writer.rs | Routes FeatureWriter buffering through SinkOutput::from_path and re-validates sandbox before writes. |
| engine/runtime/tests/helper.rs | Updates test runner to use run_with_sandbox_root and adds execute_expect_err. |
| engine/runtime/tests/logging_test/logging_helper.rs | Runs logging fixtures with an explicit sandbox_root. |
| engine/runtime/tests/testcase/geometry/neighbor_finder.rs | Switches to relative output + runs workflow with sandbox root. |
| engine/testing/workflow-tests/src/main.rs | Switches workflow-tests harness to run_with_sandbox_root and constructs sandbox_root from tempdir. |
| engine/runtime/tests/testcase/file/writer/relative_output.rs | Adds integration tests for relative output success and absolute output failure. |
| engine/runtime/tests/testcase/file/writer.rs | Registers the new relative_output testcase module. |
| engine/runtime/tests/fixture/workflow/file/writer/relative_output.yaml | New fixture using a relative sink output. |
| engine/runtime/tests/fixture/workflow/file/writer/absolute_output_fails.yaml | New fixture that should fail due to absolute output URI. |
| engine/runtime/tests/fixture/workflow/file/writer/geopackage.yaml | Migrates GeoPackageWriter output to a relative path. |
| engine/runtime/tests/fixture/workflow/file/writer/czml_timeseries.yaml | Migrates output expression to no longer depend on workerArtifactPath. |
| engine/runtime/tests/fixture/workflow/file/reader/czml_timeseries.yaml | Migrates FeatureWriter output expression to no longer depend on workerArtifactPath. |
| engine/runtime/tests/fixture/workflow/obj-reader.yml | Migrates FeatureWriter output (currently incorrectly set to a non-literal expression; see comments). |
| engine/runtime/tests/fixture/workflow/obj-reader-detailed.yml | Migrates FeatureWriter outputs (currently incorrectly set to non-literal expressions; see comments). |
| engine/runtime/tests/fixture/testdata/logging/01_basic_sequential_processing/workflow.yml | Removes workerArtifactPath input and migrates sink outputs to relative paths. |
| engine/runtime/tests/fixture/testdata/logging/02_parallel_processing/workflow.yml | Same: removes workerArtifactPath input and migrates sink outputs to relative paths. |
| engine/runtime/tests/fixture/testdata/logging/03_duplicate_node_names/workflow.yml | Same migration to relative outputs. |
| engine/runtime/tests/fixture/testdata/logging/04_factory_error/workflow.yml | Same migration to relative outputs. |
| engine/runtime/tests/fixture/testdata/logging/05_source_error/workflow.yml | Same migration to relative outputs. |
| engine/runtime/tests/fixture/testdata/logging/06_processor_error/workflow.yml | Same migration to relative outputs. |
| engine/runtime/tests/fixture/testdata/logging/07_sink_error/workflow.yml | Removes workerArtifactPath input (sink failure fixture). |
| engine/runtime/tests/fixture/testdata/logging/08_workflow_definition_error/workflow.yml | Removes workerArtifactPath input and migrates output to relative. |
| engine/runtime/tests/fixture/testdata/logging/08_workflow_definition_error/user-facing.log | Updates expected log line/column due to fixture edits. |
| engine/runtime/tests/fixture/testdata/logging/09_orphaned_sink/workflow.yml | Removes workerArtifactPath input and migrates sink outputs to relative. |
| engine/runtime/examples/fixture/workflow/solar-radiation/workflow.yml | Migrates many outputs to relative paths and updates Code type usage for sink outputs. |
| engine/runtime/examples/fixture/workflow/solar-radiation/solar-radiation.yml | Same migration to relative sink outputs. |
| engine/runtime/examples/fixture/workflow/solar-radiation/adequate_place_judgement.yml | Same migration to relative sink outputs. |
| engine/runtime/examples/fixture/workflow/examples/citygml3-reader/workflow.yml | Migrates Cesium3DTilesWriter output to type: string relative output. |
| engine/runtime/examples/fixture/workflow/examples/citygml-roundtrip/workflow.yml | Migrates outputs away from workerArtifactPath prefixing. |
| engine/runtime/examples/fixture/workflow/datetime_converter/basic_conversions.yml | Migrates sink outputs to relative paths. |
| engine/runtime/examples/fixture/workflow/datetime_converter/custom_format.yml | Migrates sink outputs to relative paths. |
| engine/runtime/examples/fixture/workflow/datetime_converter/error_handling.yml | Migrates sink outputs to relative paths. |
| engine/runtime/examples/fixture/workflow/executor-testing/feature-write/diamond.yml | Migrates sink outputs to relative paths and removes workerArtifactPath input. |
| engine/runtime/examples/fixture/workflow/executor-testing/feature-write/nested-subgraph.yml | Same migration. |
| engine/runtime/examples/fixture/workflow/executor-testing/feature-write/subgraph.yml | Same migration. |
| engine/runtime/examples/fixture/workflow/executor-testing/feature-write/two-copies.yml | Same migration. |
| engine/runtime/examples/fixture/workflow/quality-check/plateau4/... | Bulk migration of plateau QC fixtures to relative sink outputs and removal of workerArtifactPath inputs. |
| engine/runtime/examples/fixture/workflow/data-convert/plateau4/... | Bulk migration of plateau data-convert fixtures to relative sink outputs and removal of workerArtifactPath inputs. |
| engine/runtime/examples/fixture/graphs/plateau4/quality-check/01-01-common.yml | Updates graph fixture sink outputs to relative paths. |
| engine/runtime/examples/fixture/graphs/plateau4/quality-check/01-02-common.yml | Updates graph fixture sink outputs to relative paths. |
| engine/worker/workflow/cms/plateau4/quality-check/bldg/template/workflow.yml | Migrates CMS template sink output to relative. |
| engine/worker/workflow/cms/plateau4/data-convert/bldg/template/workflow.yml | Migrates CMS template sink outputs/compress outputs to relative. |
| engine/AGENTS.md | Documents new relative-output behavior and rejection rules. |
| CHANGELOG.md | Adds breaking-change notice + migration instructions. |
| engine/Cargo.toml | Bumps workspace version. |
| engine/Cargo.lock | Updates locked workspace crate versions accordingly. |
| .gitignore | Ignores .local-specs/. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Wrap bare FeatureWriter output values in double-quotes so Rhai
evaluates them as string literals rather than property accesses
(obj-reader.yml, obj-reader-detailed.yml).
- Change execute_expect_err to use e.to_string() (Display) instead
of format!("{e:?}") (Debug) for more stable error messages.
- Strengthen relative_output_writes_under_sandbox_root: verify the
output file actually exists under sandbox_root, not just that
execution succeeded.
Part of #2117.
Bump workspace version to 0.0.377 (main caught up to 0.0.376). Re-run cargo make generate-examples-cms-workflow to refresh assembled plateau workflow YAMLs against the merged source pieces. Resolve CHANGELOG conflict by keeping both Unreleased (this PR's breaking change) and 0.1.0-alpha.18 (released by main while PR was open). Part of #2117.
…utput CI failure on PR #2123: cesium3dtiles and mvt sinks were calling Uri::from_str(path) directly on the evaluated sink output expression. After the strict-relative migration, that expression now produces a bare filename like 'bldg_lod1' — Uri::from_str then joins it with CWD (not sandbox_root), producing 'file:///<cwd>/bldg_lod1'. The new SinkOutput::from_path chokepoint correctly rejects this as an absolute URI. Replace the Uri::from_str pre-parse with SinkOutput::from_path so the relative path is validated and joined against ctx.sandbox_root before being stored as the buffer key (same pattern Task 3 applied to FeatureWriter). Add SinkOutput::from_resolved_uri for the pipeline stages (tile_writing_stage, geometry_slicing_stage) that receive an already-resolved buffer key URI and need to acquire a SinkOutput handle without re-running the strict-relative checks. Part of #2117.
CI failure on PR #2123 (run 26798730587): plateau-tiles-test/runner.rs was the only test harness left using Runner::run (the legacy file:/// sentinel). After the strict-relative migration, relative sink paths joined against file:/// resolved to the filesystem root, producing 'PermissionDenied' errors when writing to /bldg_lod1. Mirrors the Task 5 fix applied to helper.rs, logging_helper.rs, neighbor_finder.rs, and workflow-tests/src/main.rs — use the workerArtifactPath value (already computed as flow_dir) as sandbox_root. Part of #2117.
Address Copilot/code-review concern that SinkOutput::from_path was
being called per-feature in three sinks (FeatureWriter, cesium3dtiles,
mvt) only to extract the validated URI as a buffer key. That pattern
forced storage_resolver.resolve() into the hot path for every feature.
Split the chokepoint into two clear phases:
ensure_relative_path(ctx, path) -> Uri
└── validate + join + ensure_under (no Storage acquisition)
SinkOutput::from_path(ctx, path) -> SinkOutput
└── ensure_relative_path + from_resolved_uri (full chokepoint)
The three sinks now call ensure_relative_path per feature (cheap) and
acquire Storage once at flush/write time via from_resolved_uri (which
was added during the cesium/mvt fix in this PR).
The validation behavior is identical — existing from_path tests
exercise the same code path because from_path now just composes the
two phases.
Part of #2117.
KaWaite
reviewed
Jun 2, 2026
| } | ||
| output: | | ||
| file::join_path(env.get("workerArtifactPath"), "summary_common2-" + env.get("__value")["package"] + ".json") | ||
| "summary_common2-" + env.get("__value")["package"] + ".json" |
Member
There was a problem hiding this comment.
I believe if this is FlowExpr, output needs to be updated to have codeType = FlowExpr to work. Would like @asrcpq to confirm though
Restores the chokepoint guarantee weakened by the ensure_relative_path
split. After that refactor, from_resolved_uri was publicly exposed as
a way to construct a SinkOutput from a Uri without validation — a
future contributor could call it with any URI and bypass the sandbox.
Tighten to pub(crate). External callers (outside the action-sink crate)
have exactly two ways to interact with SinkOutput:
- SinkOutput::from_path(ctx, &str): full chokepoint
(validate + acquire + writeable handle)
- ensure_relative_path(ctx, &str): validation only
(returns a Uri, no write capability)
The split-phase optimization (ensure_relative_path then
from_resolved_uri) is now available only to sinks inside action-sink
that legitimately need to skip a second validation in their hot path
(cesium3dtiles, mvt). All current callers of from_resolved_uri are
inside action-sink, so this is a zero-functional-change tightening.
Part of #2117.
…, resolver)
Adopt external reviewer's proposed design: collapse from_path,
from_resolved_uri, ensure_relative_path, and join into a single
SinkOutput::new entrypoint. Validation, joining, and storage
acquisition all happen internally.
Buffering sinks (cesium3dtiles, mvt, FeatureWriter, csv, json, excel,
xml) key their buffers by the relative path String and call
SinkOutput::new once per output file at flush time. Storage resolver
caches keep this fast.
Grouped sinks (geojson, czml) compose sub-paths as strings
("{base}/{hash}.geojson") and call SinkOutput::new for each group.
Shapefile and GLTF pipelines compose tile/sidecar paths as strings.
A lightweight sandbox::ensure_valid_relative_path helper provides
syntactic early validation for buffering sinks (full sandbox check
still occurs at SinkOutput::new flush time).
Bug fixes the reviewer also caught:
- gltf: was calling Uri::from_str on the workflow's relative output
at build time, which silently joined with CWD, producing an absolute
URI that SinkOutput::new would have rejected at runtime. Now stores
the relative path as String and calls SinkOutput::new at finish time.
- citygml texture sidecar: was composing absolute dst URIs from the
parent of the resolved GML output (containing "://"), which
SinkOutput::new rejects. Now computes the texture dst as a relative
path under sandbox_root by stripping the sandbox_root prefix from
the GML output URI, then composing "parent/stem_appearance/file.png".
Net effect: one constructor, one validation site, no API for bypassing
validation. The chokepoint guarantee is restored to its PR1/PR2 shape
— stronger than the pub(crate) tightening it replaces.
Part of #2117.
Reflect the SinkOutput::new(sandbox_root, path, resolver) chokepoint that replaced from_path/from_resolved_uri/ensure_relative_path/join. Note that buffering sinks key by relative-path String and compose nested paths via string formatting, and that sandbox::ensure_valid_relative_path provides optional fail-fast validation at intake. Part of #2117.
Address the remaining piece of the external reviewer's design: drop the manual relative-path-checker that buffering sinks were calling at intake. Validation now happens exclusively inside SinkOutput::new at flush time. Bad paths still get rejected — just one validation site instead of two. Changes: - Delete sandbox::ensure_valid_relative_path - Drop the early-validate calls in cesium3dtiles/sink.rs (2 sites), mvt/sink.rs (2 sites), and FeatureWriter writer.rs (1 site) - Revert sandbox module from pub to crate-private (the public surface is only ensure_under and SandboxError via lib.rs re-exports) - Update engine/AGENTS.md to reflect the single-validation-site design Part of #2117.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Sink
outputparameters now require a relative path. The engine joins it internally against the per-job artifact directory (sandbox_root). Workflow authors no longer need to writeUrl(env["workerArtifactPath"]) / x.This is a breaking change for any workflow author using the old absolute-URI pattern. See the CHANGELOG entry for the migration steps.
Closes #2117.
Follow-up to #2108 (which delivered the security half — the sandbox check).
What changed
SinkOutput::from_pathrewritten to validate + join againstctx.sandbox_root./..,://, leading//~, traversal, resolves-to-root, migration-keyword)Url(env["workerArtifactPath"])/xandfile::join_path(env.get("workerArtifactPath"), x)patternsSinkOutput::from_path(was usingUri::from_strdirectly)Runner::run(unsandboxed) toRunner::run_with_sandbox_root(tempdir())to support relative paths in testsengine/AGENTS.md"Sandbox & sink writes" section updatedMigration for workflow authors
Old:
New (static):
New (dynamic):
Workflows still using the old pattern fail at runtime with an error mentioning
workerArtifactPath— search logs for that keyword.Test plan
cargo make test-rsgreen (945 passed, 0 failed)cargo make test-qcgreen (135 passed, 0 failed)cargo make clippycleancargo make formatno-op (fmt committed)env["workerArtifactPath"]andenv.get("workerArtifactPath")across YAML fixtures — zero sink-output uses remain