feat(orchestra): typed ArtifactManifest + flat per-flow run-root bundle #3343
proksh wants to merge 39 commits
…tent; drop unused imports
…est, consistent mkdir
…format
Extract the per-type debug-message handling into printFlowError (a single when over the MaestroException subtypes) so runSingle reads top-to-bottom, and restore the original multi-line formatting of aiOutput/updatedEnv that this branch had collapsed for no reason. Behavior unchanged.
- Remove the cli 'no clobber' test: with different flow names each lands in its own folder trivially; same-name clobber is already pinned by the numeric-suffix disambiguation test.
- Merge the two null-artifactsDir ArtifactsGenerator tests into one that asserts both no files on disk and an empty manifest.
Force-pushed: 4513cc9 to 0655561
… Schema
Agents (and humans) reading a run's manifest.json had no in-band way to
learn what each field or ArtifactKind means. Make the manifest
self-describing:
- Add a hand-written JSON Schema (manifest.schema.json) describing
ArtifactManifest, with a description on every field and per-ArtifactKind
docs via the oneOf/const pattern. Lifted from the model's KDoc.
- ArtifactsGenerator.onFlowEnd now bundles the schema next to manifest.json
in each run dir and writes a leading `$schema` pointing at it, so the
manifest resolves its own schema offline.
- Centralize both writes in TestOutputWriter (saveManifest injects $schema;
saveManifestSchema copies the classpath resource).
Drift guard: ArtifactManifestSchemaTest fails the build if any
ArtifactKind/ArtifactFormat value is missing from the schema, so the
hand-written doc can't silently fall out of sync with the enums.
The on-disk manifest now carries `$schema`, an unknown property to the
typed model. No production code deserializes the manifest; the one test
that did used a strict mapper and now decodes tolerantly, matching the
model's documented contract (tolerance is the reader's choice).
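A drift guard of the kind this commit message describes could look roughly like the sketch below. It is not the actual ArtifactManifestSchemaTest: the enum values, the schema fragment, and the helper names (`documentedConsts`, `undocumented`) are all made up for illustration, and a regex stands in for real JSON parsing to keep the example dependency-free.

```kotlin
// Hypothetical stand-ins for the real enums; the value lists are illustrative only.
enum class ArtifactKind { SCREENSHOT, SCREEN_RECORDING, DEVICE_LOG }
enum class ArtifactFormat { PNG, MP4, TEXT }

// A trimmed, made-up schema fragment using the oneOf/const pattern.
val schemaText = """
{
  "definitions": {
    "ArtifactKind": {
      "oneOf": [
        { "const": "SCREENSHOT", "description": "A still image of the screen." },
        { "const": "SCREEN_RECORDING", "description": "A video of the run." },
        { "const": "DEVICE_LOG", "description": "Logs captured from the device." }
      ]
    },
    "ArtifactFormat": {
      "oneOf": [ { "const": "PNG" }, { "const": "MP4" }, { "const": "TEXT" } ]
    }
  }
}
""".trimIndent()

// Collects every string that appears as a "const" value in the schema text.
fun documentedConsts(schema: String): Set<String> =
    Regex("\"const\"\\s*:\\s*\"([^\"]+)\"")
        .findAll(schema)
        .map { it.groupValues[1] }
        .toSet()

// Returns the names that the schema fails to document, in input order.
fun undocumented(names: List<String>, schema: String): List<String> {
    val documented = documentedConsts(schema)
    return names.filterNot { it in documented }
}

fun main() {
    val allNames = ArtifactKind.values().map { it.name } + ArtifactFormat.values().map { it.name }
    val missing = undocumented(allNames, schemaText)
    // A real test would fail the build here instead of throwing from main.
    check(missing.isEmpty()) { "Schema is missing: $missing" }
    println("schema covers all enum values")
}
```

Adding an enum value without touching the schema makes `undocumented` return a non-empty list, which is the condition the build would fail on.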
…/ and recordings/
…dings nest under it
…utput-dir holds the full run bundle
Resolve the per-flow folder upfront and pass it as Orchestra's single artifactsDir; Orchestra writes the bundle + screenshots/ + recordings/ + manifest straight there. Drops the temp staging dir, copyBundleToFlowDir, the separate screenshotsDir param, mediaRoot, and the now-vestigial testOutputDir plumbing. Continuous mode keeps no bundle (artifactsDir null -> takeScreenshot to CWD).
…shots + full recording (#3348)
feat(orchestra): nest bundle under artifacts/, flag-gated step shots + full recording
Restructure the per-run artifact bundle so everything core writes lives under an artifacts/ folder (zipped as one unit), while the two separately-served outputs (per-step screenshots/ and the full-run screen-recording.mp4) sit at the run root.
- Rename takeScreenshot output screenshots/ -> artifacts/takeScreenshot/ and startRecording output recordings/ -> artifacts/startRecording/ so folders match the command that writes them.
- Move commands.json, maestro.log (now under logs/), and the failure screenshot under artifacts/.
- Add Orchestra flags captureStepScreenshots / captureScreenRecording (default off, so the local CLI bundle is unchanged; the worker turns them on). When on, ArtifactsGenerator writes screenshots/step-<seq>.png after each non-failed command and records the whole run to screen-recording.mp4.
- Manifest entries use the new relative paths and carry metadata.source to tell same-kind entries apart (failure/take_screenshot/step, start_recording/full_run).
- Update assertScreenshot reference lookup and the schema prose accordingly.
…#3349)
Replace the per-run bundled manifest.schema.json with a stable identity: each manifest.json now sets $schema to the hand-written schema served from this repo's main branch via GitHub raw:
https://raw.githubusercontent.com/mobile-dev-inc/Maestro/main/maestro-orchestra-models/src/main/resources/maestro/orchestra/manifest.schema.json
A fixed branch keeps the URL constant while its content tracks the latest schema, which is safe because the model tolerates unknown fields and a test blocks undocumented artifact kinds. No extra hosting infrastructure is needed, and the manifest stays self-describing even after it is moved away from its run folder.
- Drop saveManifestSchema and the per-run schema-file copy; saveManifest writes the URL directly.
- Set the schema's own $id to the same URL.
- Keep the in-repo schema resource as the source of truth (it is the file the URL serves) and for the schema-coverage test.
Offline resolution is intentionally dropped.
…r' into feat/orchestra-artifact-manifest
# Conflicts:
# maestro-cli/src/main/java/maestro/cli/report/TestDebugReporter.kt
# maestro-cli/src/main/java/maestro/cli/runner/TestSuiteInteractor.kt
# maestro-orchestra/src/main/java/maestro/orchestra/Orchestra.kt
# maestro-orchestra/src/main/kotlin/maestro/orchestra/debug/ArtifactsGenerator.kt
…st schema as v1
Drop the intermediate artifacts/ folder: the run root is now itself the zippable bundle, so commands.json, logs/ (maestro.log + device logs + crash/ANR), takeScreenshot/, startRecording/, the failure screenshot, screenshots/, and screen-recording.mp4 all sit directly under it next to manifest.json.
Version the hand-written schema as manifest.v1.schema.json (file, $id, the $schema URL embedded in every manifest, and the classpath resource), so a future structural change ships as manifest.vN beside it while the v1 URL keeps resolving for manifests already in the wild.
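The versioned-URL scheme can be illustrated with a small sketch. Nothing here is the PR's code: `schemaUrl` and `withSchema` are hypothetical helpers, and the base path is taken from the unversioned URL quoted in the earlier commit message on the assumption that the v1 rename kept the same directory. The string splicing stands in for whatever JSON writer saveManifest really uses.

```kotlin
// Base path copied from the raw GitHub URL quoted in the earlier commit message;
// ASSUMED to be unchanged by the v1 rename.
val schemaBase =
    "https://raw.githubusercontent.com/mobile-dev-inc/Maestro/main/" +
        "maestro-orchestra-models/src/main/resources/maestro/orchestra"

// Hypothetical helper: the major version is part of the file name, so each
// major version gets its own permanent URL.
fun schemaUrl(majorVersion: Int): String {
    require(majorVersion >= 1) { "schema versions start at 1" }
    return "$schemaBase/manifest.v${majorVersion}.schema.json"
}

// Hypothetical helper: put a leading "$schema" field into an already-serialized
// JSON object. Naive string splicing, for illustration only.
fun withSchema(manifestJson: String, majorVersion: Int = 1): String {
    val body = manifestJson.trim()
    require(body.startsWith("{") && body.endsWith("}")) { "expected a JSON object" }
    val inner = body.substring(1, body.length - 1).trim()
    val schemaField = "\"\$schema\": \"${schemaUrl(majorVersion)}\""
    return if (inner.isEmpty()) "{ $schemaField }" else "{ $schemaField, $inner }"
}

fun main() {
    println(schemaUrl(1))
    println(withSchema("{\"entries\": []}"))
}
```

Because the version lives in the file name, a later `schemaUrl(2)` would point at a new file without disturbing manifests that already embed the v1 URL.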
… onCommandArtifact hook
…a onCommandArtifact
… + per-command artifacts
…edicated manifest kinds
…rchy in commands.json
@proksh I worked with Claude to generate this. I'll attach the conversation so you can see how I got there. Please note that this is a first-pass conversation, but it immediately picked up on the architectural problems I noticed in this PR on my first scan. If core is going to own the artifacts, it needs to really own them, and not just find them. The "artifact policy" idea is interesting, but it might not be necessary. If we are only in two modes (record everything vs. record a lighter subset), that might be better as a single boolean flag.
Artifact manifest: record, don't scan
The RFC call is right: one typed manifest owned by Orchestra, identical local and cloud. The implementation undermines it: it derives the manifest by scanning the run-root folders at flow end instead of recording artifacts as they're produced. Most of the issues below follow from that.
Problems
Fix: a single collector that owns path allocation and the record. Producers go through it; the manifest is its records, the per-command list the same records grouped by command. Two verbs:
Layering (open decision): core manifest sealed/immutable at flow end; backend wraps it in a run envelope (run id, signed URLs, post-hoc cloud artifacts) that's a superset of the same entry type. Avoids a mutable manifest and the "when is it final" ambiguity.
PR impact: keep the model types, schema, driver seam, and flat layout. Replace the disk scan with collector accumulation; route every writer through
Chat:
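To make the "record, don't scan" proposal concrete, here is a minimal sketch of such a collector. It is an illustration under assumptions, not the reviewer's design: the class name `ArtifactCollector`, the verb names `allocate` and `seal`, the `ArtifactRecord` shape, and the trimmed `ArtifactKind` enum are all placeholders.

```kotlin
import java.nio.file.Path
import java.nio.file.Paths

// Illustrative subset of kinds; the real enum lives in maestro-orchestra-models.
enum class ArtifactKind { SCREENSHOT, SCREEN_HIERARCHY, TAKE_SCREENSHOT }

// One recorded artifact. commandIndex is null for run-level artifacts.
data class ArtifactRecord(val kind: ArtifactKind, val relativePath: String, val commandIndex: Int?)

// Hypothetical collector: the only component that hands out paths and keeps the record.
class ArtifactCollector(private val runRoot: Path) {
    private val records = mutableListOf<ArtifactRecord>()
    private var isSealed = false

    // Verb 1 (name assumed): reserve a run-root-relative path and record it in one step,
    // so nothing can be written without also appearing in the manifest.
    fun allocate(kind: ArtifactKind, relativePath: String, commandIndex: Int? = null): Path {
        check(!isSealed) { "manifest already sealed" }
        require(records.none { it.relativePath == relativePath }) { "path already allocated: $relativePath" }
        records += ArtifactRecord(kind, relativePath, commandIndex)
        return runRoot.resolve(relativePath)
    }

    // Verb 2 (name assumed): freeze the record at flow end and return it as the manifest.
    fun seal(): List<ArtifactRecord> {
        isSealed = true
        return records.toList()
    }

    // The per-command list is the same records, grouped by command.
    fun byCommand(): Map<Int, List<ArtifactRecord>> =
        records.filter { it.commandIndex != null }.groupBy { it.commandIndex!! }
}

fun main() {
    val collector = ArtifactCollector(Paths.get("run-root"))
    collector.allocate(ArtifactKind.SCREEN_HIERARCHY, "screen-hierarchy/step-1.json", commandIndex = 1)
    collector.allocate(ArtifactKind.SCREENSHOT, "screenshots/step-1.png", commandIndex = 1)
    collector.allocate(ArtifactKind.TAKE_SCREENSHOT, "takeScreenshot/checkout.png", commandIndex = 2)
    println(collector.byCommand().mapValues { (_, v) -> v.map { it.relativePath } })
    println(collector.seal().map { it.relativePath })
}
```

In this shape the manifest and the per-command lists cannot disagree, because both are views over the same records, and sealing gives the "when is it final" question a single answer.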
Stacked on #3282. Makes Maestro core own a typed, self-documenting artifact manifest, and restructures the CLI's debug output into one per-flow folder per run — so local and cloud emit the same shape and every consumer reads one model instead of re-deriving artifact types from a folder of files.
What this adds
1. Shared model (`maestro-orchestra-models`): `ArtifactManifest`, `ArtifactEntry`, `ArtifactKind`, `ArtifactFormat`, and `ArtifactFiles` (canonical relative paths). Plain data classes, no serialization annotations; tolerance is the consumer's choice.
2. Self-documenting `manifest.json`: it carries a `$schema` pointing at a stable, versioned URL, the hand-written schema served straight from this repo's `main` branch via GitHub raw (`…/mobile-dev-inc/Maestro/main/maestro-orchestra-models/.../manifest.v1.schema.json`). The filename carries the schema's major version (`schemaVersion`), so a future structural change ships as `manifest.v2.schema.json` beside it and the v1 URL keeps resolving for every manifest already in the wild. The URL is the schema's permanent identity, so the manifest stays self-describing even after it's copied or uploaded away from its run folder, with no per-run schema file to bundle and no hosting to stand up (it's just the in-repo file). Within a version, pointing at `main` keeps the URL constant while its content tracks additive changes; that's safe because the model ignores unknown fields and a build test fails if any `ArtifactKind`/`ArtifactFormat` is missing from the schema, so docs can't drift from code. Trade-off: resolving it needs network; the offline-bundled copy was intentionally dropped. Note: the URL 404s until this work lands on `main`.
3. Core produces the bundle (`maestro-orchestra`): `ArtifactsGenerator` (an internal `OrchestraListener`) writes the per-flow bundle into `artifactsDir` and builds the manifest, exposed on `Orchestra.FlowResult.artifactManifest`. The run root itself is the zippable bundle: everything core makes sits directly under it, with no intermediate `artifacts/` folder.
   - `commands.json` (per-command metadata) and `logs/maestro.log` sit at the run root. Serialized errors are slim (`message` + `debugMessage` only); hierarchies and stack traces live in their own artifacts.
   - Per-command artifacts in `metadata.artifacts`: each command entry in `commands.json` lists `{type, path}` objects (run-root-relative; `type` is an `ArtifactKind` name), so consumers find per-command artifacts without depending on the folder layout. The manifest stays folder-level, with no per-file entries. Empty lists are omitted. Plumbing: a new `OrchestraListener.onCommandArtifact(kind, relativePath)` hook, dispatched only when a bundle is being produced.
   - Dedicated manifest kinds: `TAKE_SCREENSHOT` (takeScreenshot output → `takeScreenshot/`), `START_SCREEN_RECORDING` (startRecording output → `startRecording/`), `SCREENSHOT` (step screenshots → `screenshots/step-<N>.png`), `SCREEN_RECORDING` (full-run recording), `SCREEN_HIERARCHY` (per-step hierarchy JSON). No `metadata.source` tags on these, since the kind says it; `source` remains only where it carries real information (e.g. `DEVICE_LOG`).
   - Per-step hierarchy: each executed command writes `screen-hierarchy/step-<N>.json` and references it from its `artifacts`; the giant inline `hierarchy` blob is gone from `commands.json` (demo run: 246 KB → 4 KB).
   - Step screenshots: with `captureStepScreenshots` on (worker), every executed command gets `screenshots/step-<N>.png`; with it off (local CLI), only the failed command does. No special `screenshot-❌-*.png` convention: the failure shot is the FAILED command's `SCREENSHOT` artifact in `commands.json`.
   - Flags: `captureStepScreenshots` (above) and `captureScreenRecording` → `screen-recording.mp4`.
   - No bundle (`--continuous`): takeScreenshot/startRecording write relative to CWD, as before.
4. CLI writes one folder per flow (`maestro-cli`): each flow's output folder is resolved upfront and passed as the single `artifactsDir`; Orchestra writes the whole bundle straight into it. Single and suite runs both produce `<output>/<ts>/<flow>/` (with `-shard-N`/`-N` suffixes for shards / name collisions). Replaces the old flat session layout and its temp-staging + copy step.
Layout
The manifest uses paths relative to its own folder, so local and cloud manifests look the same.
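As a rough illustration of the per-flow folder and relative-path rules described above, the sketch below picks a folder name and derives a manifest path. Both helpers are hypothetical, and the exact suffix scheme (where numbering starts, how shard and collision suffixes combine) is an assumption rather than the CLI's actual behavior.

```kotlin
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper: choose a folder name for a flow, adding a numeric suffix
// when the name is already taken. The suffix scheme here is ASSUMED.
fun flowFolderName(flowName: String, taken: Set<String>, shardIndex: Int? = null): String {
    val base = if (shardIndex != null) "$flowName-shard-$shardIndex" else flowName
    if (base !in taken) return base
    var n = 1
    while ("$base-$n" in taken) n++
    return "$base-$n"
}

// Manifest paths are relative to the folder that holds manifest.json, written
// with forward slashes regardless of platform.
fun manifestPath(runRoot: Path, file: Path): String =
    runRoot.relativize(file).joinToString("/")

fun main() {
    // "out" and the timestamp folder are made-up values standing in for <output>/<ts>.
    val runRoot = Paths.get("out", "2024-01-01_120000", flowFolderName("login", setOf("login")))
    val shot = runRoot.resolve("screenshots").resolve("step-4.png")
    println(runRoot)
    println(manifestPath(runRoot, shot))
}
```

Since every path is expressed relative to the run root, moving or uploading the folder as a whole leaves the manifest valid.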
Behavior change to flag
`--test-output-dir`, when set, is this artifact folder: it now holds the full bundle (incl. `manifest.json`), not just screenshots. Its help text is updated. The default location (`~/.maestro/tests/<ts>/`) is unchanged.
How this was tested
- … `source` tags; folders register as collections with a file count; step screenshots/recording appear per their flags; hierarchy files are written per executed command (not for skipped), attributed, and `commands.json` has no inline `hierarchy`; typed `artifacts` attribute to the right command (incl. nested/composite dedup) and the key is omitted when empty; serialized errors carry only `message` + `debugMessage`; `manifest.json` carries the stable, versioned `$schema` URL and no schema file is bundled.
- … `takeScreenshot` entry carries `{"type": "TAKE_SCREENSHOT", "path": "takeScreenshot/checkout.png"}`, every executed command references its `screen-hierarchy/step-N.json`, the failed command references `screenshots/step-4.png`, all paths resolve, and `commands.json` shrank 246 KB → 4 KB.
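A consumer-side sketch of how the typed per-command artifacts could be used: finding the failure screenshot without relying on a file-naming convention. The `CommandEntry` shape is a hypothetical in-memory mirror of a `commands.json` entry; only the `type`/`path` pair and the FAILED/SCREENSHOT relationship come from the description above, and the other field names and sample commands are assumptions.

```kotlin
// Hypothetical mirror of a commands.json entry; field names other than
// type/path are ASSUMED for this sketch.
data class CommandArtifact(val type: String, val path: String)
data class CommandEntry(
    val command: String,
    val status: String,
    val artifacts: List<CommandArtifact> = emptyList(),
)

// Returns the failure screenshot path, or null if no command failed
// or the failed command has no SCREENSHOT artifact.
fun failureScreenshot(commands: List<CommandEntry>): String? =
    commands.firstOrNull { it.status == "FAILED" }
        ?.artifacts
        ?.firstOrNull { it.type == "SCREENSHOT" }
        ?.path

fun main() {
    val commands = listOf(
        CommandEntry(
            "takeScreenshot", "COMPLETED",
            listOf(CommandArtifact("TAKE_SCREENSHOT", "takeScreenshot/checkout.png")),
        ),
        CommandEntry(
            "tapOn", "FAILED",
            listOf(
                CommandArtifact("SCREEN_HIERARCHY", "screen-hierarchy/step-4.json"),
                CommandArtifact("SCREENSHOT", "screenshots/step-4.png"),
            ),
        ),
    )
    println(failureScreenshot(commands))
}
```

The lookup depends only on the command's status and the artifact's kind, so it keeps working if the folder layout changes again.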