Artifact channels and shared workspace protocol for multi-agent handoffs #148
Replies: 1 comment
+1 on explicit artifacts as refs, not payloads.

one adjustment after current main: i’d avoid introducing so artifact producer metadata could probably be:

```ts
producer: {
  runId: string;
  operationId?: string;
  taskId?: string;
  session?: string;
  parentSession?: string;
}
```

or whatever the exact runtime names settle on. the important part imo is that artifact publishing joins to the operation that produced it, not that Flue invents another generic “work” id. otherwise task telemetry, artifact channels, run logs, and OpenAPI/admin inspection may all end up with slightly different identity systems.

also agree publishing should be explicit. writing a file should not automatically make it an artifact. the publish step is the moment the agent says “this output matters.”
Artifact Channels and Shared Workspace Protocol
Status: Draft proposal
What Hurts
Agents often need to hand off big things: source files, diffs, logs, research, screenshots, reports. Putting those things into prompts is expensive and noisy.
Writing files is better, but then the next agent has to guess which file matters, who made it, whether it replaced an older file, and whether it is meant as a final output or just scratch work.
That is the problem. A shared workspace saves tokens, but without a protocol it turns into a pile of files.
What Other Harnesses Usually Do
What Flue Can Do Better
Flue already owns the sandbox filesystem. The first-principles move is simple: keep file bodies in the workspace, and publish small artifact refs that say what the file is, which work produced it, which channel it belongs to, and what it replaces.
The ref moves through prompts, events, and task results. The large file stays in the sandbox until a model actually needs to read it. The primitive is not a global file database. It is a publishable output attached to a work record.
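As a rough illustration of the ref-versus-payload split, the sketch below shows one possible shape for an artifact ref and a helper that renders it as a single prompt line. The exact field set and the `toPromptLine` helper are assumptions made for illustration, not the settled Flue types.

```typescript
// Hypothetical shape of an artifact ref; field names are illustrative only.
interface ArtifactRef {
  id: string;                   // identifies the artifact record
  channel: string;              // e.g. "design" or "patch"
  summary: string;              // short, bounded description
  producer: { workId: string }; // the work that produced the output
  replaces?: string;            // id of the artifact this one supersedes
}

// Render a ref as one compact line that can travel in a prompt.
// The file body is never read here; only the pointer is serialized.
function toPromptLine(ref: ArtifactRef): string {
  const revision = ref.replaces ? ` (replaces ${ref.replaces})` : "";
  return `[${ref.channel}] ${ref.id}: ${ref.summary}${revision}`;
}

const designRef: ArtifactRef = {
  id: "art_design",
  channel: "design",
  summary: "Proposed module layout",
  producer: { workId: "work_1" },
};

const line = toPromptLine(designRef);
```

The prompt line stays the same size no matter how large the referenced file grows, which is the point of passing refs instead of payloads.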
What Stays Pluggable
The core protocol stays simple: append-only artifact records on top of the existing `SessionEnv` filesystem. That gives Flue a portable default, while keeping room for adapters:
- `.flue-runtime/` for the default implementation;
- `flue inspect` renderers;
The runtime owns artifact ids and who-made-this metadata. Blob storage, indexing, visualization, and governance stay replaceable.
Shared Primitive: Work Records
Artifact channels and task telemetry should meet at one tiny runtime primitive: a work record.
A task is a work unit. A model call is a work unit. A tool call can be a work unit. An artifact is an output of a work unit. Usage and artifacts are therefore facets of the same causality chain, not two separate reporting systems.
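A minimal sketch of that causality chain, assuming each work record carries a `workId` and an optional `parentWorkId`. The `kind` field and the `causalityChain` helper are hypothetical names used only for this example.

```typescript
// Hypothetical work-record shape; `kind` is an assumed field.
interface FlueWorkRef {
  workId: string;
  kind: "task" | "prompt" | "tool" | "skill";
  parentWorkId?: string;
}

// Walk from one work unit up to its root and return the chain of work ids.
function causalityChain(
  records: Record<string, FlueWorkRef>,
  workId: string,
): string[] {
  const chain: string[] = [];
  let current: FlueWorkRef | undefined = records[workId];
  // The indexOf check guards against accidental cycles in parent links.
  while (current && chain.indexOf(current.workId) === -1) {
    chain.push(current.workId);
    current = current.parentWorkId ? records[current.parentWorkId] : undefined;
  }
  return chain;
}

const workRecords: Record<string, FlueWorkRef> = {
  work_task: { workId: "work_task", kind: "task" },
  work_prompt: { workId: "work_prompt", kind: "prompt", parentWorkId: "work_task" },
  work_tool: { workId: "work_tool", kind: "tool", parentWorkId: "work_prompt" },
};

const chain = causalityChain(workRecords, "work_tool");
// chain is ["work_tool", "work_prompt", "work_task"]
```

Usage and artifacts can both hang off any id in that chain, which is what lets them share one identity system.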
The primitive stack stays boring on purpose:
`FlueWorkRef` gives identity and causality, task telemetry adds the usage facet, artifact channels add the output facet, and CLI/SSE/tracing/FinOps adapters consume the same normalized facts.
- `FlueWorkRef`
- `PromptUsage`
- `ArtifactRef`
- `FlueEventMetadata`
- `TaskToolResultDetails`
For this proposal, every published artifact should record the `workId` that produced it. In v1 that should usually be the enclosing task, prompt, or skill work id, not a separate id for the `artifact_publish` tool call. That lets Flue answer higher-level questions without guessing: what did this task cost, what did it produce, did the failed work still publish something useful, and which artifact revision changed the cost curve?
Primitive Invariants
- `ArtifactRef.id` identifies the artifact record.
- `producer.workId` identifies the work that produced the output.
- `producer.workId` should usually be the enclosing task, prompt, or skill work id. Do not mint a publish-specific work id unless Flue later decides to measure artifact publishing as its own operation.
- `taskId` remains useful for task-centric filters, but `workId` is the cross-feature join key for telemetry, artifacts, traces, TokenOps, and FinOps.
- `replaces`; they do not mutate the old record or reuse its artifact id.
Goals
- `flue run`, and future inspection tools.
- `SessionEnv` filesystem surface.
Design Principles
- `SessionEnv` filesystem so it works across virtual, local, and remote sandboxes.
Non-Goals
Those are reasonable follow-ups, but the first step should be a small protocol that makes shared filesystem handoffs explicit.
What Changes From Today
- `ArtifactRef` ids and summaries while file bodies stay in the workspace.
- `workId`.
- `replaces`, keeping old records available for audit and replay.
- `producer.workId`, enabling cost-per-output later.
Vocabulary
Workspace is the sandbox filesystem visible to the agent runtime.
Artifact is a meaningful output file or directory that an agent wants other agents or users to find.
Channel is a named stream of artifacts for a purpose, such as `analysis`, `design`, `patch`, `verification`, or `handoff`.
Artifact record is the structured metadata Flue writes when an artifact is published.
Artifact ref is the compact pointer that can safely travel through prompts, events, logs, and task results without copying the underlying file content.
Default Channels
Channels should be freeform strings, but Flue can document a small common vocabulary so logs and examples converge:
- `analysis`
- `design`
- `patch`
- `verification`
- `handoff`
Custom channels should use simple lowercase names with dashes, for example `security-review` or `migration-plan`.
Protocol Shape
Publishing an artifact should be a metadata operation over the existing filesystem. The producer writes the actual file first, then publishes a record that points at it.
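The write-then-publish split can be sketched as follows. This is a minimal illustration under stated assumptions: the `Workspace` type is an in-memory stand-in for the sandbox filesystem (not the real `SessionEnv`), and `publishArtifact`, the record fields, the counter-based ids and the `<id>.json` file name are all hypothetical.

```typescript
// In-memory stand-in for the sandbox filesystem (an assumption, not SessionEnv).
type Workspace = Record<string, string>;

interface ArtifactRecord {
  id: string;
  channel: string;
  path: string;
  summary: string;
  producer: { workId: string };
}

let nextArtifactNumber = 0;

// Publish writes only metadata: one new record file pointing at an existing path.
function publishArtifact(
  ws: Workspace,
  input: { channel: string; path: string; summary: string; workId: string },
): ArtifactRecord {
  // Fail before writing anything if the target file is missing.
  if (!(input.path in ws)) {
    throw new Error(`cannot publish missing path: ${input.path}`);
  }
  nextArtifactNumber += 1;
  const record: ArtifactRecord = {
    id: `art_${nextArtifactNumber}`,
    channel: input.channel,
    path: input.path,
    summary: input.summary,
    producer: { workId: input.workId },
  };
  // One unique record file per publish, so no shared manifest is rewritten.
  ws[`.flue-runtime/artifacts/records/${record.id}.json`] = JSON.stringify(record);
  return record;
}

const ws: Workspace = {};
ws["out/design.md"] = "# Design\n...large body...";  // step 1: write the file
const published = publishArtifact(ws, {               // step 2: publish a record
  channel: "design",
  path: "out/design.md",
  summary: "Module layout proposal",
  workId: "work_task_1",
});
```

Only the small record is created by the publish step; the file body written in step 1 is left untouched.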
The model-facing version should be a built-in tool with the same split:
- `write` or `edit` to create the file;
- `artifact_publish` with the path and short summary;
The next task receives a small pointer:
It can call `artifact_list` to resolve the id to a path, then use the existing `read` tool to inspect the file.
If the caller already knows the artifact id, `artifact_get` should resolve it directly. `artifact_list` is for discovery by channel or producer.
Example Workflow
A parent agent coordinating implementation can pass artifact references through the whole workflow without repeatedly copying large context.
- `design` artifact `art_design`.
- `art_design`, reads the referenced path, writes a diff, and publishes `patch` artifact `art_patch`.
- `art_design` and `art_patch`, reads only those files, and publishes `verification` artifact `art_review`.
- `art_patch` and `art_review`.
The prompt traffic carries short ids and summaries. The large design, patch, and review bodies remain in the workspace unless the next model turn actually needs to read them.
Storage Layout
The v1 protocol should avoid one shared manifest file because `SessionEnv` does not expose locks or compare-and-swap writes. Instead, each publish writes one unique record file.
Default runtime directory:
The runtime directory intentionally uses `.flue-runtime/`, not `.flue/`, so it does not collide with Flue's source layout.
Most artifacts will point at files the agent already wrote elsewhere in the workspace. The `files/` directory is only for convenience APIs that ask Flue to write managed content directly.
Listing artifacts scans `records/` and filters by channel, producer, status, or time. A materialized manifest can be derived later without changing the record format.
Validation and Safety
Publishing should validate the target before writing the artifact record:
- `SessionEnv.resolvePath()`;
- `stat()` when available;
- `title` and `summary` bounded so events and task details stay small.
The protocol does not add a permissions boundary inside a shared sandbox. Agents that share a sandbox can already read and write the same files. Artifact records make that activity discoverable; they do not make untrusted agents safe to run in the same workspace.
Artifact summaries should be treated like log/event data. They should not contain secrets, full file bodies, or large excerpts. The record points at the file; it does not replace the file.
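A minimal sketch of pre-publish validation, under loud assumptions: `resolveInsideWorkspace` is a hypothetical stand-in for path resolution (what `SessionEnv.resolvePath()` actually does is not specified here), the sketch only accepts workspace-relative paths, and the length limits are invented numbers.

```typescript
// Hypothetical limits; the proposal only says title and summary stay bounded.
const MAX_TITLE = 80;
const MAX_SUMMARY = 280;

// Normalize a workspace-relative path and reject anything that escapes the root.
function resolveInsideWorkspace(relPath: string): string {
  if (relPath.charAt(0) === "/") {
    throw new Error(`expected a workspace-relative path: ${relPath}`);
  }
  const parts: string[] = [];
  for (const segment of relPath.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (parts.length === 0) throw new Error(`path escapes workspace: ${relPath}`);
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  if (parts.length === 0) throw new Error("path resolves to the workspace root");
  return parts.join("/");
}

// Check the bounded fields first, then return the input with a normalized path.
function validatePublish(input: { path: string; title: string; summary: string }) {
  if (input.title.length > MAX_TITLE) throw new Error("title too long");
  if (input.summary.length > MAX_SUMMARY) throw new Error("summary too long");
  return { ...input, path: resolveInsideWorkspace(input.path) };
}

const validated = validatePublish({
  path: "./out/../out/review.md",
  title: "Review notes",
  summary: "Two blocking issues found in the patch.",
});
// validated.path is "out/review.md"
```

Length checks keep summaries small, but they do not detect secrets; that remains the publisher's responsibility.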
Failure Semantics
Publishing should fail before writing a record when the target path is missing or the sandbox rejects access to it. A failed publish should not create a partial record.
If the artifact file exists but disappears later, `artifact_get` should still return the record and mark the target as unavailable when it checks the path. That preserves provenance while making the broken reference explicit.
If two tasks publish records at the same time, both should succeed because their record files use unique ids. If both records claim to replace the same prior artifact, consumers should treat the result as competing revisions rather than attempting last-writer-wins conflict resolution.
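One way a consumer could surface competing revisions is sketched below. The `competingRevisions` helper and the trimmed-down record shape are assumptions for illustration; the sketch only reports conflicts and deliberately picks no winner.

```typescript
// Trimmed-down record shape: only the fields this check needs.
interface ArtifactRecord {
  id: string;
  replaces?: string;
}

// Group records by the artifact they claim to replace. Any prior artifact with
// more than one successor has competing revisions.
function competingRevisions(records: ArtifactRecord[]): Record<string, string[]> {
  const successors: Record<string, string[]> = {};
  for (const record of records) {
    if (record.replaces === undefined) continue;
    const list = successors[record.replaces] ?? [];
    list.push(record.id);
    successors[record.replaces] = list;
  }
  const competing: Record<string, string[]> = {};
  for (const prior of Object.keys(successors)) {
    const ids = successors[prior];
    if (ids !== undefined && ids.length > 1) competing[prior] = ids;
  }
  return competing;
}

const conflicts = competingRevisions([
  { id: "art_1" },
  { id: "art_2", replaces: "art_1" },
  { id: "art_3", replaces: "art_1" },
  { id: "art_4", replaces: "art_3" },
]);
// conflicts is { art_1: ["art_2", "art_3"] }
```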
Primary Data Types
SDK Surface
The smallest useful trusted-code surface is:
`FlueAgent.artifacts` and `FlueSession.artifacts` share the same workspace, but the session variant automatically fills producer metadata from the active session.
Model-Facing Tools
Agents should not need to hand-edit JSON records. Flue can add three built-in tools in v1:
There is no separate `artifact_read` in the initial design. Returning paths from `artifact_list` keeps reading on the existing `read` tool, which preserves the current truncation and offset behavior.
Task Integration
Artifact channels become most useful when paired with `session.task()` and the built-in `task` tool.
Child tasks should publish meaningful outputs while they work. The parent should then receive artifact ids in the task result details.
Task events can carry artifact summaries without embedding file contents:
This complements task telemetry. Telemetry explains what the task cost and how long it ran; artifact channels explain what the task produced.
The shared `workId` is what makes that pairing reliable. Without it, consumers have to infer relationships from session ids, task ids, timestamps, or paths. With it, a TokenOps or FinOps adapter can join `task_end.usage` to `artifact_publish.artifact.producer.workId` directly.
Together, the two features give a parent workflow an accounting pair:
That pairing is what later enables cost-per-artifact, failed-work analysis, and managed-agent debugging.
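The join can be sketched roughly as below. The event shapes are assumptions loosely modelled on `task_end.usage` and `artifact_publish.artifact.producer.workId`; the `totalTokens` field, the top-level `workId` on the task event and the `joinByWorkId` helper are hypothetical.

```typescript
// Assumed event shapes; only the fields needed for the join are modelled.
interface TaskEnd {
  workId: string;
  usage: { totalTokens: number };
}
interface ArtifactPublish {
  artifact: { id: string; producer: { workId: string } };
}

interface WorkSummary {
  workId: string;
  totalTokens: number;
  artifactIds: string[];
}

// Pair each finished task with the artifacts published under the same workId.
function joinByWorkId(ends: TaskEnd[], publishes: ArtifactPublish[]): WorkSummary[] {
  return ends.map((end) => ({
    workId: end.workId,
    totalTokens: end.usage.totalTokens,
    artifactIds: publishes
      .filter((p) => p.artifact.producer.workId === end.workId)
      .map((p) => p.artifact.id),
  }));
}

const summaries = joinByWorkId(
  [
    { workId: "work_a", usage: { totalTokens: 1200 } },
    { workId: "work_b", usage: { totalTokens: 300 } },
  ],
  [
    { artifact: { id: "art_patch", producer: { workId: "work_a" } } },
    { artifact: { id: "art_review", producer: { workId: "work_a" } } },
  ],
);
// work_a cost 1200 tokens and produced two artifacts; work_b produced none.
```

A task with usage but no artifacts, like `work_b` here, is the kind of row that failed-work analysis would start from.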
Example sequence:
If task telemetry is not installed yet, artifacts still carry producer identity. The same `workId` becomes the join point once usage rollups arrive.
Revision Semantics
Artifact records should be append-only. A revision publishes a new record with `replaces` pointing to the prior artifact id.
Consumers that ask for active artifacts can hide superseded records by default, but the old records remain available for audit, debugging, and replay.
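A minimal sketch of the "active artifacts" view, assuming a record is superseded as soon as any other record names it in `replaces`. The `activeArtifacts` helper and the trimmed record shape are hypothetical.

```typescript
// Trimmed-down record shape for this example.
interface ArtifactRecord {
  id: string;
  channel: string;
  replaces?: string;
}

// Return records that no other record replaces. The input array is not
// modified, so superseded records stay available for audit and replay.
function activeArtifacts(records: ArtifactRecord[]): ArtifactRecord[] {
  const superseded: Record<string, true> = {};
  for (const record of records) {
    if (record.replaces !== undefined) superseded[record.replaces] = true;
  }
  return records.filter((record) => superseded[record.id] !== true);
}

const allRecords: ArtifactRecord[] = [
  { id: "art_design", channel: "design" },
  { id: "art_design_2", channel: "design", replaces: "art_design" },
  { id: "art_patch", channel: "patch" },
];

const active = activeArtifacts(allRecords);
// active contains art_design_2 and art_patch; allRecords still holds all three.
```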
The v1 protocol should not attempt to detect simultaneous edits to the same underlying path. Instead, Flue should encourage task-scoped output paths:
That convention avoids most races without requiring filesystem locks.
CLI Rendering
`flue run` should render artifact publications concisely:
For a completed task, the CLI can add a compact artifact count:
The CLI should not print artifact contents. It should print paths and ids so users can inspect files directly.
TokenOps and FinOps
Artifact channels create a measurable token-avoidance layer.
For TokenOps, Flue can distinguish:
For FinOps, artifacts help attribute cost to durable outputs:
The protocol should not try to compute token savings in v1. It should preserve enough structure for later rollups to compare large file sizes, artifact references, and task usage.
Implementation Shape
Likely change points:
- `packages/sdk/src/types.ts`
  - `FlueWorkRef` identity shape.
  - `artifacts` to `FlueAgent` and `FlueSession`.
  - `artifact_publish` to `FlueEvent`.
- `packages/sdk/src/artifacts.ts`
  - `.flue-runtime/artifacts/records`.
- `packages/sdk/src/agent-client.ts`
  - `SessionEnv`.
- `packages/sdk/src/session.ts`
  - `workId` and `parentWorkId` to each artifact producer.
- `packages/sdk/src/agent.ts`
  - `artifact_publish`, `artifact_get`, and `artifact_list` tools.
- `packages/cli/bin/flue.ts`
Landing Order
The two PRs should not race to create different contracts.
- `FlueWorkRef` and `producer.workId` on artifact records. Usage joins can wait until task telemetry exists.
- `FlueWorkRef`, event metadata, and task details with `workId`. The `artifacts` field can wait until artifact channels exist.
- `FlueWorkRef` should be defined once in the SDK types module and imported by both feature implementations.
Forward Compatibility and Cost
This should be additive to the current filesystem model. Existing `SessionEnv` read/write behavior stays unchanged. A file becomes an artifact only when a producer publishes a record for it, so existing agents do not suddenly create indexes, emit new outputs, or change file visibility.
The cognitive cost should stay bounded: users learn `artifact`, `channel`, and `ref` only when they need multi-agent handoff. Simple agents can keep writing and reading files as they do today. The protocol makes the important files discoverable without making every file a workflow object.
The runtime cost is also bounded:
- `stat()` per publish;
- `limit` and filters to keep normal use cheap;
The main forward risk is record growth. That is why v1 uses append-only records plus query limits and leaves room for a derived manifest or external index later. The protocol should not require locks, a database, automatic indexing of every write, or a new permission model inside one trusted shared workspace.
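The filtered, limited listing mentioned above might look roughly like this. The query fields, the `createdAt` timestamp, the default limit of 20 and the `listArtifacts` helper are all assumptions for illustration; a real implementation would scan record files rather than an in-memory array.

```typescript
// Assumed record and query shapes for this example.
interface ArtifactRecord {
  id: string;
  channel: string;
  producer: { workId: string };
  createdAt: number; // assumed timestamp, milliseconds since epoch
}

interface ListQuery {
  channel?: string;
  workId?: string;
  limit?: number;
}

// Filter first, then sort newest-first and cap the result size.
// filter() returns a new array, so sorting does not reorder the input.
function listArtifacts(records: ArtifactRecord[], query: ListQuery = {}): ArtifactRecord[] {
  const limit = query.limit ?? 20; // hypothetical default
  return records
    .filter((r) => query.channel === undefined || r.channel === query.channel)
    .filter((r) => query.workId === undefined || r.producer.workId === query.workId)
    .sort((a, b) => b.createdAt - a.createdAt)
    .slice(0, limit);
}

const stored: ArtifactRecord[] = [
  { id: "art_1", channel: "patch", producer: { workId: "work_a" }, createdAt: 100 },
  { id: "art_2", channel: "design", producer: { workId: "work_a" }, createdAt: 200 },
  { id: "art_3", channel: "patch", producer: { workId: "work_b" }, createdAt: 300 },
];

const latestPatch = listArtifacts(stored, { channel: "patch", limit: 1 });
// latestPatch holds only art_3, the newest record in the "patch" channel.
```

Because the cap is applied after filtering, a small `limit` bounds what reaches the prompt even as the record directory grows.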
Acceptance Criteria
A v1 implementation is ready when:
- `flue run` renders artifact publication events without printing file contents;
- `workId` as well as by channel, task, session, and status;
Suggested Rollout
- `FlueWorkRef`, then add the artifact data types and trusted-code `FlueArtifacts` helper.
- `artifact_publish` events and CLI rendering.
- `artifact_publish`, `artifact_get`, and `artifact_list` tools.
Recommended V1 Defaults
- `task` tool is available. Artifact handoff is part of the managed-agent story, not a niche extension.
- `<cwd>/.flue-runtime/artifacts/records/` for all sandbox modes. Users can override the runtime directory later if real deployments need it.
- `ArtifactRef` objects in task result details, but keep summaries bounded and omit file contents.
- `workId` on the artifact. Avoid publish-specific work ids in v1 unless measuring the publish operation itself becomes important.
- `sizeBytes` in v1. Treat SHA-256 digests as best-effort for regular files and skip them for directories until there is demand.
- `write` call.
Open Questions
- `.flue-runtime/artifacts/files/<artifact-id>/`, or should v1 only publish existing paths?
Original implementation from #110 by @ketankhairnar