symphony: move the elixir runtime into packages/symphony #782
The room stack moved to the ix monorepo (symphony#268), leaving indexable-inc/symphony as the Elixir runtime alone; absorb it at c9e7092 so the dedicated repo can retire. The launcher ships as packages.<sys>.symphony, the NixOS module as nixosModules.symphony (the same attr ix imports from the symphony flake today), and the required quality lane runs sandboxed as checks.<sys>.symphony-elixir. The symphony flake input stays pinned as the room-server provider for images/dev/symphony-codex until that seam moves to ix.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Blast radius
1 added, 0 removed
Rebuilt checks by category (pie chart): blast 1, eval 1, lint 1, symphony 1
Changed checks (3): blast-radius-test, lint, eval
AI review found issues in this pull request.
Verdict: patch is incorrect
Confidence: 0.89
The new runtime has security and correctness flaws around unauthenticated control paths, worker identity, Slack replay protection, and cancellation semantics that can leave secret-backed automation running or reachable when it should not be.
- P1 packages/symphony/elixir/lib/symphony_elixir/runtime.ex:202: Cancel does not stop running work
- P1 packages/symphony/elixir/lib/symphony_elixir_web/router.ex:42: Manual run APIs are exposed without authentication
- P1 packages/symphony/elixir/lib/symphony_elixir_web/channels/worker_socket.ex:36: Worker socket falls back to self-asserted identity
- P2 packages/symphony/elixir/lib/symphony_elixir_web/controllers/slack_events_controller.ex:121: Slack signatures can be replayed indefinitely
    def handle_call({:cancel, actor}, _from, state) do
      Enum.each(Map.keys(state.tasks), &Process.demonitor(&1, [:flush]))

      cancelled =
        Enum.reduce(state.graph.nodes, state.graph, fn {id, node}, acc ->
          if Node.terminal?(node), do: acc, else: transition(acc, id, :cancelled)
        end)

      finished =
        %{cancelled | status: :cancelled}
        |> RunGraph.append_audit(:cancel, nil, actor, %{})

      persist(finished, state)
      release_placement(state)
      {:stop, :normal, :ok, %{state | graph: finished, tasks: %{}, node_refs: %{}}}
Cancel does not stop running work
cancel/2 only demonitors the task refs before marking the graph cancelled and releasing placement. The Task.Supervisor child pids are not stored or terminated, so an in-flight agent turn or exec script keeps running after the operator sees the run as cancelled; any commits, pushes, or filesystem writes then happen invisibly against a released/cancelled run. Store the task pids and explicitly terminate the running tasks/processes before finalizing cancellation.
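One way the suggested fix could look, as a minimal sketch rather than the runtime's actual code: it assumes the runtime keeps a map from monitor ref to task pid (the module name `CancelSketch`, the helper `stop_tasks/2`, and that map shape are hypothetical) and that the tasks were started under a `Task.Supervisor`.

```elixir
defmodule CancelSketch do
  @moduledoc false

  # Hypothetical helper. `tasks` is assumed to map monitor ref => task pid;
  # the PR's state.tasks is keyed by ref, but what it stores as values is
  # not shown in the excerpt.
  @spec stop_tasks(Supervisor.supervisor(), %{reference() => pid()}) :: :ok
  def stop_tasks(supervisor, tasks) do
    Enum.each(tasks, fn {ref, pid} ->
      # Drop the monitor first so no :DOWN message reaches the cancelled run.
      Process.demonitor(ref, [:flush])
      # Ask the supervisor to shut the task down; this call returns only
      # after the child has exited. An already-dead pid yields
      # {:error, :not_found}, which is ignored here.
      Task.Supervisor.terminate_child(supervisor, pid)
    end)
  end
end
```

Calling `CancelSketch.stop_tasks(sup, tasks)` before `persist/2` and `release_placement/1` would make cancellation wait for the Elixir task processes to exit. External OS processes started by an exec script may need their own teardown; that is outside this sketch.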
    scope "/api/v1", SymphonyElixirWeb do
      pipe_through(:api)

      # The manual-trigger producer onto the IR runtime.
      post("/runs", ApiController, :enqueue_run)

      # IR runs (the RunGraph model).
      get("/ir/schema", IRRunController, :schema)
      get("/ir/runs", IRRunController, :index)
      post("/ir/runs", IRRunController, :create)
      get("/ir/runs/:run_id", IRRunController, :show)
      post("/ir/runs/:run_id/cancel", IRRunController, :cancel)
      post("/ir/runs/:run_id/rerun", IRRunController, :rerun)
      post("/ir/runs/:run_id/clear-failed", IRRunController, :clear_failed)
      post("/ir/runs/:run_id/nodes/:node_id/retry", IRRunController, :retry_node)
Manual run APIs are exposed without authentication
This public API scope contains manual run creation and operator controls, but it is only piped through :api; the HMAC checks live inside the webhook controllers and do not protect /runs or /ir/runs/*. A deployment that exposes this server for GitHub/Linear/Slack webhooks can also let arbitrary callers start secret-backed workflows with attacker-controlled input, read run details, or cancel/rerun jobs. Split signed webhook routes from operator routes or add a fail-closed authentication plug for the non-webhook API.
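A fail-closed authentication plug of the kind suggested above might be sketched as follows. The module name `OperatorAuthSketch`, the `:token` option, and the bearer-token scheme are assumptions for illustration; the PR does not say which credential operators would use. The sketch relies on the Plug dependency the router already uses.

```elixir
defmodule OperatorAuthSketch do
  @moduledoc false
  @behaviour Plug
  import Plug.Conn

  # The expected token is passed as an option (hypothetical); how it is
  # sourced from config or secrets is out of scope here.
  @impl true
  def init(opts), do: Keyword.get(opts, :token)

  @impl true
  def call(conn, token) when is_binary(token) and byte_size(token) > 0 do
    case get_req_header(conn, "authorization") do
      ["Bearer " <> provided] ->
        # Constant-time comparison; returns false when lengths differ.
        if Plug.Crypto.secure_compare(provided, token), do: conn, else: deny(conn)

      _ ->
        deny(conn)
    end
  end

  # No token configured: refuse every request instead of allowing all.
  def call(conn, _token), do: deny(conn)

  defp deny(conn) do
    conn
    |> send_resp(401, "unauthorized")
    |> halt()
  end
end
```

The operator routes could then sit in their own scope behind a pipeline that includes this plug, while the signed webhook routes stay on the plain `:api` pipeline. Phoenix may evaluate `init/1` at compile time, so a real deployment would likely read the token at request time instead.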
    # The mTLS-verified CN nginx forwards is authoritative; the connect param is
    # the dev/test fallback when the socket is not behind mTLS.
    defp worker_id(params, connect_info) do
      header_cn(connect_info) || empty_to_nil(params["worker_id"])
    end
Worker socket falls back to self-asserted identity
The socket documentation says production identity comes from the mTLS-forwarded x-worker-cn, but the implementation accepts the worker_id query parameter whenever that header is absent. If /worker is reachable without the proxy header, any client can register as a worker with chosen labels/address and receive provision payloads containing runtime env and bot tokens. Make the query-param fallback explicit to dev/test or require a signed token/mTLS identity in all production connections.
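Making the fallback explicit could be sketched like this. `WorkerIdentitySketch` and the `allow_param_fallback?` flag are hypothetical; the flag stands in for whatever dev/test-only setting the runtime would expose, and the header CN is passed in already extracted.

```elixir
defmodule WorkerIdentitySketch do
  @moduledoc false

  # Returns {:ok, id} or :error. The query-param identity is honoured only
  # when the caller opts in (assumed to happen in dev/test config only).
  @spec worker_id(String.t() | nil, map(), boolean()) :: {:ok, String.t()} | :error
  def worker_id(header_cn, params, allow_param_fallback?) do
    cond do
      # The proxy-forwarded mTLS CN always wins when present.
      present?(header_cn) -> {:ok, header_cn}
      # Self-asserted identity, accepted only with the explicit opt-in.
      allow_param_fallback? and present?(params["worker_id"]) -> {:ok, params["worker_id"]}
      # Fail closed: no verified identity, no connection.
      true -> :error
    end
  end

  defp present?(value), do: is_binary(value) and value != ""
end
```

With the flag off, a connection that arrives without the forwarded CN is rejected, whatever `worker_id` it claims.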
    timestamp = conn |> Plug.Conn.get_req_header("x-slack-request-timestamp") |> List.first()
    provided = conn |> Plug.Conn.get_req_header("x-slack-signature") |> List.first()
    expected = expected_signature(secret, timestamp, conn.assigns.raw_body)

    cond do
      is_nil(timestamp) or is_nil(provided) ->
        {:error, :unauthorized, "missing Slack signature headers"}

      byte_size(provided) != byte_size(expected) ->
        {:error, :unauthorized, "signature mismatch"}

      not Plug.Crypto.secure_compare(provided, expected) ->
        {:error, :unauthorized, "signature mismatch"}

      true ->
        :ok
Slack signatures can be replayed indefinitely
The verifier includes Slack's timestamp in the HMAC but never checks its freshness. A captured valid Slack request can be replayed later with the same timestamp/signature and will enqueue another app-mention run once the original run is no longer active. Parse x-slack-request-timestamp and reject requests outside Slack's replay window before accepting the signature.
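The freshness check could be sketched as a small pure function. `SlackReplaySketch` and `fresh?/2` are hypothetical names; the five-minute window follows Slack's published guidance for request verification, and the clock is injectable so the check can be tested.

```elixir
defmodule SlackReplaySketch do
  @moduledoc false

  # Slack's guidance is to reject requests whose timestamp is more than
  # five minutes away from the receiver's clock.
  @max_skew_seconds 300

  # `timestamp` is the raw x-slack-request-timestamp header value (Unix
  # seconds as a string); `now` defaults to the current Unix time.
  @spec fresh?(String.t() | nil, integer()) :: boolean()
  def fresh?(timestamp, now \\ System.system_time(:second))

  def fresh?(timestamp, now) when is_binary(timestamp) do
    case Integer.parse(timestamp) do
      # Accept only a fully numeric header inside the window.
      {ts, ""} -> abs(now - ts) <= @max_skew_seconds
      _ -> false
    end
  end

  # Missing header: treat as stale.
  def fresh?(_timestamp, _now), do: false
end
```

A clause such as `not SlackReplaySketch.fresh?(timestamp) -> {:error, :unauthorized, "stale Slack timestamp"}` placed before the signature comparison would bound how long a captured request stays replayable; the error message here is illustrative.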
Summary
Move the Symphony Elixir runtime into packages/symphony, absorbed from indexable-inc/symphony at c9e709208c3ae161e24f625b9f3808a288c859ed. After symphony#268 moved the room stack (room-server, the Tauri/Svelte UI) into the ix monorepo, the dedicated repo was the Elixir runtime alone; this PR gives it its long-term home in index so the standalone repo can retire. Follows up on the reverted whole-repo subtree attempt (#767 → #779/#780): this time only the Elixir part moves, per the plan agreed in Slack (room → ix, elixir → index).

What this adds
- packages/symphony/: the runtime (elixir/), the engine wire fixtures (contracts/fixtures, kept beside elixir/ because the contract tests resolve ../../contracts), the bundled example pack (workflows/example), bin/run-nix, and docs. The 30 MB of .github demo media, the standalone CI workflows, and the PR template did not move; the pr_body.check mix task that validated that template is deleted as vestigial.
- packages.<sys>.symphony: the launcher, parity with the standalone flake's packages.default (Nushell wrapper exec'ing bin/run-nix with Elixir 1.19/OTP 28, gh, git, openssh, cacert on PATH). Production deploys keep working the same way: stage source, mix deps.get, mix run --no-halt.
- nixosModules.symphony via modules/services/symphony/: the service module, unchanged, auto-discovered under the same attr name ix imports from the symphony flake today.
- checks.x86_64-linux.symphony-elixir: the standalone repo's required lane (mix compile --warnings-as-errors, mix format --check-formatted, mix credo, mix test; 384 tests) as a sandboxed derivation. Deps come from a fetchMixDeps fixed-output derivation; rebar is pinned; the lazy_html C++ NIF (test-only, LiveView HTML assertions) is satisfied by seeding elixir_make's artifact cache with the upstream release tarball, which elixir_make still verifies against the checksum.exs pinned in the dep. The advisory lane (dialyzer, sobelow, deps.audit, coveralls) stays a local make quality run; see packages/symphony/docs/quality.md.
- devShells.<sys>.symphony: parity with the standalone devshell (elixir, erlang, codex, gh, git, openssh).
- … tests/default.nix, symphony group) pinning the module's unit env contract (SYMPHONY_WORKFLOW_PACK default, primary-repo export, ExecStart shape, EnvironmentFile pass-through, hostRuntime gating) that ix's hil deployment and worker module read.

The room-server seam (deliberately untouched)
The symphony flake input stays exactly as #780 restored it: pinned to the last rev that still builds room-server, feeding pkgs.symphony-room-server into images/dev/symphony-codex and its eval tests. room-server's source now lives in the private ix monorepo, so the public image cannot build it once the symphony repo goes away. Resolving that seam (move the image to ix, or have ix layer room-server onto a public base image) is the remaining blocker for retiring the repo; the input comment in flake.nix now says so. Do not nix flake update symphony until then: symphony@main no longer exports room-server, so a bump breaks the image eval.

ix follow-up (after this merges)
- inputs.symphony.packages.<sys>.default → inputs.index.packages.<sys>.symphony in nix/modules/services/host/symphony/module.nix and symphony-runtime/module.nix
- inputs.symphony.nixosModules.symphony → inputs.index.nixosModules.symphony
- inputs.symphony.packages.<sys>.codex → pkgs.codex (the symphony flake's codex output was a plain re-export for pin visibility)
- … symphony input and the inputs.symphony.follows line on its index input
- … symphony input, and archive indexable-inc/symphony

Validation
- nix build .#checks.x86_64-linux.symphony-elixir (384 tests, 0 failures, sandboxed)
- nix build .#ciChecks.x86_64-linux.eval (aggregate includes the new ix-test-symphony and the untouched ix-test-symphony-codex)
- nix build .#packages.x86_64-linux.symphony and .#devShells.x86_64-linux.symphony
- nix eval .#nixosModules --apply builtins.attrNames lists symphony
- nix run .#lint (nixfmt, statix, deadnix, ast-grep, ast-grep-test all green)
- git diff --check clean

🤖 Generated with Claude Code
Note
Move the Symphony Elixir runtime into packages/symphony
- … symphony_elixir v0.2.0) with its own toolchain, Nix derivation, and CI quality gate.
- … RunGraph, Node, Attempt, Store (JSON-backed persistence), Materializer, Graph (ready-node scheduling and failure propagation), Recovery, and a DynamicSupervisor that resumes pending runs after restart.
- … Lexer, Parser, Interpreter, and Schema, that compiles .sym workflow files into IR nodes, with WorkflowCatalog hot-reloading them from disk.
- … /api/v1 with run controls and webhook receivers for GitHub, Linear, and Slack.
- … bin/run-nix entrypoint for production deployment, with a new symphony developer shell in lib/per-system.nix.

Macroscope summarized 8b5445b.