feat(ecs): resolve containerDefinitions[].secrets from SSM / Secrets Manager #1687

Open
hampsterx wants to merge 1 commit into floci-io:main from hampsterx:feat/ecs-container-secrets

Conversation

@hampsterx

Contributor

Closes #1624.

What

Support containerDefinitions[].secrets end to end: parse and round-trip the Secret references on register/describe, and inject the resolved values as environment variables when a task launches. Previously secrets was dropped on register and never resolved, so containers came up without the env vars their task definition declares via secrets.

How

  • Register / describe: EcsJsonHandler parses secrets[] (name + valueFrom) into the container definition and emits them back on RegisterTaskDefinition / DescribeTaskDefinition as references. Resolved values are never returned, matching AWS.
  • Launch: EcsContainerManager resolves each secret through the in-process SsmService / SecretsManagerService, the same local resolution CodeBuildRunner already uses (no extra HTTP hops):
    • SSM parameter by name or full ARN → Parameter.value
    • Secrets Manager full ARN → SecretString
    • a full ARN's own region is used, so cross-region references resolve against the right store instead of the task's region
  • Precedence: task-def environment < resolved secrets < RunTask container-override environment. Overrides still win on a name collision.
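The ARN handling in the bullets above can be sketched roughly as follows. The helper names `arnRegion` and `ssmParameterName` appear in the diff under review, but their bodies are not shown on this page, so everything below is an assumption about how such helpers might work, not the PR's code. In particular, the rule for turning `parameter/...` into a parameter name is a heuristic chosen for this sketch.

```java
/** Hypothetical sketch of the ARN helpers; not the PR's actual code. */
final class SecretArnSketch {

    private SecretArnSketch() {}

    /** Returns the region embedded in a full ARN, or fallbackRegion for bare names. */
    static String arnRegion(String valueFrom, String fallbackRegion) {
        if (valueFrom == null || !valueFrom.startsWith("arn:")) {
            return fallbackRegion;
        }
        // ARN layout: arn:partition:service:region:account-id:resource
        String[] parts = valueFrom.split(":", 6);
        if (parts.length < 6 || parts[3].isEmpty()) {
            return fallbackRegion;
        }
        return parts[3];
    }

    /** Maps an SSM parameter ARN to a parameter name; bare names pass through unchanged. */
    static String ssmParameterName(String valueFrom) {
        if (valueFrom == null || !valueFrom.startsWith("arn:")) {
            return valueFrom;
        }
        String[] parts = valueFrom.split(":", 6);
        if (parts.length < 6 || !parts[5].startsWith("parameter/")) {
            return valueFrom;
        }
        // Drop the "parameter" prefix but keep the leading "/".
        String rest = parts[5].substring("parameter".length());
        // Assumption: a single-level resource ("parameter/my-param") names the
        // simple parameter "my-param"; deeper paths keep their leading slash.
        return rest.indexOf('/', 1) < 0 ? rest.substring(1) : rest;
    }
}
```

With these helpers, a Secrets Manager ARN in `eu-west-1` would resolve against `eu-west-1` even when the task runs in `us-east-1`, while a bare name such as `/app/db/password` falls back to the task's region.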

Failure behavior

Per the design question in #1624, this matches AWS rather than CodeBuild's lenient skip: if a referenced secret/parameter can't be resolved, no container is started and the task goes STOPPED with a ResourceInitializationError stopped reason (naming the offending valueFrom), instead of silently starting without the variable.
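As a rough illustration of the fail-fast behavior, the sketch below resolves every secret for every container before anything is started, and reports a STOPPED result if any lookup fails. All types here (`Container`, `Result`) and the convention that the resolver returns `null` for an unknown `valueFrom` are assumptions made for the example; the PR itself signals failure through an exception.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of a two-phase launch; all types here are stand-ins. */
final class FailFastLaunchSketch {

    /** Container name plus its secrets (env var name -> valueFrom). */
    record Container(String name, Map<String, String> secrets) {}

    /** Outcome of a launch attempt. */
    record Result(String lastStatus, String stoppedReason, List<String> started) {}

    private FailFastLaunchSketch() {}

    /**
     * Phase 1 resolves every secret of every container. Phase 2 "starts"
     * containers only if phase 1 succeeded for all of them.
     * Assumption: the resolver returns null for an unknown valueFrom.
     */
    static Result launch(List<Container> containers, Function<String, String> resolver) {
        Map<String, Map<String, String>> envByContainer = new LinkedHashMap<>();
        for (Container c : containers) {
            Map<String, String> env = new LinkedHashMap<>();
            for (Map.Entry<String, String> secret : c.secrets().entrySet()) {
                String value = resolver.apply(secret.getValue());
                if (value == null) {
                    // Nothing has been started yet, so there is nothing to roll back.
                    return new Result("STOPPED",
                            "ResourceInitializationError: unable to pull secrets or registry auth: "
                                    + secret.getValue(),
                            List.of());
                }
                env.put(secret.getKey(), value);
            }
            envByContainer.put(c.name(), env);
        }
        // Phase 2: a real implementation would create the containers here.
        List<String> started = new ArrayList<>(envByContainer.keySet());
        return new Result("RUNNING", null, started);
    }
}
```

The point of the two phases is that a bad reference in the last container cannot leave earlier containers running.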

Deferred

The Secrets Manager valueFrom selector suffix (:json-key:version-stage:version-id) is not yet parsed; such a valueFrom is resolved as the plain secret. Called out as deferrable in #1624; happy to follow up.
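For reference, a follow-up that parses the selector suffix might look something like the sketch below. It assumes the AWS `valueFrom` layout of a seven-field secret ARN followed by up to three optional, colon-separated fields (`json-key`, `version-stage`, `version-id`), where an empty field means "not specified". The class and record names are hypothetical and nothing here is taken from the PR.

```java
import java.util.Arrays;

/** Hypothetical sketch of selector parsing; not part of the PR. */
final class SecretSelectorSketch {

    /** Plain secret ARN plus optional selector fields (null when absent or empty). */
    record Selector(String secretArn, String jsonKey, String versionStage, String versionId) {}

    private SecretSelectorSketch() {}

    static Selector parse(String valueFrom) {
        // limit -1 keeps trailing empty fields, e.g. "...:secret:db-AbCdEf:password::"
        String[] p = valueFrom.split(":", -1);
        boolean looksLikeSecretArn = p.length >= 7
                && "arn".equals(p[0])
                && "secretsmanager".equals(p[2])
                && "secret".equals(p[5]);
        if (!looksLikeSecretArn || p.length > 10) {
            throw new IllegalArgumentException("not a Secrets Manager valueFrom: " + valueFrom);
        }
        // Fields 0..6 form the plain ARN: arn:partition:secretsmanager:region:account:secret:name
        String arn = String.join(":", Arrays.copyOfRange(p, 0, 7));
        return new Selector(arn, field(p, 7), field(p, 8), field(p, 9));
    }

    /** Returns the field at index i, or null if it is missing or empty. */
    private static String field(String[] p, int i) {
        return (i < p.length && !p[i].isEmpty()) ? p[i] : null;
    }
}
```

A plain ARN parses to a `Selector` whose three optional fields are all `null`, which corresponds to the current "resolved as the plain secret" behavior.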

Test plan

  • EcsIntegrationTest (register/describe round-trip of secrets)
  • EcsContainerManagerSecretsTest (SSM name + SM ARN resolution, override-wins, SSM-ARN name parsing, cross-region ARN region, unresolved → propagates with no container created)
  • EcsServiceContainerSecretsTest (unresolved secret → task STOPPED with ResourceInitializationError reason)
  • EcsTests Java SDK compatibility: register + describe round-trips the Secret
  • Full ECS suite green locally (./mvnw test -Dtest='Ecs*')

RegisterTaskDefinition now parses and stores containerDefinitions[].secrets
(name + valueFrom) and round-trips them on register/describe as references,
matching AWS (the resolved values are never returned).

At RunTask launch, EcsContainerManager resolves each secret to an environment
variable through the in-process SsmService / SecretsManagerService (the same
local resolution CodeBuildRunner already uses, no extra HTTP hops):

- SSM parameter by name or full ARN -> Parameter.value
- Secrets Manager full ARN -> SecretString
- a full ARN's own region is used, so cross-region references resolve correctly
- env precedence: task-def environment < resolved secrets < RunTask container
  override environment (override still wins on a name collision)

When a referenced secret/parameter cannot be resolved the launch fails the way
AWS does: no container is started and the task goes STOPPED with a
ResourceInitializationError stoppedReason, rather than silently starting the
container without the variable.

Deferred (noted in floci-io#1624): the Secrets Manager valueFrom selector
suffix (:json-key:version-stage:version-id) is not yet parsed; such a valueFrom
is resolved as the plain secret.

Adds handler round-trip tests, container-manager resolution / override /
cross-region unit tests, a stopped-task failure test, and Java SDK
compatibility coverage.
@greptile-apps

greptile-apps Bot commented Jul 2, 2026

Greptile Summary

This PR implements end-to-end support for containerDefinitions[].secrets in Floci's ECS emulator: secrets are now parsed and round-tripped through register/describe, and resolved at task launch via the in-process SSM and Secrets Manager services. Unresolvable secrets fail fast before any container starts, matching AWS's ResourceInitializationError behaviour.

  • Parse/emit: EcsJsonHandler reads secrets[] from the request JSON and emits them back unchanged on register and describe, never returning resolved values (matching the AWS wire format).
  • Resolution at launch: EcsContainerManager.buildEnvVars resolves each secret through SsmService/SecretsManagerService before the container loop starts; any failure rethrows as ResourceInitializationError and no containers are created.
  • Precedence: task-def environment < resolved secrets < RunTask container-override environment, enforced via ordered LinkedHashMap insertion.
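The precedence rule can be illustrated with a minimal sketch. The method name `buildEnvVars` comes from the summary above, but the signature and body below are assumptions for illustration only. Note that in a `LinkedHashMap` a later `put` for an existing key replaces the value while keeping the key's original position, so it is the order of the three `putAll` calls that determines which source wins.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of env precedence; not the PR's actual code. */
final class EnvPrecedenceSketch {

    private EnvPrecedenceSketch() {}

    /** Merges the three sources, lowest precedence first, into KEY=value strings. */
    static List<String> buildEnvVars(Map<String, String> taskDefEnv,
                                     Map<String, String> resolvedSecrets,
                                     Map<String, String> overrideEnv) {
        Map<String, String> merged = new LinkedHashMap<>();
        merged.putAll(taskDefEnv);      // lowest precedence
        merged.putAll(resolvedSecrets); // replaces task-def values on a name collision
        merged.putAll(overrideEnv);     // RunTask overrides win over everything

        List<String> env = new ArrayList<>();
        for (Map.Entry<String, String> e : merged.entrySet()) {
            env.add(e.getKey() + "=" + e.getValue());
        }
        return env;
    }
}
```

For example, if the task definition sets `DB_PASSWORD`, a secret also named `DB_PASSWORD` replaces it, and a RunTask override with the same name replaces both.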

Confidence Score: 3/5

The core register/describe/parse path is solid, but the secret resolution at launch has a gap that can silently inject a wrong value into a running container.

When a Secrets Manager secret is stored as binary-only (no SecretString), getSecretString() returns null. That null flows through buildEnvVars and is formatted as the literal four-character string 'null' in the container environment instead of halting launch with ResourceInitializationError. The same catch (AwsException e) block also won't intercept a NullPointerException if getSecretValue() itself returns null, so the unchecked exception bypasses the error-wrapping path entirely. Both cases contradict the explicit fail-fast design goal of the PR.

src/main/java/io/github/hectorvent/floci/services/ecs/container/EcsContainerManager.java — specifically the resolveSecretValue method and the null-safety of its return value in buildEnvVars.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/main/java/io/github/hectorvent/floci/services/ecs/container/EcsContainerManager.java | Adds secret resolution at task launch; the fail-fast pre-resolution loop and cross-region ARN handling are correct, but null SecretString from a binary Secrets Manager secret is silently formatted as the literal string "null" rather than raising ResourceInitializationError. |
| src/main/java/io/github/hectorvent/floci/services/ecs/EcsJsonHandler.java | Adds secrets[] serialisation/deserialisation; parse and emit are symmetric and never leak resolved values, matching AWS behaviour. |
| src/main/java/io/github/hectorvent/floci/services/ecs/model/Secret.java | New record type for secret references; minimal and correct. |
| src/main/java/io/github/hectorvent/floci/services/ecs/model/ContainerDefinition.java | Adds `List<Secret>` secrets field with getter/setter; no issues. |
| src/test/java/io/github/hectorvent/floci/services/ecs/container/EcsContainerManagerSecretsTest.java | Good unit test coverage for SSM name, SM ARN, cross-region ARN, override precedence, and unresolved secret; does not cover binary (null SecretString) case. |
| src/test/java/io/github/hectorvent/floci/services/ecs/EcsServiceContainerSecretsTest.java | Integration-style test verifying STOPPED task with ResourceInitializationError when containerManager.startTask() throws AwsException. |
| src/test/java/io/github/hectorvent/floci/services/ecs/EcsIntegrationTest.java | Adds register/describe round-trip test for secrets; order renumbering is consistent with the insertion of the new `@Order(16)` test. |
| compatibility-tests/sdk-test-java/src/test/java/com/floci/test/EcsTests.java | Adds SDK-level register+describe round-trip for secrets; straightforward and correct. |

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Client
    participant EcsJsonHandler
    participant EcsService
    participant EcsContainerManager
    participant SsmService
    participant SecretsManagerService
    participant Docker

    Client->>EcsJsonHandler: RunTask (task def with secrets[])
    EcsJsonHandler->>EcsService: runTask(...)
    EcsService->>EcsContainerManager: startTask(task, taskDef, overrides, region)

    Note over EcsContainerManager: Pre-resolution pass (fail-fast)
    loop For each ContainerDefinition
        EcsContainerManager->>EcsContainerManager: buildEnvVars(def, override, region)
        loop For each secret
            alt valueFrom starts with arn:aws:secretsmanager:
                EcsContainerManager->>SecretsManagerService: getSecretValue(arn, region)
                SecretsManagerService-->>EcsContainerManager: SecretVersion.getSecretString()
            else SSM name or ARN
                EcsContainerManager->>SsmService: getParameter(name, region)
                SsmService-->>EcsContainerManager: Parameter.getValue()
            end
        end
    end

    alt Any secret resolution fails
        EcsContainerManager-->>EcsService: throws AwsException (ResourceInitializationError)
        EcsService-->>Client: task STOPPED with stoppedReason
    else All secrets resolved
        loop For each ContainerDefinition
            EcsContainerManager->>Docker: createAndStart(containerSpec with env vars)
            Docker-->>EcsContainerManager: ContainerInfo
        end
        EcsContainerManager-->>EcsService: EcsTaskHandle
        EcsService-->>Client: tasks[] RUNNING
    end

Reviews (1): Last reviewed commit: "feat(ecs): resolve containerDefinitions[..." | Re-trigger Greptile

Comment on lines +359 to +379
private String resolveSecretValue(String valueFrom, String region) {
    // A full ARN carries its own region; use it so cross-region references resolve
    // against the right store instead of the task's region. Bare SSM names fall back
    // to the task region.
    String secretRegion = arnRegion(valueFrom, region);
    try {
        if (valueFrom != null && valueFrom.startsWith("arn:aws:secretsmanager:")) {
            // The ECS `:json-key:version-stage:version-id` selector suffix on a
            // Secrets Manager valueFrom is not yet supported: the value is resolved
            // as the plain secret (ARN -> SecretString). See floci-io/floci#1624.
            return secretsManagerService.getSecretValue(valueFrom, null, null, secretRegion).getSecretString();
        }
        String parameterName = ssmParameterName(valueFrom);
        return ssmService.getParameter(parameterName, secretRegion).getValue();
    } catch (AwsException e) {
        throw new AwsException(e.getErrorCode(),
                "ResourceInitializationError: unable to pull secrets or registry auth: "
                        + valueFrom + ": " + e.getMessage(),
                e.getHttpStatus());
    }
}

P1 Binary secret silently injects literal "null" as env var

secretsManagerService.getSecretValue(...) can return a SecretVersion whose getSecretString() is null when the secret was stored as binary-only. In that case resolveSecretValue returns null, which propagates to buildEnvVars where it is formatted as entry.getKey() + "=" + entry.getValue() — producing "DB_PASSWORD=null" (the literal four-character string) in the container environment. The design contract states a bad secret must halt launch with ResourceInitializationError, but this path starts the container silently with a wrong value.

Additionally, the catch (AwsException e) block won't intercept a NullPointerException thrown if getSecretValue() itself returns null — that unchecked exception would bypass the ResourceInitializationError wrapping and propagate raw out of startTask.
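One possible shape for a guard that closes both gaps is sketched below. The `SecretVersion` record and `ResolutionException` class are hypothetical stand-ins for the project's own types (whose constructors and accessors are not fully visible on this page), and the wording of the error message is an assumption; the idea is only that a null result or a null SecretString is turned into the same ResourceInitializationError path instead of reaching the container environment.

```java
/** Hypothetical sketch of a null guard for secret resolution; not the PR's code. */
final class NullSafeSecretSketch {

    /** Stand-in for the Secrets Manager result; secretString is null for binary-only secrets. */
    record SecretVersion(String secretString) {}

    /** Stand-in for the project's AwsException. */
    static final class ResolutionException extends RuntimeException {
        ResolutionException(String message) {
            super(message);
        }
    }

    private NullSafeSecretSketch() {}

    /** Returns the SecretString, or fails the launch if there is none to inject. */
    static String requireSecretString(SecretVersion version, String valueFrom) {
        if (version == null || version.secretString() == null) {
            // Covers both a null lookup result and a binary-only secret.
            throw new ResolutionException(
                    "ResourceInitializationError: unable to pull secrets or registry auth: "
                            + valueFrom + ": secret has no SecretString");
        }
        return version.secretString();
    }
}
```

Calling a guard like this on the lookup result before returning from `resolveSecretValue` would keep the literal `"null"` out of the environment and avoid a raw `NullPointerException` escaping `startTask`.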

Development

Successfully merging this pull request may close these issues.

[FEAT] ECS: inject containerDefinitions[].secrets from Secrets Manager / SSM
