feat(controller): credential-injector follow-ups — TLS, GH_TOKEN signal, CD fixes#356
Closed
pilartomas wants to merge 9 commits into main
First slice of the ADR-033 rollout. Add a per-instance opt-in flag (`experimentalCredentialInjector`) that, when enabled, replaces OneCLI's egress path for that pod with an Envoy sidecar.
- Controller: branch BuildStatefulSet on the flag. Render a per-instance Envoy bootstrap ConfigMap, mount owner-scoped credential Secrets into the sidecar only, drop the agent's `ONECLI_ACCESS_TOKEN`, point `HTTP(S)_PROXY` at `127.0.0.1:<EnvoyPort>`, and hard-code `automountServiceAccountToken: false` and `shareProcessNamespace: false` per the ADR-033 threat model. NetworkPolicy drops the OneCLI peer and allows TCP 443/80 egress when the flag is on.
- API server: dual-write user-typed secrets (generic + Anthropic) to K8s Secrets labelled with the owner's sub. The OneCLI write path is unchanged; existing OneCLI-only secrets are not migrated, so flagged instances only see secrets created after this lands.
- UI: checkbox in the Add Agent dialog (configure step) and a new Experimental section in the configuration panel for toggling on existing instances.
- Helm: `controller.envoyImage` / `controller.envoyPort` defaults.
OAuth app connections, HITL, refresh-token loop, gVisor enforcement, and OneCLI removal stay out of scope per the issue.
Closes #337
Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
The K8s mirror used a non-existent `headerPrefix` field on `InjectionConfig`
(the real field is `valueFormat: "Bearer {value}"`). Two consequences:
1. Anthropic api-key secrets were mirrored with `Authorization: Bearer <key>`
instead of `x-api-key: <key>`, so the upstream would reject them.
2. Generic secrets with a custom `valueFormat` (e.g. `Token {value}`) were
ignored — every mirrored secret got `Bearer <value>` regardless.
Fix:
- Replace `headerPrefix` with the actual `valueFormat` template, applied
via `{value}` substitution before writing the credential file.
- Special-case Anthropic in `resolveInjection`: read OneCLI's
`metadata.authMode` from the create response and pick `x-api-key`
(api-key) or `Authorization: Bearer` (oauth, default).
- Persist `humr.ai/auth-mode` and `humr.ai/injection-value-format`
annotations so updates can recompute correctly without re-fetching the
injection config.
`mise run check` does not run tsc on api-server; per-package CI catches
this. Add a unit test suite for the K8s port so the contract is locked
in.
Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
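A minimal sketch of the fix described above, assuming a simplified injection shape. The `Injection` interface, `applyValueFormat`, and `resolveAnthropicInjection` are hypothetical names for illustration (the commit only names `resolveInjection`, `valueFormat`, and `metadata.authMode`), and the literal `"api-key"` auth-mode string is an assumption.

```typescript
// Illustrative only: shapes and names are assumptions, not the api-server's code.
interface Injection {
  headerName: string;
  valueFormat: string; // template containing the literal placeholder "{value}"
}

// Substitute every "{value}" placeholder with the secret.
// split/join is used instead of String.prototype.replace so that sequences
// such as "$&" inside the secret are kept as literal characters.
function applyValueFormat(valueFormat: string, value: string): string {
  return valueFormat.split("{value}").join(value);
}

// Hypothetical Anthropic special case: "api-key" maps to the x-api-key header;
// anything else (including undefined) falls back to the OAuth bearer form.
function resolveAnthropicInjection(authMode: string | undefined): Injection {
  if (authMode === "api-key") {
    return { headerName: "x-api-key", valueFormat: "{value}" };
  }
  return { headerName: "Authorization", valueFormat: "Bearer {value}" };
}
```

With this shape, a generic secret with `Token {value}` keeps its custom prefix, and an Anthropic api-key secret is written without any `Bearer` prefix.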
…idecar
The standard envoyproxy/envoy image runs as root by default, which conflicts with the sidecar's runAsNonRoot: true security context, so the container fails to start. envoyproxy/envoy-distroless ships with USER set to a non-root account, so it satisfies the policy.
Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
…-host dispatch
Envoy's credential_injector filter rejects virtual-host/route specific
config ("doesn't support virtual host or route specific configurations"),
which the previous bootstrap depended on. Move per-Secret injection into
an envoy.filters.http.composite filter wrapped with ExtensionWithMatcher,
dispatching by :authority. Each Secret becomes one entry in the matcher
map, selecting its own credential_injector instance.
Also add a node id/cluster — the SDS path_config_source for the per-route
credential file requires both, even for file-based sources.
Validated with 'envoy --mode validate' against a rendered bootstrap.
Note: HTTPS-via-CONNECT traffic is encrypted inside the tunnel, so
header injection is currently a no-op for HTTPS upstreams. Adding TLS
interception is tracked as the next slice.
Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
…injector
Closes the two follow-ups from the previous slice:
1. HTTPS injection is no longer a no-op. Envoy now terminates the agent's
TLS using a per-instance leaf cert signed by a cluster-wide MITM CA, runs
credential_injector on the plaintext HTTP, and re-originates upstream TLS
to the real host. SNI-miss requests pass through unmolested via
sni_dynamic_forward_proxy.
2. The fetch-ca-cert init container is no longer required on the experimental
path. The agent's CA volume is now projected from the leaf Secret (only
ca.crt is exposed; tls.key stays in the sidecar — the credential boundary
between agent and sidecar is preserved).
Mechanics:
- Helm: adds a self-signed bootstrap ClusterIssuer, an isCA Certificate that
produces the humr-mitm-ca Secret in cert-manager's
cluster-resource-namespace, and a CA ClusterIssuer that signs leaves.
Gated behind controller.envoyMitm.enabled (default true).
- Controller: cert-manager.io/v1 types vendored; reconciler builds a per-
instance Certificate (DNSNames = deduped Secret host-patterns, signed by
humr-mitm-ca-issuer) and applies via dynamic client. cert-manager produces
the {instance}-envoy-tls Secret asynchronously.
- Envoy bootstrap: outer CONNECT listener tunnels into an internal listener
via envoy.bootstrap.internal_listener; tls_inspector + per-SNI filter
chains terminate TLS using files mounted from the leaf Secret. HCM inside
each chain runs credential_injector + dynamic_forward_proxy, with upstream
TLS validated against the system CA bundle shipped in envoy-distroless.
Verified end-to-end on local k3s:
- Pod boots cleanly (no fetch-ca-cert hang).
- curl https://api.anthropic.com/v1/models from the agent shows MITM cert
issuer 'CN=humr MITM CA' with SAN api.anthropic.com, and the request reaches
Anthropic with the injected Authorization header (Anthropic returns 401
with a token-type-specific error, not a 'no key' error).
- curl https://httpbin.org/anything (no Secret) passes through unmodified;
no Authorization header in the echoed request.
The credential file path now points at the SDS DiscoveryResponse the
api-server already writes (sds.yaml key) instead of the raw 'value' file —
path_config_source expects an SDS resource, not bare bytes.
Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
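The "DNSNames = deduped Secret host-patterns" step can be pictured with a short sketch. The controller itself is written in Go; the TypeScript below is only an illustration, `dedupeHostPatterns` is a hypothetical name, and sorting the result for a stable Certificate spec is an assumption rather than something the commit states.

```typescript
// Illustrative only: collapse repeated host-patterns into one DNS name each.
// Sorting is an assumption, made here so the output does not depend on the
// order in which Secrets are listed.
function dedupeHostPatterns(hostPatterns: string[]): string[] {
  return hostPatterns
    .filter((host, index) => hostPatterns.indexOf(host) === index)
    .sort();
}
```

Two Secrets that both target `github.com` would then yield a single SAN on the per-instance leaf certificate.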
Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
Six findings from #346 review:
* secrets-service: replace ad-hoc `console.warn` in mirrorToK8s with a stable token (`k8s-mirror-failed`) and structured payload (op, secretId, error). Log scrapers can now alert on broken K8s mirroring (which silently breaks Envoy injection on the experimental path) without parsing free-form text.
* k8s-secrets-port: drop the defensive `.toLowerCase()` from k8sSecretName and validate the ID up-front against RFC 1123. Two IDs differing only in case can no longer silently overwrite each other; an invalid ID throws (caught by mirrorToK8s, not propagated to OneCLI).
* controller: when ExperimentalCredentialInjector is on, log a warning if no GitHub credential Secret is attached. The OneCLI GH_TOKEN sentinel is dropped on this path, so without a BYO credential gh/octokit silently lose auth; this surfaces it in operator logs.
* resources_test: TestBuildStatefulSet_FlagOn_AddsEnvoySidecar now uses a non-empty credentialSecrets slice and asserts that volume + mount names match what the bootstrap template references (`cred-<name>`, `/etc/envoy/credentials/<name>`, `/etc/envoy/tls`).
* secrets-service.test (new): unit tests verifying create/update/delete resolve successfully when the K8s mirror throws, that the failure is logged with the structured payload, and that the mirror is skipped entirely when k8sPort is undefined.
* platform-topology.md: rewrite the credential-isolation invariant to cover both paths (OneCLI MITM, Envoy sidecar with per-instance leaf) and add ADR-033 to Motivated by. The previous wording asserted agents never hold upstream credentials, but elided that the experimental path achieves this differently.
Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
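The first two findings could look roughly like the sketch below. `validateSecretId` and `describeMirrorFailure` are hypothetical helper names, and checking against the RFC 1123 label form (rather than the subdomain form) is an assumption, since the commit only says "RFC 1123". The token `k8s-mirror-failed` and the payload fields (op, secretId, error) come from the commit message.

```typescript
// Illustrative only: helper names and the label-vs-subdomain choice are assumptions.
const RFC1123_LABEL = /^[a-z0-9]([-a-z0-9]*[a-z0-9])?$/;

// Reject the ID instead of lowercasing it, so "Foo" and "foo" can never
// collapse onto the same Secret name.
function validateSecretId(id: string): string {
  if (id.length === 0 || id.length > 63 || !RFC1123_LABEL.test(id)) {
    throw new Error("secret id is not a valid RFC 1123 label: " + JSON.stringify(id));
  }
  return id;
}

type MirrorOp = "create" | "update" | "delete";

interface MirrorFailure {
  op: MirrorOp;
  secretId: string;
  error: string;
}

// Stable token that log scrapers can match without parsing free-form text.
const MIRROR_FAILED_TOKEN = "k8s-mirror-failed";

// Build the structured payload that accompanies the token.
function describeMirrorFailure(op: MirrorOp, secretId: string, err: unknown): MirrorFailure {
  const error = err instanceof Error ? err.message : String(err);
  return { op, secretId, error };
}
```

A caller would catch the thrown error inside the mirror step and emit `MIRROR_FAILED_TOKEN` together with the payload, leaving the OneCLI write unaffected.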
Follow-up to the previous review pass — operator-side log warnings weren't
enough; the agent itself needs a signal so GH_TOKEN-aware tooling (gh CLI,
octokit, wrapper scripts) can short-circuit instead of failing on a
mid-request 401.
When ExperimentalCredentialInjector=true:
- Set HUMR_GH_TOKEN_AVAILABLE="true"|"false" on the agent container env.
"true" iff the owner has a credential Secret with host-pattern
github.com or api.github.com (Envoy will inject Authorization on the
wire); "false" otherwise.
- Mirror the same value to a pod annotation
humr.ai/gh-token-available — operators can grep for the missing case
via 'kubectl get pods -o jsonpath="{...annotations.humr\.ai/gh-token-available}"'
without poking inside the container.
Off the experimental path, neither is set (the OneCLI sentinel mechanism
is unchanged). Tests cover both flag-on cases and confirm flag-off stays
clean. security-and-credentials.md documents the signal.
Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
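A sketch of both sides of the signal, assuming exact string matches on the host-pattern. The controller is Go, so `ghTokenAvailable` is only a TypeScript illustration of the stated rule; `ghAuthState` is a hypothetical agent-side helper that a wrapper script might use before calling gh or octokit.

```typescript
// Illustrative only: function names and exact-match semantics are assumptions.
type GhAuthState = "available" | "missing" | "unknown";

const GITHUB_HOSTS = ["github.com", "api.github.com"];

// Producer side: "true" iff any credential Secret targets a GitHub host.
function ghTokenAvailable(hostPatterns: string[]): "true" | "false" {
  const hasGitHub = hostPatterns.some((host) => GITHUB_HOSTS.indexOf(host) !== -1);
  return hasGitHub ? "true" : "false";
}

// Consumer side: interpret HUMR_GH_TOKEN_AVAILABLE from the agent's environment.
// An unset variable means the pod is off the experimental path, where the
// OneCLI sentinel mechanism still applies, so the state is reported as unknown.
function ghAuthState(env: Record<string, string | undefined>): GhAuthState {
  const raw = env["HUMR_GH_TOKEN_AVAILABLE"];
  if (raw === "true") return "available";
  if (raw === "false") return "missing";
  return "unknown";
}
```

In a Node-based wrapper, `ghAuthState(process.env)` returning `"missing"` would be the cue to skip the GitHub call and report the absent credential up front.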
…otations
- Envoy sidecar image bumped from envoyproxy/envoy-distroless:v1.32.0 (2024-10-15, blocked by StackRox >1y policy) to envoyproxy/envoy:distroless-v1.37.2 (2026-04-10). Upstream stopped publishing to envoyproxy/envoy-distroless; distroless variants now live under envoyproxy/envoy:distroless-* tags.
- New controller.agentPodAnnotations Helm value, propagated to the controller via AGENT_POD_ANNOTATIONS (JSON) and stamped on every agent pod. Lets operators attach admission-webhook break-glass annotations (e.g. admission.stackrox.io/break-glass) without a code change next time a cluster policy fires.
Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
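For the `AGENT_POD_ANNOTATIONS` hand-off, a rough sketch of decoding the JSON value into an annotation map follows. The controller does this in Go; `parseAgentPodAnnotations` is a hypothetical name, and treating an unset or empty variable as "no extra annotations" is an assumption. Kubernetes annotation values are strings, which is why non-string values are rejected here.

```typescript
// Illustrative only: not the controller's implementation.
function parseAgentPodAnnotations(raw: string | undefined): Record<string, string> {
  // Assumption: unset or blank means there is nothing to stamp.
  if (raw === undefined || raw.trim() === "") {
    return {};
  }
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("AGENT_POD_ANNOTATIONS must be a JSON object");
  }
  const source = parsed as Record<string, unknown>;
  const annotations: Record<string, string> = {};
  for (const key of Object.keys(source)) {
    const value = source[key];
    if (typeof value !== "string") {
      throw new Error("annotation " + key + " must have a string value");
    }
    annotations[key] = value;
  }
  return annotations;
}
```

Failing fast on a malformed value keeps a typo in the Helm values from silently producing pods without the expected break-glass annotation.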
Reopening on a clean branch off main.
Summary
Follow-up work on the experimental Envoy credential injector merged in #346.
- Envoy sidecar image bumped to `envoyproxy/envoy:distroless-v1.37.2`. The previous tag (`envoyproxy/envoy-distroless:v1.32.0`, 2024-10-15) tripped the StackRox >1y stale-image policy and blocked StatefulSet creation in the IBM Cloud cluster. Upstream stopped pushing to `envoyproxy/envoy-distroless`; distroless variants now ship under `envoyproxy/envoy:distroless-*`.
- New `controller.agentPodAnnotations` value, propagated via `AGENT_POD_ANNOTATIONS` (JSON) and stamped on every agent pod. Lets operators attach admission-webhook break-glass annotations (e.g. `admission.stackrox.io/break-glass`) without a code change next time a cluster policy fires.
- `humr.ai/gh-token-available` annotation + `HUMR_GH_TOKEN_AVAILABLE` env var on agents using the experimental path, so wrapper scripts can short-circuit instead of probing for a 401.
- `credential_injector` moved into a composite filter so per-host dispatch works.
Test plan
- `mise run check` passes
- `mise run test` passes
- With `experimentalCredentialInjector: true`, confirm Envoy sidecar pulls and runs
- `kubectl get pod ... -o yaml` shows `humr.ai/gh-token-available` annotation