Skip to content

feat(controller,ui): experimental Envoy credential injector behind per-instance opt-in flag#346

Merged
pilartomas merged 8 commits intomainfrom
feat/experimental-credential-injector
Apr 28, 2026
Merged

feat(controller,ui): experimental Envoy credential injector behind per-instance opt-in flag#346
pilartomas merged 8 commits intomainfrom
feat/experimental-credential-injector

Conversation

@pilartomas
Copy link
Copy Markdown
Contributor

@pilartomas pilartomas commented Apr 28, 2026

Summary

First slice of the ADR-033 rollout. Add a per-instance opt-in flag (experimentalCredentialInjector) that, when enabled, replaces OneCLI's egress path for that pod with an Envoy sidecar. Existing instances are untouched; off-by-default.

User-clarified scope:

  • No migration. Existing OneCLI-only secrets stay in OneCLI. The api-server's secret-creation path now dual-writes new generic/Anthropic secrets to K8s so the sidecar can discover them. Flagged instances only see secrets created after this lands.
  • Create-instance UI seam lives in add-agent-dialog.tsx (the useCreateInstance hook has no caller today; the create flow runs through useCreateAgent). A live "Experimental" section was added to the configuration panel so users can also toggle the flag on existing instances.

What changed

Data model & contract

  • Go InstanceSpec.ExperimentalCredentialInjector
  • TS Instance / CreateInstanceInput / UpdateInstanceInput + zod
  • Round-trip through instances-service → ConfigMap YAML → parseInfraInstanceassembleInstance

Controller — pod rendering

  • BuildStatefulSet branches on the flag: drops ONECLI_ACCESS_TOKEN + GH_TOKEN sentinel, points HTTP(S)_PROXY at 127.0.0.1:<EnvoyPort>, hard-codes automountServiceAccountToken: false and shareProcessNamespace: false (ADR-033 threat model).
  • New pkg/reconciler/envoy.go: bootstrap YAML renderer, owner-scoped Secret discovery, sidecar container, sidecar-only volume mounts.
  • BuildNetworkPolicy drops the OneCLI peer and allows TCP 443/80 egress when the flag is on. applyNetworkPolicy now updates on diff so toggles propagate.

Envoy bootstrap (TLS interception)

  • Outer listener accepts the agent's CONNECT and tunnels into an internal listener via envoy.bootstrap.internal_listener. The internal listener uses tls_inspector and per-SNI filter chains that terminate the agent's TLS using a per-instance leaf cert, run credential_injector on the now-plaintext HTTP, and re-originate upstream TLS to the real host. SNI-miss requests pass through unmolested via sni_dynamic_forward_proxy.
  • credential_injector doesn't allow per-route config in 1.32, so each TLS-terminating filter chain has its own HCM with its own credential_injector instance — no composite wrapper needed.
  • Upstream TLS validation uses the system CA bundle shipped in the envoy-distroless image.

MITM CA via cert-manager

  • New Helm templates (templates/cert-manager/) bootstrap a self-signed ClusterIssuer, an isCA: true Certificate whose Secret backs humr-mitm-ca-issuer, and the CA ClusterIssuer itself. Gated behind controller.envoyMitm.enabled (default true).
  • Per-instance leaf Certificate ({instance}-envoy-tls) signed by humr-mitm-ca-issuer, with dnsNames = the deduped host-pattern list from the owner's credential Secrets. Controller applies it via the dynamic client. cert-manager produces the leaf Secret asynchronously.
  • The agent's ca-cert volume on the experimental path is now projected from the leaf Secret with items: [{key: ca.crt}] — only the CA cert is exposed; tls.key stays in the sidecar's mount, preserving the credential boundary. The fetch-ca-cert init container is no longer rendered when the flag is on.

API server — secrets dual-write

UI

  • Checkbox in Add Agent dialog (configure step), threaded through useCreateAgentplatform.instances.create.mutate.
  • New experimental-panel.tsx wired into the configuration panel as an "Experimental" section.

Helm

  • controller.envoyImage (now envoyproxy/envoy-distroless:v1.32.0) / controller.envoyPort defaults; passed through the controller deployment env.
  • controller.envoyMitm.{enabled,bootstrapIssuerName,caIssuerName,caSecretNamespace,caSecretName,caDuration,caRenewBefore,leafDuration,leafRenewBefore} for the cert-manager wiring.
  • Controller RBAC extended for cert-manager.io/certificates.

Verified end-to-end on local k3s

  • Pod boots cleanly (no fetch-ca-cert hang), envoy + agent both Running.
  • curl https://api.anthropic.com/v1/models from the agent shows the leaf cert (CN=humr MITM CA, SAN api.anthropic.com), validated against /etc/humr/ca/ca.crt. Anthropic returns a meaningful 401 (OAuth authentication is currently not supported), confirming the Authorization: Bearer … header was injected.
  • curl https://httpbin.org/anything (no Secret) passes through unmodified — no Authorization header in the echoed request, confirming SNI-miss passthrough.

Known limitations

  • Adding a Secret while an instance is running does not propagate to the running pod. The controller's informer only watches ConfigMaps, not Secrets, and the api-server doesn't poke the instance ConfigMap after a Secret write. Even if it did, the Envoy bootstrap is static at process start and adding a per-host filter chain requires re-rolling the pod (which would race with cert-manager re-issuing the leaf with the new SAN list). Until this is plumbed, the user has to manually restart the pod after adding a Secret. Tracked for a follow-up: Secret informer in the controller + Certificate-readiness gate before re-rolling the StatefulSet.

Out of scope (per #337)

OAuth app connections, HITL ext_authz, refresh-token loop, streaming xDS, gVisor / RuntimeClass, audit logging, ADR-027 fork pods, OneCLI removal — tracked as follow-ups.

Test plan

  • mise run check (lint, tsc, go build/vet, helm lint, helm render via kubeconform) — green
  • mise run test — controller, api-server, ui, agent-runtime all green
  • New unit tests:
    • BuildStatefulSet flag-on adds Envoy sidecar, scrubs OneCLI envs, skips fetch-ca-cert init, projects ca.crt only into the agent
    • BuildStatefulSet flag-off unchanged
    • Sidecar-only credential + leaf-TLS volume mounts
    • Pod hardening (automountServiceAccountToken: false, shareProcessNamespace: false)
    • BuildNetworkPolicy drops OneCLI peer and adds 443/80 egress
    • Envoy bootstrap renders the internal listener, tls_inspector, per-SNI termination, dynamic_forward_proxy_https, sni_dynamic_forward_proxy passthrough
    • BuildEnvoyLeafCertificate shape (sorted/deduped DNSNames, IssuerRef, SecretName, owner ref, returns nil on no Secrets)
    • assembleInstance round-trips the flag
  • envoy --mode validate against rendered bootstrap returns OK
  • Live cluster: pod boots, MITM cert chain validated by agent, header injection observed at upstream, SNI-miss passthrough observed

Closes #337

First slice of the ADR-033 rollout. Add a per-instance opt-in flag
(`experimentalCredentialInjector`) that, when enabled, replaces OneCLI's
egress path for that pod with an Envoy sidecar.

- Controller: branch BuildStatefulSet on the flag — render a per-instance
  Envoy bootstrap ConfigMap, mount owner-scoped credential Secrets into
  the sidecar only, drop the agent's `ONECLI_ACCESS_TOKEN`, point
  `HTTP(S)_PROXY` at `127.0.0.1:<EnvoyPort>`, hard-code
  `automountServiceAccountToken: false` and `shareProcessNamespace: false`
  per ADR-033 threat model. NetworkPolicy drops the OneCLI peer and
  allows TCP 443/80 egress when the flag is on.
- API server: dual-write user-typed secrets (generic + Anthropic) to K8s
  Secrets labelled with the owner's sub. OneCLI write path unchanged;
  existing OneCLI-only secrets are not migrated — flagged instances only
  see secrets created after this lands.
- UI: checkbox in Add Agent dialog (configure step) and a new
  Experimental section in the configuration panel for toggling on
  existing instances.
- Helm: `controller.envoyImage` / `controller.envoyPort` defaults.

OAuth app connections, HITL, refresh-token loop, gVisor enforcement,
and OneCLI removal stay out of scope per the issue.

Closes #337

Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
The K8s mirror used a non-existent `headerPrefix` field on `InjectionConfig`
(the real field is `valueFormat: "Bearer {value}"`). Two consequences:

1. Anthropic api-key secrets were mirrored with `Authorization: Bearer <key>`
   instead of `x-api-key: <key>`, so the upstream would reject them.
2. Generic secrets with a custom `valueFormat` (e.g. `Token {value}`) were
   ignored — every mirrored secret got `Bearer <value>` regardless.

Fix:
- Replace `headerPrefix` with the actual `valueFormat` template, applied
  via `{value}` substitution before writing the credential file.
- Special-case Anthropic in `resolveInjection`: read OneCLI's
  `metadata.authMode` from the create response and pick `x-api-key`
  (api-key) or `Authorization: Bearer` (oauth, default).
- Persist `humr.ai/auth-mode` and `humr.ai/injection-value-format`
  annotations so updates can recompute correctly without re-fetching the
  injection config.

`mise run check` does not run tsc on api-server; per-package CI catches
this. Add a unit test suite for the K8s port so the contract is locked
in.

Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
…idecar

The standard envoyproxy/envoy image runs as root by default, which
conflicts with the sidecar's runAsNonRoot: true security context — the
container fails to start. envoyproxy/envoy-distroless ships with USER
set to a non-root account, so it satisfies the policy.

Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
…-host dispatch

Envoy's credential_injector filter rejects virtual-host/route specific
config ("doesn't support virtual host or route specific configurations"),
which the previous bootstrap depended on. Move per-Secret injection into
an envoy.filters.http.composite filter wrapped with ExtensionWithMatcher,
dispatching by :authority. Each Secret becomes one entry in the matcher
map, selecting its own credential_injector instance.

Also add a node id/cluster — the SDS path_config_source for the per-route
credential file requires both, even for file-based sources.

Validated with 'envoy --mode validate' against a rendered bootstrap.

Note: HTTPS-via-CONNECT traffic is encrypted inside the tunnel, so
header injection is currently a no-op for HTTPS upstreams. Adding TLS
interception is tracked as the next slice.

Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
…injector

Closes the two follow-ups from the previous slice:

1. HTTPS injection is no longer a no-op. Envoy now terminates the agent's
   TLS using a per-instance leaf cert signed by a cluster-wide MITM CA, runs
   credential_injector on the plaintext HTTP, and re-originates upstream TLS
   to the real host. SNI-miss requests pass through unmolested via
   sni_dynamic_forward_proxy.

2. The fetch-ca-cert init container is no longer required on the experimental
   path. The agent's CA volume is now projected from the leaf Secret (only
   ca.crt is exposed; tls.key stays in the sidecar — the credential boundary
   between agent and sidecar is preserved).

Mechanics:

- Helm: adds a self-signed bootstrap ClusterIssuer, an isCA Certificate that
  produces the humr-mitm-ca Secret in cert-manager's
  cluster-resource-namespace, and a CA ClusterIssuer that signs leaves.
  Gated behind controller.envoyMitm.enabled (default true).
- Controller: cert-manager.io/v1 types vendored; reconciler builds a per-
  instance Certificate (DNSNames = deduped Secret host-patterns, signed by
  humr-mitm-ca-issuer) and applies via dynamic client. cert-manager produces
  the {instance}-envoy-tls Secret asynchronously.
- Envoy bootstrap: outer CONNECT listener tunnels into an internal listener
  via envoy.bootstrap.internal_listener; tls_inspector + per-SNI filter
  chains terminate TLS using files mounted from the leaf Secret. HCM inside
  each chain runs credential_injector + dynamic_forward_proxy, with upstream
  TLS validated against the system CA bundle shipped in envoy-distroless.

Verified end-to-end on local k3s:

- Pod boots cleanly (no fetch-ca-cert hang).
- curl https://api.anthropic.com/v1/models from the agent shows MITM cert
  issuer 'CN=humr MITM CA' with SAN api.anthropic.com, the request reaches
  Anthropic with the injected Authorization header (Anthropic returns 401
  with a token-type-specific error, not a 'no key' error).
- curl https://httpbin.org/anything (no Secret) passes through unmodified;
  no Authorization header in the echoed request.

The credential file path now points at the SDS DiscoveryResponse the
api-server already writes (sds.yaml key) instead of the raw 'value' file —
path_config_source expects an SDS resource, not bare bytes.

Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
@pilartomas pilartomas marked this pull request as ready for review April 28, 2026 11:25
Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
@xjacka
Copy link
Copy Markdown
Contributor

xjacka commented Apr 28, 2026

🛡️ Humr — Code Review

PR #346: feat(controller,ui): experimental Envoy credential injector behind per-instance opt-in flag

Author: pilartomas | Branch: feat/experimental-credential-injector → main | Changes: +1236 −139 (26 files)

Summary

Introduces an experimental Envoy sidecar credential injector gated behind a per-instance experimentalCredentialInjector opt-in flag. When enabled, egress leaves the OneCLI gateway path: a per-pod Envoy sidecar intercepts outbound HTTP, injecting credentials from K8s Secrets mounted into the sidecar only. The feature spans the controller (new envoy.go with bootstrap ConfigMap rendering and NetworkPolicy rewiring), the api-server (new k8s-secrets-port.ts that mirrors OneCLI secrets into K8s Secrets on create/update/delete), and the UI (checkbox in the Add Agent dialog and a new Experimental section in the Configuration panel). Tests cover the credential injection logic, StatefulSet pod hardening, and bootstrap ConfigMap rendering.

Findings

  • 🔴 Critical: The Envoy bootstrap config injects credentials by matching on :authority header (host:port) inside the composite filter, but as the inline comment at envoy.go:828 acknowledges, "for HTTPS-via-CONNECT this is a no-op until we add TLS interception — the bytes inside the tunnel are encrypted." In practice, nearly all upstream API traffic (Anthropic, GitHub, etc.) is HTTPS, meaning credentials are not injected on the current slice. The UI warning says "OAuth-backed services will not work" but does not communicate that even non-OAuth HTTPS destinations will silently receive no credentials. This is a correctness gap that should be surfaced to users enabling the flag. (packages/controller/pkg/reconciler/envoy.go:828)
  • 🟡 Warning: mirrorToK8s in secrets-service.ts silently swallows K8s write errors with console.warn and continues. On instances with the flag enabled, a failed K8s mirror means the Envoy sidecar never gets the credential — but the user sees a success response from the UI. At minimum, the service should surface a warning in the returned SecretView or emit a structured log with the failing secret ID so on-call can detect broken injection without end-user reports. (packages/api-server/src/modules/secrets/services/secrets-service.ts:612)
  • 🟡 Warning: When experimentalCredentialInjector is true, GH_TOKEN: humr:sentinel is omitted from the agent env (resources.go:1162). This silently breaks any GitHub connector attached to the instance — there is no event or status condition written to indicate that GitHub auth is unavailable. Users who flip the flag on an instance with a GitHub channel will see silent failures. (packages/controller/pkg/reconciler/resources.go:1162)
  • 🟡 Warning: k8sSecretName in k8s-secrets-port.ts lowercases the supplied ID: humr-cred-${id.toLowerCase()}. If two secrets share an ID differing only in case (unlikely with UUIDs, but possible with custom IDs), the second write silently overwrites the first via updateSecret with no conflict guard. (packages/api-server/src/modules/secrets/infrastructure/k8s-secrets-port.ts:517)
  • 🟢 Suggestion: TestBuildStatefulSet_FlagOn_AddsEnvoySidecar passes nil for credentialSecrets — add a variant with a non-empty slice to verify credential volume/mount names are consistent with what the bootstrap template references. (packages/controller/pkg/reconciler/resources_test.go:1521)
  • 🟢 Suggestion: secrets-service.ts has no test for the mirrorToK8s failure path — a unit test verifying that a K8s port error does not cause createSecret to reject would lock in the best-effort contract. (packages/api-server/src/modules/secrets/services/secrets-service.ts)

Documentation Check (doc-drift)

minor drift

docs/architecture/security-and-credentials.md is properly updated: Last verified: bumped to 2026-04-28, ADR-033 added to the Motivated by: list, and a new "Experimental: Envoy credential injector" section documents the sidecar's behavior, boundaries, and limitations.

However, docs/architecture/platform-topology.md:107 still states: "Agent pods never hold real upstream credentials — only a delegated OneCLI access token. Upstream tokens are injected on the wire by OneCLI." This assertion is now conditional — it is only true for instances without the experimental flag. Instances with experimentalCredentialInjector: true have no ONECLI_ACCESS_TOKEN and use an entirely different credential model. This page was not updated in the PR.

Verdict

COMMENT — The HTTPS-via-CONNECT credential injection no-op is a significant usability gap (credentials silently not injected for HTTPS upstreams on the current slice) that should be communicated more clearly to users enabling the flag; the mirrorToK8s silent-failure and GH_TOKEN omission also need observability improvements before this is user-facing.


Review by Humr · automated code guardian

@pilartomas
Copy link
Copy Markdown
Contributor Author

@xjacka the critical finding ("HTTPS-via-CONNECT injection is a no-op") looks stale relative to the current branch tip — could you re-run against 741d070 (or debfa57, the substantive commit)?

Concretely:

  • The review's embedded headRefOid=93c3081c0ff0f85aa1e5eca96b9cee268ad9e34a predates the TLS interception slice (debfa57, pushed ~3 min before the review fired).
  • The quoted comment "for HTTPS-via-CONNECT this is a no-op until we add TLS interception" lived in 93c3081 and was removed by debfa57 when it rewrote the bootstrap. grep against the current envoy.go returns nothing for that string.
  • The current bootstrap implements full MITM: outer CONNECT listener tunnels into an internal listener (envoy.bootstrap.internal_listener), tls_inspector reads SNI, per-SNI filter chains terminate TLS using a per-instance leaf cert (cert-manager-issued, signed by a cluster-wide humr-mitm-ca-issuer), then credential_injector runs on the plaintext HTTP, then dynamic_forward_proxy_https re-originates upstream TLS. SNI-miss requests pass through untouched via sni_dynamic_forward_proxy.

Verified end-to-end on local k3s: a curl https://api.anthropic.com/v1/models from the agent shows the leaf cert (CN=humr MITM CA, SAN api.anthropic.com, validated against the projected ca.crt), the Authorization: Bearer … header arrives at Anthropic, and a curl https://httpbin.org/anything (no Secret) passes through unmodified.

Happy to address the warning-level findings (mirrorToK8s silent swallow, GH_TOKEN omission, humr-cred-${id.toLowerCase()} collision, the doc-drift on platform-topology.md, and the two test gaps) — those all look valid against the current tip. Just want to confirm the critical one is closed before I do.

Six findings from #346 review:

* secrets-service: replace ad-hoc `console.warn` in mirrorToK8s with a
  stable token (`k8s-mirror-failed`) and structured payload (op,
  secretId, error). Log scrapers can now alert on broken K8s mirroring
  (which silently breaks Envoy injection on the experimental path)
  without parsing free-form text.
* k8s-secrets-port: drop the defensive `.toLowerCase()` from
  k8sSecretName and validate the ID up-front against RFC 1123. Two IDs
  differing only in case can no longer silently overwrite each other; an
  invalid ID throws (caught by mirrorToK8s, not propagated to OneCLI).
* controller: when ExperimentalCredentialInjector is on, log a warning
  if no GitHub credential Secret is attached. The OneCLI GH_TOKEN
  sentinel is dropped on this path, so without a BYO credential
  gh/octokit silently lose auth — this surfaces it in operator logs.
* resources_test: TestBuildStatefulSet_FlagOn_AddsEnvoySidecar now uses
  a non-empty credentialSecrets slice and asserts that volume + mount
  names match what the bootstrap template references (`cred-<name>`,
  `/etc/envoy/credentials/<name>`, `/etc/envoy/tls`).
* secrets-service.test (new): unit tests verifying create/update/delete
  resolve successfully when the K8s mirror throws, that the failure is
  logged with the structured payload, and that the mirror is skipped
  entirely when k8sPort is undefined.
* platform-topology.md: rewrite the credential-isolation invariant to
  cover both paths (OneCLI MITM, Envoy sidecar with per-instance leaf)
  and add ADR-033 to Motivated by. The previous wording asserted agents
  never hold upstream credentials, but elided that the experimental
  path achieves this differently.

Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
@xjacka
Copy link
Copy Markdown
Contributor

xjacka commented Apr 28, 2026

🛡️ Humr — Code Review

PR #346: feat(controller,ui): experimental Envoy credential injector behind per-instance opt-in flag

Author: pilartomas | Branch: feat/experimental-credential-injector → main | Changes: +578 −239 (35 files)

Summary

Extends the experimental Envoy credential injector with full TLS MITM infrastructure: cert-manager CA + ClusterIssuers via Helm chart, per-instance leaf Certificate management in leaf.go, and an updated Envoy bootstrap that terminates the agent's TLS and re-originates upstream connections — completing the injection path for HTTPS traffic. Also adds the K8s Secrets mirror port for credential persistence, structured failure logging in mirrorToK8s, and fixes the k8sSecretName validation gap.

Changes since last review

Previous HEAD: 93c3081 (2026-04-28T10:00:00Z) — verdict COMMENT

  • Fixed: HTTPS-via-CONNECT TLS no-op (envoy.go:828) — cert-manager MITM CA + leaf certs now fully wired; Envoy terminates agent TLS and re-originates upstream TLS on the experimental path
  • Fixed: mirrorToK8s silent failure (secrets-service.ts:612) — now logs with stable token [secrets-service] k8s-mirror-failed and structured JSON (op, secretId, error); test verifies payload shape
  • Fixed: k8sSecretName case-collision (k8s-secrets-port.ts:517) — RFC 1123 regex validation replaces toLowerCase() silent coercion; throws hard error on invalid IDs
  • Fixed: TestBuildStatefulSet_FlagOn_AddsEnvoySidecar nil credentialSecrets — now passes a real credSecret and verifies volume/mount name consistency with the bootstrap template
  • Fixed: Missing secrets-service.ts K8s mirror failure path test — secrets-service.test.ts added with 104 lines of coverage
  • 🔁 Still present: GH_TOKEN omission silent breakage (resources.go:62-68) — UI warning now mentions "OAuth-backed services (GitHub, Slack, Google) will not work", which is an improvement, but agent-side gh CLI and GitHub API failures remain silent with no status condition or env var signal

Findings

  • 🟡 Warning: Agent-side runtime failures for GH_TOKEN-dependent tooling (gh CLI, octokit, any GH_TOKEN-aware tool) remain silent when experimentalCredentialInjector is true. The UI warning is user-facing but the agent itself has no signal: no status condition is written, no env var is set to indicate missing GitHub auth. A pod annotation or ConfigMap status field noting ghTokenAvailable: false would let tooling self-report rather than silently fail. (packages/controller/pkg/reconciler/resources.go:62)
  • 🟢 Suggestion: The agent's ca-cert volume is now projected from the leaf Secret (<instance>-envoy-tls) exposing only ca.crt — the MITM CA root. There is no integration test verifying the chain: that Envoy's leaf cert is actually issued under the same CA the agent imports. Consider a dev-mode sanity check that validates the trust chain on pod start. (packages/controller/pkg/reconciler/resources_test.go)
  • Looks good: leaf_test.go covers the nil-host return path, dedup/sort of SAN list, IssuerRef wiring, and OwnerReferences — solid coverage of the cert rendering logic.
  • Looks good: TestBuildStatefulSet_FlagOn_SecretMountsSidecarOnly explicitly verifies the credential isolation boundary: agent container never mounts credential Secrets, only ca.crt is visible to the agent (not tls.key). This is the critical security invariant for the sidecar model.
  • Looks good: k8s-secrets-port.test.ts (196 lines) covers resolveInjection, injectionFileContent, createSecret header baking (anthropic api-key/oauth/generic), and updateSecret value rotation.

Documentation Check (doc-drift)

Verdict: Minor drift

Both architecture pages are updated and current (Last verified: 2026-04-28). One remaining gap:

Check 1 — Architecture-page drift:
docs/architecture/platform-topology.md controller description states it "watches ConfigMaps labelled humr.ai/type... reconciles the StatefulSet, Service, NetworkPolicy, and per-agent Secret." When experimentalCredentialInjector is enabled, the controller now also reconciles cert-manager.io/Certificate resources for the Envoy MITM leaf certs. The controller description should mention this conditional responsibility.

Verdict

APPROVE — the critical HTTPS no-op is fully resolved via the TLS MITM chain; all prior warnings addressed; remaining GH_TOKEN runtime observability gap is minor and documented at the UI layer.


Review by Humr · automated code guardian

Follow-up to the previous review pass — operator-side log warnings weren't
enough; the agent itself needs a signal so GH_TOKEN-aware tooling (gh CLI,
octokit, wrapper scripts) can short-circuit instead of failing on a
mid-request 401.

When ExperimentalCredentialInjector=true:

- Set HUMR_GH_TOKEN_AVAILABLE="true"|"false" on the agent container env.
  "true" iff the owner has a credential Secret with host-pattern
  github.com or api.github.com (Envoy will inject Authorization on the
  wire); "false" otherwise.
- Mirror the same value to a pod annotation
  humr.ai/gh-token-available — operators can grep for the missing case
  via 'kubectl get pods -o jsonpath="{...annotations.humr\.ai/gh-token-available}"'
  without poking inside the container.

Off the experimental path, neither is set (the OneCLI sentinel mechanism
is unchanged). Tests cover both flag-on cases and confirm flag-off stays
clean. security-and-credentials.md documents the signal.

Signed-off-by: Tomas Pilar <thomas7pilar@gmail.com>
@xjacka
Copy link
Copy Markdown
Contributor

xjacka commented Apr 28, 2026

🛡️ Humr — Code Review

PR #346: feat(controller,ui): experimental Envoy credential injector behind per-instance opt-in flag

Author: pilartomas | Branch: feat/experimental-credential-injector → main | Changes: +2054 −242 (37 files)

Summary

Third review. This commit adds the Helm chart's cert-manager infrastructure (CA Certificate, ClusterIssuers, RBAC, envoyMitm values block), wires the dynamic client into the controller for cert-manager Certificate reconciliation, resolves the prior GH_TOKEN observability gap with HUMR_GH_TOKEN_AVAILABLE env var and humr.ai/gh-token-available pod annotation, fully updates both architecture docs pages to reflect the MITM model and new signals, and adds a comprehensive test suite (k8s-secrets-port 196L, secrets-service 104L, instance-assembly round-trip tests).

Changes since last review

Previous HEAD: 741d070 (2026-04-28T13:00:00Z) — verdict APPROVE

  • Fixed: GH_TOKEN runtime observability gap (resources.go:271-279) — HUMR_GH_TOKEN_AVAILABLE env var and humr.ai/gh-token-available pod annotation now set on experimental-flag instances, sourced from hasGitHubCredential() over mounted K8s Secrets
  • 🆕 New: Helm cert-manager infrastructure added (deploy/helm/humr/templates/cert-manager/) — CA Certificate, bootstrap + CA ClusterIssuers, RBAC, envoyMitm values block
  • 🆕 New: Dynamic client wired into controller (main.go, instance.go) enabling applyCertificate for per-instance leaf certs on the experimental path
  • 🆕 New: Architecture docs updated — both pages reflect the full MITM model; security-and-credentials.md now documents HUMR_GH_TOKEN_AVAILABLE and humr.ai/gh-token-available signals
  • 🆕 New: Expanded test suite — k8s-secrets-port.test.ts (196L), secrets-service.test.ts (104L), instance-assembly.test.ts round-trip for experimentalCredentialInjector
  • 🔁 Still present: platform-topology.md:47 controller description omits cert-manager Certificate reconciliation — carried from previous review
  • 🔁 Still present: Integration test for cert chain trust validation (resources_test.go) — suggestion only, still absent

Findings

  • 🟡 Warning: envoyMitm.enabled: true is the default in values.yaml, so the cert-manager ClusterIssuer + CA Certificate templates render by default on helm upgrade. Clusters without cert-manager CRDs will see those resources fail on apply. Users upgrading from a prior chart version without cert-manager must explicitly set controller.envoyMitm.enabled: false or install cert-manager first — the upgrade path is undocumented. (deploy/helm/humr/values.yaml, envoyMitm.enabled)
  • 🟡 Warning: envoySidecarContainer sets CPU/memory Requests but no Limits (packages/controller/pkg/reconciler/envoy.go around envoySidecarContainer). Without Limits the Envoy sidecar can consume unbounded CPU/memory on the node, starving the agent container. Adding Limits (e.g. cpu: 500m, memory: 128Mi) would enforce the resource boundary the sidecar security model depends on.
  • 🟢 Suggestion: applyCertificate returns a hard error when r.dynamic == nil (packages/controller/pkg/reconciler/instance.go). A one-line doc comment on WithDynamicClient noting it is required before the experimental path is activated would help future test authors avoid a confusing error when the dynamic client mock is missing.
  • 🟢 Suggestion: When a user attaches their first credential Secret after a flag-on pod is already running, the pod needs a restart to mount the new volume (since envoySidecarVolumes adds the leaf TLS volume only when len(secrets) > 0). A pod annotation tracking credential count (e.g. humr.ai/credential-count) would let operators and tooling detect the restart-required state without spelunking the pod spec.
  • Looks good: HUMR_GH_TOKEN_AVAILABLE set on containers[0] (agent container, not sidecar) with "true"/"false" from hasGitHubCredential(), mirrored to podAnnotations["humr.ai/gh-token-available"]. Both signals documented accurately in security-and-credentials.md.
  • Looks good: applyNetworkPolicy upgraded to retry.RetryOnConflict with in-place spec update — flag toggles propagate to live NetworkPolicies without delete+recreate cycles.
  • Looks good: dnsNamesFromRoutes deduplicates and sorts DNS names before building the leaf Certificate SAN list, keeping the Certificate spec stable across reconciles and preventing spurious cert-manager renewals.

Documentation Check (doc-drift)

Verdict: minor drift

Both architecture pages are updated and current (Last verified: 2026-04-28). ADR-033 added to Motivated by: in both pages. The new Experimental Envoy section in security-and-credentials.md accurately documents HUMR_GH_TOKEN_AVAILABLE and humr.ai/gh-token-available.

One remaining gap carried from the previous review:

Check 1 — Architecture-page drift: docs/architecture/platform-topology.md:47 controller description still states "reconciles the StatefulSet, Service, NetworkPolicy, and per-agent Secret for each instance" — it does not mention that, when experimentalCredentialInjector is enabled, the controller also reconciles cert-manager.io/v1 Certificate resources for the Envoy MITM leaf cert. Proposed addition: "...and, when experimentalCredentialInjector is enabled, a cert-manager Certificate for the Envoy MITM leaf cert."

Verdict

APPROVE — the GH_TOKEN observability gap from the previous review is fully resolved; the Helm cert-manager infrastructure is complete; the two new warnings (default envoyMitm.enabled: true upgrade risk, no sidecar resource Limits) are minor and do not block the experimental feature's purpose.


Review by Humr · automated code guardian

@pilartomas pilartomas merged commit 3acd7a5 into main Apr 28, 2026
12 checks passed
@pilartomas pilartomas deleted the feat/experimental-credential-injector branch April 28, 2026 12:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(controller,ui): experimental Envoy credential injector behind per-instance opt-in flag

2 participants