Skip to content

Commit 690b5b8

Browse files
SiarheiFedziukovichclaude
andcommitted
docs(dial-unified-config): propose extraData duality split (OQ-34, U.5)
Records the core-team concern that Upstream.extraData is a wide property (more often non-secret than secret, sometimes mixing both in one JSON document) while the post-U.4 state encrypts it wholesale with no read path. Proposed design: plaintext extraData + encrypted secretExtraData with merge-at-consumption on the X-UPSTREAM-EXTRA-DATA header. Status PROPOSED, not locked - full design in 04 §2.11, OQ-34 open, slice U.5 registered ⏸ blocked pending core-team decision; no locked text amended. Design anchors: 04 §2.11 (new, proposed); 08 OQ-34 (new, open) Tests: no new tests (docs-only) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
1 parent 3458b73 commit 690b5b8

4 files changed

Lines changed: 44 additions & 2 deletions

File tree

docs/sandbox/dial-unified-config-lite/04-security.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -61,6 +61,8 @@ Preserve-on-omit is server-side behavior, not a CLI ergonomic — every client (
6161

6262
**Reveal flow retired.** Slice U.4 (2026-05-25) postpones the `?reveal_secrets=true` query parameter and the `security-admin` role per DIAL-team review-call decision. There is no plaintext-read path on the API in the MVP. Operators inspect secret values via the underlying KMS / blob layer. The design draft (full §2.6) is preserved for re-introduction in a later phase.
6363

64+
**Proposed (2026-06-03, pending core-team decision — OQ-34): `extraData` duality split.** `Upstream.extraData` is a wide property — more often non-secret (`{"region":"us-east-1"}`) than secret (Bedrock IAM credentials), sometimes both in one document — yet the current design encrypts it wholesale and (post-U.4) drops it from every GET with no reveal path, making the common case un-inspectable. The proposed fix splits the field: `extraData` returns to plaintext + Owner-visible; a new `secretExtraData` keeps today's encrypted / write-only contract; the `X-UPSTREAM-EXTRA-DATA` header carries the single value verbatim when only one field is populated (any JSON shape or arbitrary string — neither field alone is required to be a JSON object), or the shallow top-level merge when both are present — in that case each value must parse as a JSON object, otherwise the write is rejected with a validation error, same as overlapping top-level keys. Full design + alternatives: full-version [`04-security-and-audit.md`](../dial-unified-config/04-security-and-audit.md) §2.11; implementation staged as slice U.5 (blocked on the decision).
65+
6466
**Backward compatibility with config files** — file-sourced entities continue to carry plaintext secrets. `MergedConfigStore` transparently handles four formats: plaintext (file/dev), `ENC[...]` (this proposal), `${SECRET:...}` (future vault references — Phase 5+), and the existing bare-Base64 `ResourceAuthSettings` payloads on toolsets.
6567

6668
## Audit *(deferred — Phase 7)*

docs/sandbox/dial-unified-config/04-security-and-audit.md

Lines changed: 39 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -245,7 +245,7 @@ Several entity types contain secret fields (API keys, provider tokens, OAuth sec
245245
|--------|-------|:-:|:-:|---|
246246
| `Key` | `key` || No (plaintext in config file / map key) | Platform API key secret. Highest risk. Depends on the OQ-12 key-model fix. API-managed keys encrypted via the new `SecretFieldProcessor`; file-sourced keys stay as-is by design. |
247247
| `Upstream` | `key` || No | Provider API tokens (OpenAI, Anthropic, etc.). |
248-
| `Upstream` | `extraData` | ✅ | No | Entire JSON value encrypted as a single string. The in-memory `Java String` field value (e.g. `{"region":"us-east-1"}`) is what gets encrypted — `extraData` is already a JSON-as-string Java field before encryption (see `JsonToStringDeserializer` in [`02-architecture.md`](02-architecture.md) §8). On `?reveal_secrets=true`, the decrypted Java `String` is returned as-is in the JSON response body — it appears as a JSON string containing escaped JSON, not as an embedded object. No per-field carve-outs inside `extraData`. May contain AWS `secret_access_key` (Bedrock IAM credentials) but for region-only Bedrock upstreams it carries non-secret data like `{"region":"us-east-1"}`. **Hard invariant — no blob-write path bypasses `SecretFieldProcessor`.** On any write path that uses the blob-I/O `ObjectMapper` for entities containing `Upstream.extraData` (i.e. every `MergedConfigStore`-managed write — see §2.3 / §2.5), `SecretFieldProcessor` MUST run before serialization. There is no code path that writes `Upstream.extraData` to blob without encryption — the blob-I/O serialization step assumes the in-memory value is already `"ENC[..."` ciphertext (or an explicit `${SECRET:...}` reference). Phase 2 test requirement: a write attempt that hands a non-`ENC[`-prefixed `extraData` value to the blob mapper must demonstrably go through `SecretFieldProcessor` (which produces the `ENC[...]` string) before the mapper sees it; an integration test must assert the blob never contains a plaintext `extraData` payload from a `MergedConfigStore` write. Serialization-path details (the blob-I/O `BeanSerializerModifier`, `JsonToStringDeserializer` interaction, and the rationale against a class-level `@JsonSerialize`) are in [`02-architecture.md`](02-architecture.md) §8. **Operator visibility consequence.** Even when `extraData` carries no secret (region-only case), the persisted value is `ENC[...]` — operators inspecting via the Owner-view API see `"***"` and must use `?reveal_secrets=true` (security-admin role, §2.6) to read the region. This is a deliberate trade-off in favor of "always encrypted, no per-upstream-type carve-outs"; if review feedback indicates the region-only ergonomics are painful enough to address, future work could move `region` to a separate non-encrypted field on `Upstream`. |
248+
| `Upstream` | `extraData` | ✅ | No | Entire JSON value encrypted as a single string. The in-memory `Java String` field value (e.g. `{"region":"us-east-1"}`) is what gets encrypted — `extraData` is already a JSON-as-string Java field before encryption (see `JsonToStringDeserializer` in [`02-architecture.md`](02-architecture.md) §8). On `?reveal_secrets=true`, the decrypted Java `String` is returned as-is in the JSON response body — it appears as a JSON string containing escaped JSON, not as an embedded object. No per-field carve-outs inside `extraData`. May contain AWS `secret_access_key` (Bedrock IAM credentials) but for region-only Bedrock upstreams it carries non-secret data like `{"region":"us-east-1"}`. **Hard invariant — no blob-write path bypasses `SecretFieldProcessor`.** On any write path that uses the blob-I/O `ObjectMapper` for entities containing `Upstream.extraData` (i.e. every `MergedConfigStore`-managed write — see §2.3 / §2.5), `SecretFieldProcessor` MUST run before serialization. There is no code path that writes `Upstream.extraData` to blob without encryption — the blob-I/O serialization step assumes the in-memory value is already `"ENC[..."` ciphertext (or an explicit `${SECRET:...}` reference). Phase 2 test requirement: a write attempt that hands a non-`ENC[`-prefixed `extraData` value to the blob mapper must demonstrably go through `SecretFieldProcessor` (which produces the `ENC[...]` string) before the mapper sees it; an integration test must assert the blob never contains a plaintext `extraData` payload from a `MergedConfigStore` write. Serialization-path details (the blob-I/O `BeanSerializerModifier`, `JsonToStringDeserializer` interaction, and the rationale against a class-level `@JsonSerialize`) are in [`02-architecture.md`](02-architecture.md) §8. **Operator visibility consequence.** Even when `extraData` carries no secret (region-only case), the persisted value is `ENC[...]` — operators inspecting via the Owner-view API see `"***"` and must use `?reveal_secrets=true` (security-admin role, §2.6) to read the region. This is a deliberate trade-off in favor of "always encrypted, no per-upstream-type carve-outs"; if review feedback indicates the region-only ergonomics are painful enough to address, future work could move `region` to a separate non-encrypted field on `Upstream`. **Proposed amendment (2026-06-03, pending core-team decision — [OQ-34](08-open-questions-and-references.md)):** split into plaintext `extraData` + encrypted `secretExtraData` with merge-at-consumption — full design in §2.11. |
249249
| `ResourceAuthSettings` | `clientSecret` || **Yes** — by `ResourceAuthSettingsEncryptionService.processFields()` invoked from `ToolSetService.putToolSet()` before Jackson serialization; uses `CredentialEncryptionService` under the hood. | OAuth client secret. Already encrypted at rest — no new code needed. The new `SecretFieldProcessor` does not touch this field (the existing bespoke path stays); if a future unification is desired, it becomes a refactor to add `@EncryptedField` here and retire `ResourceAuthSettingsEncryptionService`, but that is out of scope for Phase 2–3. |
250250
| `ResourceAuthSettings` | `codeVerifier` | ❌ | **No** — plain `String` field, serialized verbatim by Jackson in `ToolSetService.putToolSet()`. | PKCE verifier. Plaintext in blob today. Encrypted in Phase 3 by extending `ToolSetService.putToolSet()` to invoke the existing `ResourceAuthSettingsEncryptionService` on this field — the same path that already encrypts `clientSecret`. The `@EncryptedField` / `SecretFieldProcessor` route does **not** fire for toolsets (toolsets are not routed through `MergedConfigStore` per §6 / §8 — the dual-mapper write path doesn't apply), so reusing the bespoke service is the only path that actually executes on the toolset write. **Lazy migration of legacy plaintext blobs.** Existing toolset blobs already in production carry `codeVerifier` as plaintext (no `Base64`-shaped ciphertext), so a naive `decryptValue()` invocation on read would throw `IllegalArgumentException` from `Base64.getDecoder().decode()`. Phase 3 must therefore extend `ResourceAuthSettingsEncryptionService.processFields()` (which today only handles `clientSecret` — see `ResourceAuthSettingsEncryptionService.processFields()` body) to also process `codeVerifier`, and the read path must guard the decode: attempt Base64 decode + decrypt, and if the value does not look like valid Base64 ciphertext (catch `IllegalArgumentException` from the decoder, or guard via an `isProbablyBase64(value)` precheck), treat the value as legacy plaintext, return it as-is, and re-encrypt on the next write. This mirrors the legacy-plaintext handling pattern used by `SecretFieldProcessor` for the `ENC[`/`${SECRET:`/plaintext branches. |
251251

@@ -370,6 +370,44 @@ String resolveSecret(String fieldValue, ResourceDescriptor resource) {
370370

371371
See `dial_secrets_storage_analysis.md` for the full evaluation of alternative approaches (document-level encryption, secret references/indirection) and their trade-offs.
372372

373+
### 2.11 Proposed: `extraData` duality split *(2026-06-03 — pending core-team decision; tracked as [OQ-34](08-open-questions-and-references.md))*
374+
375+
> **STATUS: PROPOSED, NOT LOCKED.** This subsection records a design proposed in response to a DIAL-core-team concern; nothing in it amends §2.2 / §2.4 / §2.5 until the core team accepts it. If accepted, it reverses the 2026-05-01 "encrypt `extraData` wholesale, no per-field carve-outs" lock and the slice-U.4 sub-decision that added `@JsonProperty(WRITE_ONLY)` to `Upstream.extraData`. Implementation is staged as slice U.5 ([`IMPLEMENTATION.md`](IMPLEMENTATION.md) §5.5, ⏸ blocked on this decision).
376+
377+
**Problem.** `Upstream.extraData` is a *wide* property: in the common case it carries non-secret data (region-only Bedrock upstreams — `{"region":"us-east-1"}`), in the less common case secrets (Bedrock IAM `secret_access_key`), and the two can mix *inside one JSON document*. The locked design encrypts the whole value (§2.2) and — after U.4 retired the reveal flow — drops it from every GET via `WRITE_ONLY`, so there is **no API path to read even a non-secret `extraData`**. Operators cannot inspect or debug the common case. Two aggravating findings from the implementation:
378+
379+
- **Plaintext logging on the hot path.** The value is sent wholesale as the `X-UPSTREAM-EXTRA-DATA` header and logged at info level (`DeploymentPostController` ~lines 270/341, `ResponsesController` ~204/333, `BaseDeploymentPostController` ~142) — a secret inside `extraData` lands in plaintext logs today.
380+
- **User-facing regression.** `Upstream` is shared with the pre-existing `/v1/applications` / `/v1/toolsets` responses; U.4's `WRITE_ONLY` removed `extraData` from those surfaces too.
381+
382+
**Proposed design — field split with merge-at-consumption.**
383+
384+
| Field | Annotations | At rest | GET (Owner) | PUT semantics |
385+
|---|---|---|---|---|
386+
| `extraData` (existing) | drop `@EncryptedField`, drop `WRITE_ONLY`; keep `JsonToStringDeserializer` | plaintext | visible | standard full-replace (leaves the §2.5 preserve-on-omit list) |
387+
| `secretExtraData` (new) | `@EncryptedField` + `@JsonProperty(WRITE_ONLY)` + `JsonToStringDeserializer` + `@ToString.Exclude` | `ENC[...]` | dropped | preserve-on-omit (§2.5) |
388+
389+
`secretExtraData` inherits today's secret contract verbatim — `SecretFieldProcessor`, dual-mapper, the blob-I/O `ToStringSerializer` modifier, and preserve-on-omit all apply with no new crypto code. Neither field is ever on the Public allowlist (§1.5 unchanged — `extraData` stays Owner-only).
390+
391+
**Merge contract (`X-UPSTREAM-EXTRA-DATA` header).** The adapter keeps receiving exactly one value:
392+
393+
- **One field populated** (the dominant case): the value passes to the header **verbatim** — any JSON shape or arbitrary string is preserved, byte-for-byte today's behavior. (`extraData` is *not* guaranteed to be a JSON object: `JsonToStringDeserializer` accepts any token — scalar strings and numbers are pinned valid by `JsonToStringDeserializerTest` — and the custom-application schema types `dial:extraData` as plain `"string"`.)
394+
- **Both fields populated** (the mixed case): both values MUST parse as JSON objects; the header carries their shallow top-level merge. Overlapping top-level keys are a **write-time validation error** (422) — no silent precedence. The constraint binds only operators who opt into mixing, and mixing structurally implies a keyed document anyway.
395+
396+
**Log hygiene.** Hot-path log statements keep printing `getExtraData()` — the non-secret part only; the merged (secret-bearing) value exists solely for the header. This closes the pre-existing plaintext-secret-in-logs leak as a side effect of the split.
397+
398+
**Use-case coverage.** Region-only upstream → `extraData` alone (visible, debuggable, no constraint). Fully-secret value (even a non-object opaque token) → `secretExtraData` alone (encrypted, verbatim passthrough). Mixed document → operator splits keys across the two fields; the adapter sees the merged whole.
399+
400+
**Compatibility.** File-sourced upstreams are untouched (plaintext file; `secretExtraData` is usable there too under the §2.7 passthrough rules). The feature branch is pre-release, so no migration path is required for wholesale-`ENC[` `extraData` blobs beyond the prefix-tolerant read `SecretFieldProcessor` already performs; slice U.5 restores the pre-U.4 user-facing serialization of `extraData` on `/v1/applications` / `/v1/toolsets`.
401+
402+
**Alternatives considered.**
403+
404+
- **(B) Per-leaf marker-driven encryption inside the single field** — parse `extraData` as JSON and encrypt only values carrying a write-side marker. Handles mixing without a second field but requires JSON-aware processing in `SecretFieldProcessor`, a marker syntax, **per-key** preserve-on-omit on GET→PUT round-trips, and non-object edge-case handling. Rejected under IMPLEMENTATION.md §2.1 Simplicity First — the split reuses the existing wholesale contract unchanged.
405+
- **(D) Per-upstream boolean flag** (`extraDataSecure: true|false`) — all-or-nothing per value; cannot split a mixed document. Rejected once the intra-document use case was confirmed real.
406+
407+
**Generalization note (`@EncryptedField` at large).** The plaintext/secret **sibling-pair** is proposed as the standard pattern for any future wide dual-nature field — `@EncryptedField` stays a binary marker ("this field is always a secret"); duality is expressed in the schema, not in the annotation. A write-side value marker (e.g. `${ENCRYPT:...}` → encrypted on write, per string leaf, on any field) is sketched as a possible future track for per-leaf opt-in without schema changes, but it carries option-(B)'s machinery cost and is explicitly deferred.
408+
409+
**Doc syncs on acceptance** (not before): §2.2 `extraData` row, §2.4 example, §2.5 preserve-on-omit list, summary checklist; [`02-architecture.md`](02-architecture.md) §8 (the `extraData` string round-trip invariant narrows to `secretExtraData`); [`03-api-reference.md`](03-api-reference.md) §2 example payload + §3.1 secret-field list; lite `03-api.md` / `04-security.md`; OQ-34 closes into the resolved register.
410+
373411
---
374412

375413
## 3. Audit

0 commit comments

Comments
 (0)