Skip to content

Commit 941882a

Browse files
EDsCODEclaude
andauthored
fix(usersecrets): redact secret DDL anywhere in a multi-statement query (#769)
* fix(usersecrets): redact secret DDL anywhere in a multi-statement query RedactForLog only classified the statement head, so a client could prefix any leading statement (e.g. "SELECT 1; CREATE PERSISTENT SECRET foo (..., SECRET 'realKey')") and the head no longer classified as secret DDL — the query, including credential literals, was returned verbatim and written to logs, OTel spans, the persisted query log, and pg_stat_activity. This violates the "Never log/store secret statement text" invariant. RedactForLog now splits the query on top-level semicolons (reusing the existing literal/identifier/comment-aware tokenization, not a naive strings.Split) and, if ANY top-level segment classifies as CREATE SECRET, replaces the whole query with the fixed redaction placeholder. The single-statement fast path is unchanged. Over-redaction is intentional: false positives only cost log fidelity, never credential exposure. Also fixes the same head-only blind spot in the Flight SQL ingress RejectPersistentSecretDDL check via a new ContainsPersistentSecretDDL helper that scans every top-level statement. Generated-By: PostHog Code Task-Id: 96557416-3af0-4633-bcf0-164d1e64cf22 * fix(usersecrets): redact secret literals echoed in error messages RedactForLog scrubs the query attribute, but engine error messages echo the offending SQL verbatim (DuckDB: `LINE 1: ... SECRET 'literal'`), so a failed CREATE SECRET leaked the credential through the unredacted error sinks even though the query text was redacted: msg="Query execution errored." query="CREATE TEMPORARY SECRET badt (…redacted)" error="...Parser Error: ... SECRET 'realkey' BOGUS)" This violated the same "never log/store secret statement text" invariant the PR set out to enforce, just via the error channel instead of the query channel. Reproduced live against the multi-tenant cluster. Add usersecrets.RedactErrorForLog(query, errMsg): when the originating query carries CREATE SECRET DDL (head or any top-level statement, shared tokenizer with RedactForLog so they can't drift), the whole error is replaced with a fixed placeholder. Over-redaction only costs diagnostic detail, never credential exposure; non-secret errors pass through. Apply it at every sink that logs a query's error: logQueryError and logQueryFinished (slog `error` attr) and logQuery (query-log `Exception` column), each classifying against the original query before it is replaced with the redacted form. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
1 parent e28ec9d commit 941882a

7 files changed

Lines changed: 387 additions & 19 deletions

File tree

CLAUDE.md

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -272,7 +272,13 @@ Invariants for anyone touching this path:
272272
false "DROP succeeded" is fatal for a credential revocation.
273273
- **Never log/store secret statement text.** `usersecrets.RedactForLog` guards
274274
logQueryStarted/Finished/Error, the query log, spans, and pg_stat_activity
275-
(`currentQuery`); keep new logging of query text behind it.
275+
(`currentQuery`); keep new logging of query text behind it. Engine **error
276+
messages echo the offending SQL** (DuckDB emits `LINE 1: ... SECRET '...'`),
277+
so a failed CREATE SECRET leaks the credential via the `error` attribute /
278+
query-log `Exception` even when the query attribute is redacted —
279+
`usersecrets.RedactErrorForLog(query, errMsg)` guards those error sinks
280+
(logQueryError/logQueryFinished, `logQuery`); keep new error logging behind it
281+
too, and pass the original (un-redacted) query so it can classify.
276282
- Touching the interception, wipe/replay, or payload shape → update
277283
`server/conn_user_secrets_test.go`, `duckdbservice/user_secrets_test.go`,
278284
and the `persistent_user_secret`(+`_isolation`) assertions in

server/conn.go

Lines changed: 11 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -419,15 +419,16 @@ func (c *clientConn) logQueryStarted(query string) {
419419
// "started" and a "finished" line, and can look at the separate error
420420
// line for severity context.
421421
func (c *clientConn) logQueryFinished(query string, start time.Time, rows int64, err error) {
422-
query = usersecrets.RedactForLog(query)
423422
attrs := []any{
424-
"query", query,
423+
"query", usersecrets.RedactForLog(query),
425424
"duration_ms", time.Since(start).Milliseconds(),
426425
"rows", rows,
427426
"trace_id", observe.TraceIDFromContext(c.ctx),
428427
}
429428
if err != nil {
430-
attrs = append(attrs, "error", err.Error())
429+
// Engine errors echo the offending SQL, so a failed CREATE SECRET
430+
// leaks the credential here unless the error is redacted too.
431+
attrs = append(attrs, "error", usersecrets.RedactErrorForLog(query, err.Error()))
431432
}
432433
c.logger().Info("Query finished.", attrs...)
433434
}
@@ -439,8 +440,13 @@ func (c *clientConn) logQueryFinished(query string, start time.Time, rows int64,
439440
// (worker crash, IO failure, internal panic, infra unreachable), not
440441
// "user typo'd a column name."
441442
func (c *clientConn) logQueryError(query string, err error) {
442-
query = usersecrets.RedactForLog(query)
443-
attrs := []any{"query", query, "error", err}
443+
// Engine errors echo the offending SQL, so a failed CREATE SECRET leaks the
444+
// credential via the error attribute unless it is redacted too. Classify
445+
// against the original query before it is replaced with the redacted form.
446+
attrs := []any{
447+
"query", usersecrets.RedactForLog(query),
448+
"error", usersecrets.RedactErrorForLog(query, err.Error()),
449+
}
444450
if isDuckLakeTransactionConflict(err) {
445451
c.logger().Warn("DuckLake transaction conflict.", attrs...)
446452
return

server/flightsqlingress/ingress.go

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -331,8 +331,7 @@ func (h *ControlPlaneFlightSQLHandler) checkUserSecretDDL(query string) error {
331331
if !h.rejectPersistentSecretDDL {
332332
return nil
333333
}
334-
st := usersecrets.Classify(query)
335-
if st.Kind != usersecrets.KindNone && st.Persistent {
334+
if usersecrets.ContainsPersistentSecretDDL(query) {
336335
return status.Error(codes.InvalidArgument,
337336
"persistent secrets are managed via the PostgreSQL protocol on this deployment; CREATE/DROP PERSISTENT SECRET is not supported over Flight SQL (a secret created here would not survive the session)")
338337
}

server/querylog.go

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -496,7 +496,10 @@ func (c *clientConn) logQuery(start time.Time, query, transpiledQuery, cmdType s
496496
}
497497

498498
// CREATE SECRET option lists carry credential material; never persist
499-
// them to the query log.
499+
// them to the query log. The engine's error text echoes the offending SQL,
500+
// so a failed CREATE SECRET leaks the credential via Exception unless the
501+
// error is redacted too — classify against the original query first.
502+
errMsg = usersecrets.RedactErrorForLog(query, errMsg)
500503
query = usersecrets.RedactForLog(query)
501504
transpiledQuery = usersecrets.RedactForLog(transpiledQuery)
502505

server/usersecrets/classify.go

Lines changed: 73 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -89,6 +89,27 @@ func Classify(query string) Statement {
8989
return st
9090
}
9191

92+
// ContainsPersistentSecretDDL reports whether any top-level statement in query
93+
// is CREATE/DROP PERSISTENT SECRET. Unlike Classify (which inspects only the
94+
// statement head), this scans every top-level statement, so a persistent-secret
95+
// DDL hidden behind a leading statement ("SELECT 1; CREATE PERSISTENT SECRET
96+
// ...") is still caught. Used to reject persistent-secret DDL on the Flight SQL
97+
// ingress, where it would execute but never persist.
98+
func ContainsPersistentSecretDDL(query string) bool {
99+
if st, _, ok := parseSecretDDLHead(query); ok && st.Persistent {
100+
return true
101+
}
102+
if !hasTrailingStatement(query) {
103+
return false
104+
}
105+
for _, seg := range splitTopLevel(query) {
106+
if st, _, ok := parseSecretDDLHead(seg); ok && st.Persistent {
107+
return true
108+
}
109+
}
110+
return false
111+
}
112+
92113
// parseSecretDDLHead parses the statement head (through the optional secret
93114
// name) and returns the classification plus the byte offset just past the
94115
// head. The fast path for non-secret statements is two short case-folded
@@ -234,6 +255,58 @@ func hasTrailingStatement(query string) bool {
234255
return false
235256
}
236257

258+
// splitTopLevel splits query on top-level semicolons (outside string literals,
259+
// quoted identifiers, and comments), using the same scanning rules as
260+
// hasTrailingStatement so it stays in sync with the rest of this package. The
261+
// returned segments do NOT include the separating semicolons; a trailing empty
262+
// segment (query ending in ';') is omitted.
263+
func splitTopLevel(query string) []string {
264+
var segments []string
265+
start := 0
266+
i := 0
267+
for i < len(query) {
268+
c := query[i]
269+
switch {
270+
case c == '\'':
271+
i = skipQuoted(query, i, '\'')
272+
case c == '"':
273+
i = skipQuoted(query, i, '"')
274+
case c == '-' && strings.HasPrefix(query[i:], "--"):
275+
idx := strings.IndexByte(query[i:], '\n')
276+
if idx < 0 {
277+
i = len(query)
278+
} else {
279+
i += idx + 1
280+
}
281+
case c == '/' && strings.HasPrefix(query[i:], "/*"):
282+
depth := 1
283+
i += 2
284+
for i < len(query) && depth > 0 {
285+
switch {
286+
case strings.HasPrefix(query[i:], "/*"):
287+
depth++
288+
i += 2
289+
case strings.HasPrefix(query[i:], "*/"):
290+
depth--
291+
i += 2
292+
default:
293+
i++
294+
}
295+
}
296+
case c == ';':
297+
segments = append(segments, query[start:i])
298+
i++
299+
start = i
300+
default:
301+
i++
302+
}
303+
}
304+
if start < len(query) {
305+
segments = append(segments, query[start:])
306+
}
307+
return segments
308+
}
309+
237310
// skipQuoted returns the index just past a quoted region starting at start
238311
// (where query[start] == quote). Doubled quotes are escapes.
239312
func skipQuoted(query string, start int, quote byte) int {

server/usersecrets/redact.go

Lines changed: 82 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -2,22 +2,94 @@ package usersecrets
22

33
import "strings"
44

5+
// redactedPlaceholder replaces a whole query whose credential material cannot
6+
// be safely located and stripped in place (a CREATE SECRET that is not the
7+
// statement head of the string).
8+
const redactedPlaceholder = "(…redacted)"
9+
510
// RedactForLog returns a version of query safe to write to logs, traces,
6-
// query logs, and pg_stat_activity. For CREATE SECRET statements (any
7-
// persistence, including multi-statement strings whose head is secret DDL)
8-
// everything after the statement head is dropped — the option list carries
9-
// credential material, and on a multi-statement string the trailing
10-
// statements are dropped along with it. DROP variants carry only a name and
11-
// pass through unchanged, as does every non-secret statement.
11+
// query logs, and pg_stat_activity. CREATE SECRET option lists carry
12+
// credential material and must never reach a log sink.
13+
//
14+
// The fast path: when the statement head is a CREATE SECRET, everything after
15+
// the head is dropped (the option list and, for a multi-statement string, the
16+
// trailing statements). DROP variants carry only a name and pass through, as
17+
// does every non-secret single statement.
18+
//
19+
// The hardened path guards against secret DDL that is NOT the statement head,
20+
// e.g. "SELECT 1; CREATE PERSISTENT SECRET foo (...)" or
21+
// "BEGIN; CREATE SECRET ...". Such a string does not classify at its head, so
22+
// the head-only check would leak it verbatim. We therefore split on top-level
23+
// semicolons and, if ANY segment classifies as a CREATE SECRET, replace the
24+
// entire query with a fixed placeholder. Over-redaction is harmless here —
25+
// false positives only cost log fidelity, never credential exposure.
1226
//
1327
// This is driven by the same tokenizer as Classify, so the redactor can never
1428
// be out of sync with the interceptor: any whitespace/comment arrangement
1529
// Classify accepts is redacted, and the non-matching fast path is two short
1630
// case-folded keyword comparisons with no allocation.
1731
func RedactForLog(query string) string {
18-
st, headEnd, ok := parseSecretDDLHead(query)
19-
if !ok || st.Kind != KindCreate {
20-
return query
32+
if st, headEnd, ok := parseSecretDDLHead(query); ok && st.Kind == KindCreate {
33+
return strings.TrimSpace(query[:headEnd]) + " " + redactedPlaceholder
34+
}
35+
36+
// Head is not a CREATE SECRET. If the query is a single top-level
37+
// statement, there is nothing more to check — the fast path already
38+
// handled it (and DROP / non-secret pass through unchanged). Only when
39+
// there are multiple top-level statements must we scan for secret DDL
40+
// hiding behind a leading statement.
41+
if queryHasCreateSecret(query) {
42+
return redactedPlaceholder
43+
}
44+
return query
45+
}
46+
47+
// redactedErrorPlaceholder replaces an error message that may echo the text of
48+
// a CREATE SECRET statement. Engines surface parser/binder/execution errors
49+
// with the offending SQL inlined (DuckDB emits e.g. `LINE 1: ... SECRET
50+
// 'literal'`), so an error raised by secret DDL can carry the credential even
51+
// though the query attribute itself was already redacted by RedactForLog.
52+
// Logging that error verbatim leaks the secret on every failed CREATE SECRET.
53+
const redactedErrorPlaceholder = "(error redacted: statement carries secret DDL)"
54+
55+
// RedactErrorForLog returns an error message safe to log/store alongside query.
56+
// When query carries CREATE SECRET DDL anywhere (head or a later top-level
57+
// statement), the engine's error text may echo the secret literal, so the whole
58+
// message is replaced with a fixed placeholder. Over-redaction only costs
59+
// diagnostic detail, never credential exposure; errors from non-secret queries
60+
// (and empty errors) pass through unchanged.
61+
//
62+
// Callers MUST pass the original, un-redacted query — classification needs the
63+
// real statement text. Pair this with RedactForLog at every query log site that
64+
// also emits an error: RedactForLog scrubs the query attribute, RedactErrorForLog
65+
// scrubs the error attribute.
66+
func RedactErrorForLog(query, errMsg string) string {
67+
if errMsg == "" {
68+
return errMsg
69+
}
70+
if queryHasCreateSecret(query) {
71+
return redactedErrorPlaceholder
72+
}
73+
return errMsg
74+
}
75+
76+
// queryHasCreateSecret reports whether query contains a CREATE SECRET at its
77+
// head or in any top-level statement. It shares the tokenizer with RedactForLog
78+
// (parseSecretDDLHead / splitTopLevel) so the query and error redactors can
79+
// never drift apart. DROP SECRET (which carries only a name) is not a match.
80+
func queryHasCreateSecret(query string) bool {
81+
if st, _, ok := parseSecretDDLHead(query); ok && st.Kind == KindCreate {
82+
return true
83+
}
84+
// A single top-level statement whose head is not CREATE SECRET cannot hide
85+
// secret DDL; only multi-statement strings need the per-segment scan.
86+
if !hasTrailingStatement(query) {
87+
return false
88+
}
89+
for _, seg := range splitTopLevel(query) {
90+
if st, _, ok := parseSecretDDLHead(seg); ok && st.Kind == KindCreate {
91+
return true
92+
}
2193
}
22-
return strings.TrimSpace(query[:headEnd]) + " (…redacted)"
94+
return false
2395
}

0 commit comments

Comments
 (0)