New Catalog Configuration Credentials#255
Conversation
789bb11 to
59242ec
Compare
adca49d to
c575a4e
Compare
c575a4e to
34d2a8c
Compare
34d2a8c to
f671ae1
Compare
bec4277 to
171a037
Compare
| /* ProcessUtility handler: protects extension-owned catalog servers */ | ||
| /* ProcessUtility handlers */ | ||
| extern PGDLLEXPORT bool ProtectExtensionCatalogServersHandler(ProcessUtilityParams *processUtilityParams, void *arg); | ||
| extern PGDLLEXPORT bool ScrubIcebergUserMappingHandler(ProcessUtilityParams *processUtilityParams, void *arg); |
There was a problem hiding this comment.
nit: RedactRestCatalogSecretsHandler and need to move to pg_lake_table extension
There was a problem hiding this comment.
RedactRestCatalogUserMappingSecrets
| """Credentials should be resolved from $PGDATA/catalogs.conf when no | ||
| user mapping exists.""" | ||
| catalogs_conf( | ||
| "test_conf_srv.client_id = 'conf-id'\n" |
There was a problem hiding this comment.
we would use conf file only for extension owned catalogs, right? (assuming snowflake ui wont enable it for user catalogs) then might be good also add rest.client_id = '' to the test.
There was a problem hiding this comment.
good point: right now the tests only use user-created server names in catalogs.conf - they're testing a path that won't happen in practice. But the point is to test the path to make sure it works for the "platform owned" catalogs. I assume both extension owned "rest" catalog and platform owned catalog will end up getting credentials from conf file, so will add the test you suggested
| /* catalogs.conf overrides GUCs but not user mapping */ | ||
| char *confClientId = NULL; | ||
| char *confClientSecret = NULL; | ||
| char *confScope = NULL; | ||
|
|
||
| if (ReadCatalogsConfCredentials(serverName, | ||
| &confClientId, &confClientSecret, | ||
| &confScope)) |
There was a problem hiding this comment.
better to move this above usermapping even if we have null checks. (user mapping should be checked the last to override all)
74860c2 to
9b17dc3
Compare
baf1541 to
2c7435f
Compare
ba29db5 to
dfd7536
Compare
dfd7536 to
36b145f
Compare
| } | ||
|
|
||
|
|
||
| #define CATALOGS_CONF_FILENAME "catalogs.conf" |
There was a problem hiding this comment.
it could be nice to move this out of the database directory, such that secrets don't end up in backups. Maybe we need a setting here.
There was a problem hiding this comment.
Makes sense, you mean a GUC_SUPERUSER_ONLY that defaults to $PGDATA/catalogs.conf but can be set to any absolute path
There was a problem hiding this comment.
I am not sure if the default should point to $pgdata. I think it would be better to set it empty by default.
b5192a9 to
3aa4348
Compare
2c7435f to
0b63f97
Compare
| if (def->location < 0) | ||
| continue; | ||
|
|
||
| char *p = (char *) queryString + def->location; |
There was a problem hiding this comment.
minor: would be nice to use currentChar instead of p, much easier to search for
There was a problem hiding this comment.
do these need to be removed?
There was a problem hiding this comment.
Yes, I noted in the PR description that this branch is not yet updated with the latest changes in "Create server" branch.
3aa4348 to
75c3869
Compare
75c3869 to
26cf127
Compare
bff4115 to
06c28a2
Compare
| if (!IsIcebergCatalogServer(serverName)) | ||
| return false; | ||
|
|
||
| ScrubUserMappingSecrets(processUtilityParams->queryString, options); |
There was a problem hiding this comment.
nit: redact sounds more common than scrub in this context.
| static List * | ||
| LookupUserMappingOptions(Oid serverid) | ||
| { | ||
| HeapTuple tp; |
There was a problem hiding this comment.
postgres already have UserMapping * GetUserMapping(Oid userid, Oid serverid). Can we reuse it?
There was a problem hiding this comment.
GetUserMapping(userid, serverid) raises an ERROR if no mapping is found for the user or PUBLIC. LookupUserMappingOptions(serverid) allows the caller to gracefully fall through to lower-priority credential sources.
| char **clientId, char **clientSecret, | ||
| char **scope) |
There was a problem hiding this comment.
RestCatalogOptions struct can have a substruct called RestCatalogSecrets which contains user mapping options.
There was a problem hiding this comment.
hmm, we could technically have multiple user mappings per server, not sure if I'd go with a substruct.
Let me know if you feel strongly about this.
92f71ed to
f1bb78f
Compare
61b714f to
507450b
Compare
5e2d2a1 to
70891ec
Compare
a0e5ae4 to
87dbbc1
Compare
Until now an iceberg_catalog server carried its own client_id /
client_secret on the SERVER row. That gave every role on the
cluster the same credentials and forced operators to recreate the
server when secrets rotated. Move credentials to per-user state
and add a platform-provided file as a middle layer.
Credential resolution (lowest to highest priority):
1. pg_lake_iceberg.rest_catalog_* GUC defaults
2. iceberg_catalog SERVER options (non-secret only)
3. $PGDATA/catalogs.conf (user-created servers only)
4. pg_user_mapping options (user-created servers only)
The built-in pg_lake_rest_catalog (catalog='rest') stops after step 2
on purpose: its credentials live exclusively in the GUCs so the
built-in stays a single, global, instance-wide configuration with no
hidden per-user view. CREATE/ALTER USER MAPPING is rejected against
all three built-in long names; catalogs.conf is also ignored for the
built-in 'rest' catalog.
Notable pieces:
* iceberg_catalog_option_descs[] now carries a CATALOG_OPT_CTX_*
bitmask per option. The same table drives the validator, the
per-context "Valid options are: ..." hint, and the option->struct
applier. client_id / client_secret are USER MAPPING-only; scope is
accepted on both, with the USER MAPPING value winning because it is
applied last during resolution.
* RestCatalogOptions gains a umid field; the token cache key is now
(serverOid, umid) so different SET ROLEs in the same backend each
get their own user mapping's credentials. A USERMAPPINGOID syscache
callback invalidates cached tokens on CREATE/ALTER/DROP USER MAPPING.
* ValidateRestCatalogOptions now performs an early auth-type-aware
credentials check at resolution time:
- client_secret is always required
- client_id is required unless rest_auth_type='horizon'
Missing credentials surface as "no credentials found for REST
catalog ..." with ERRCODE_FDW_OPTION_NAME_NOT_FOUND. The per-field
checks inside FetchRestCatalogAccessToken are kept as defense in
depth and now carry the same errcode.
* New ProcessUtility handler RedactRestCatalogUserMappingSecrets
scrubs client_id / client_secret from queryString in place on
CREATE/ALTER USER MAPPING for any iceberg_catalog server (built-in
long names included). Handles plain '', E'', and U&'' literal
forms with their escape rules. Registered after the DDL validator
so it runs first (the handler list is prepend-LIFO), ensuring the
failing built-in-server path never leaks secrets into the ereport
context. DDL itself reads option values from DefElem->arg, so
pg_user_mapping still stores plaintext credentials -- only the
query string surfaces (pg_stat_statements, log_min_duration_statement,
ereport context) see the redacted form.
* New PGC_SIGHUP GUC pg_lake_iceberg.catalogs_conf_path lets
operators point at an absolute path; the default is the relative
'catalogs.conf' resolved against DataDir.
Tests:
* test_iceberg_catalog_server.py picks up the user-mapping DDL
surface (per-context option lists and hints, valid/invalid options
on each side, FOR CURRENT_USER with all three options,
per-role mappings on the same server, dependency-driven rejections,
built-in long-name blocks), the redaction handler (CREATE/ALTER,
'' escape, E'' escape, scope preservation, non-iceberg FDW skip,
built-in-rejection-after-redaction, plaintext storage preservation),
and credential resolution via catalogs.conf (file-only success,
USER MAPPING wins over the file, no-credentials-anywhere error,
scope from the file, absolute catalogs_conf_path, built-in
ignores the file).
* test_modify_iceberg_rest_table.py: client_id / client_secret are
dropped from the server-option-overrides-GUC parametrization (they
no longer belong on SERVER), and a new parametrized
test_user_mapping_credential_overrides_guc covers the same
resolution-order direction with USER MAPPING in step 4.
* test_writable_iceberg_common.py: the shared fixture for the writable
user-created REST server now puts credentials on a PUBLIC user
mapping, and drops the server with CASCADE to sweep up the mapping.
Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
507450b to
8ed0b8e
Compare
|
|
||
| if (!IsBuiltinCatalogServerName(serverName)) | ||
| { | ||
| ApplyCatalogsConfOverrides(opts, serverName); |
There was a problem hiding this comment.
I'm surprised the config file overrides the server-level options. Is that intentional?
| &clientId, &clientSecret, &scope)) | ||
| return; | ||
|
|
||
| if (clientId != NULL) |
There was a problem hiding this comment.
I find it a bit unintuitive to guess which server settings are configurable via catalogs.json. Is it useful to allow all fo them?
Description
This change moves credential storage (
client_id,client_secret) out ofCREATE SERVERoptions where they are publicly readable, into two secure mechanisms:CREATE USER MAPPING— per-user credentials stored inpg_user_mapping, providing user-level isolation.catalogs.conf— a file-based credential store for platform-provided catalogs, inaccessible via SQL. The path is configurable viapg_lake_iceberg.catalogs_conf_path.GUCs remain as a final fallback for backward compatibility.
Credential Resolution Order
GetRestCatalogConnectionFromServernow resolves credentials in two phases:Phase 1 — Non-secret server options (unchanged from prior PR): server options override GUCs for
rest_endpoint,scope,rest_auth_type,oauth_endpoint,enable_vended_credentials.Phase 2 — Credentials (
client_id,client_secret) andscope:CREATE USER MAPPINGpg_user_mappingsyscache lookup for current user, fallback to PUBLIC$PGDATA/catalogs.confParseConfigFp; dotted keys likeserver.client_id = '...'rest_catalog_client_id,rest_catalog_client_secret(set during initialization)User mapping values overwrite directly.
catalogs.confvalues fill in only what the user mapping didn't provide. GUCs are the implicit fallback from initialization. An error is raised if no credentials are found after all three sources.scopeis accepted in both server options and user mapping options. The effective priority is: user mapping > catalogs.conf > server options > GUC.LookupUserMappingOptionsNew static function that looks up
pg_user_mappingvia the syscache. It checks for the current user first (GetUserId()), then falls back to PUBLIC (InvalidOid). Returns the untransformed option list orNILif no mapping exists. This avoids theERRORthat PostgreSQL'sGetUserMappingraises when no mapping is found.ReadCatalogsConfCredentialsNew static function that reads the catalog credentials file using PostgreSQL's
ParseConfigFp. The file path is controlled bypg_lake_iceberg.catalogs_conf_path(defaults tocatalogs.conf, resolved to$PGDATA/catalogs.conf). Absolute paths are used as-is. The file uses standardkey = valueformat with dotted keys:The function matches entries where the key prefix equals the server name. It re-reads the file on each call for simplicity - these lookups happen once per REST catalog operation, not per row. Returns
falseif the file doesn'texist (ENOENT is silently ignored).
Query String Redacting
Credentials in
CREATE USER MAPPINGandALTER USER MAPPINGDDL appear in plaintext inpg_stat_statements. To mitigate this, a newProcessUtilityhandler (RedactRestCatalogUserMappingSecrets) redactsclient_idandclient_secretvalues in the query string in-place before the statement is passed down the hook chain.How it works
RedactRestCatalogUserMappingSecretslocates each secret option in the query string usingDefElem.location(the source position recorded by the parser), finds the quoted value, and replaces every character between the quotes with*. It handles''escape sequences. Non-secret options (likescope) are left untouched. The actual DDL execution is unaffected because PostgreSQL reads option values from the parse tree (DefElemnodes), not fromqueryString.RedactRestCatalogUserMappingSecretsis registered inpg_lake_table's_PG_initviaRegisterUtilityStatementHandler.Validator Changes
The
iceberg_catalog_validatornow distinguishes between server and user mapping contexts:rest_endpoint,scope,rest_auth_type,oauth_endpoint,enable_vended_credentials,location_prefix,catalog_name.client_id,client_secret,scope.The internal helper was refactored from
is_valid_iceberg_catalog_optiontois_valid_option_in_listwhich takes the option list as a parameter, reused for both server and user mapping validation.Test Coverage
The test file (
test_iceberg_catalog_server.py) was extended with tests covering user mapping validation, credential resolution & query string redacting. Acatalogs_confpytest fixture handles creating/restoring$PGDATA/catalogs.confduring tests. Existing tests that usedclient_id/client_secretinCREATE SERVER OPTIONSwere updated to useCREATE USER MAPPINGinstead.Fixes remaining part of #230
Checklist