New Catalog Configuration Credentials#255

Open
sfc-gh-npuka wants to merge 1 commit into
naisila/catalog_reconfigfrom
naisila/catalog_user_mapping

Collaborator

@sfc-gh-npuka sfc-gh-npuka commented Mar 9, 2026

Description

This change moves credential storage (client_id, client_secret) out of CREATE SERVER options, where they are publicly readable, and into two secure mechanisms:

  1. CREATE USER MAPPING — per-user credentials stored in pg_user_mapping, providing user-level isolation.
  2. catalogs.conf — a file-based credential store for platform-provided catalogs, inaccessible via SQL. The path is configurable via pg_lake_iceberg.catalogs_conf_path.

GUCs remain as a final fallback for backward compatibility.

Credential Resolution Order

GetRestCatalogConnectionFromServer now resolves credentials in two phases:

Phase 1 — Non-secret server options (unchanged from prior PR): server options override GUCs for rest_endpoint, scope, rest_auth_type, oauth_endpoint, enable_vended_credentials.

Phase 2 — Credentials (client_id, client_secret) and scope:

| Priority | Source | Mechanism |
|---|---|---|
| 1 (highest) | CREATE USER MAPPING | pg_user_mapping syscache lookup for current user, fallback to PUBLIC |
| 2 | $PGDATA/catalogs.conf | File parsed via ParseConfigFp; dotted keys like server.client_id = '...' |
| 3 (lowest) | GUCs | rest_catalog_client_id, rest_catalog_client_secret (set during initialization) |

User mapping values overwrite directly. catalogs.conf values fill in only what the user mapping didn't provide. GUCs are the implicit fallback from initialization. An error is raised if no credentials are found after all three sources.

scope is accepted in both server options and user mapping options. The effective priority is: user mapping > catalogs.conf > server options > GUC.
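
The priority order above can be modeled in a few lines. This is a minimal Python sketch of the described behavior, not the C code in rest_catalog.c. The function name `resolve_credentials`, the dict-based inputs, and the exact error condition are assumptions made for illustration; `phase1_defaults` stands in for whatever values remain after Phase 1.

```python
from typing import Mapping, Optional


def resolve_credentials(
    user_mapping: Optional[Mapping[str, str]],
    catalogs_conf: Optional[Mapping[str, str]],
    phase1_defaults: Mapping[str, str],
) -> dict:
    """Merge credential sources; a higher-priority source wins per key."""
    resolved = {}
    # Apply the lowest priority first so later sources overwrite earlier ones.
    for source in (phase1_defaults, catalogs_conf or {}, user_mapping or {}):
        for key in ("client_id", "client_secret", "scope"):
            value = source.get(key)
            if value:  # skip unset or empty values
                resolved[key] = value

    # Assumption: "no credentials" means neither credential field was found.
    if "client_id" not in resolved and "client_secret" not in resolved:
        raise LookupError("no credentials found for REST catalog")
    return resolved
```

Applying sources from lowest to highest priority gives the same result as "overwrite from the user mapping, then fill gaps from catalogs.conf", while keeping the merge loop uniform.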

LookupUserMappingOptions

New static function that looks up pg_user_mapping via the syscache. It checks for the current user first (GetUserId()), then falls back to PUBLIC (InvalidOid). Returns the untransformed option list or NIL if no mapping exists. This avoids the ERROR that PostgreSQL's GetUserMapping raises when no mapping is found.
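
The lookup order can be pictured with a small Python model. It is only a sketch: the dictionary stands in for the pg_user_mapping syscache, and the names `lookup_user_mapping_options` and `UserMappings` are hypothetical. The one PostgreSQL fact it relies on is that InvalidOid is 0.

```python
from typing import Dict, List, Optional, Tuple

PUBLIC = 0  # InvalidOid, which pg_user_mapping uses to mean PUBLIC

# Hypothetical stand-in for the syscache: (user_oid, server_oid) -> options
UserMappings = Dict[Tuple[int, int], List[str]]


def lookup_user_mapping_options(
    mappings: UserMappings, current_user: int, server_oid: int
) -> Optional[List[str]]:
    """Return the options for current_user, else for PUBLIC, else None."""
    for user_oid in (current_user, PUBLIC):
        options = mappings.get((user_oid, server_oid))
        if options is not None:
            return options
    # No mapping at all: report "nothing found" instead of raising, so the
    # caller can fall through to the next credential source.
    return None
```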

ReadCatalogsConfCredentials

New static function that reads the catalog credentials file using PostgreSQL's ParseConfigFp. The file path is controlled by pg_lake_iceberg.catalogs_conf_path (defaults to catalogs.conf, resolved to $PGDATA/catalogs.conf). Absolute paths are used as-is. The file uses standard key = value format with dotted keys:

horizon.client_id = 'platform_id'
horizon.client_secret = 'platform_secret'
horizon.scope = 'PRINCIPAL_ROLE:ALL'

The function matches entries whose key prefix equals the server name. It re-reads the file on each call for simplicity; these lookups happen once per REST catalog operation, not per row. It returns false if the file doesn't exist (ENOENT is silently ignored).
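
The prefix matching can be sketched in Python as follows. This is an illustration under stated assumptions, not the PR's parser: the real code uses ParseConfigFp, whose grammar is richer than the simplified regular expression here, and the function name `read_catalog_credentials` is hypothetical. The sketch takes the file content as a string, so the missing-file case is left to the caller.

```python
import re
from typing import Dict

# Simplified line grammar: name = 'value' or name = value, optional comment.
# PostgreSQL's configuration parser accepts more than this sketch does.
_LINE = re.compile(
    r"^\s*([A-Za-z_][\w.]*)\s*=\s*(?:'((?:[^']|'')*)'|(\S+))\s*(?:#.*)?$"
)


def read_catalog_credentials(text: str, server_name: str) -> Dict[str, str]:
    """Collect '<server_name>.<option>' entries from catalogs.conf-style text."""
    prefix = server_name + "."
    found: Dict[str, str] = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match is None:
            continue  # blank line, comment, or something this sketch ignores
        key, quoted, bare = match.groups()
        if not key.startswith(prefix):
            continue  # entry belongs to a different server
        # A doubled quote inside a quoted value stands for one quote.
        value = quoted.replace("''", "'") if quoted is not None else bare
        found[key[len(prefix):]] = value
    return found
```

Matching on `server_name + "."` rather than on the bare name keeps a server called `horizon` from picking up keys written for `horizon2`.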

Query String Redacting

Credentials in CREATE USER MAPPING and ALTER USER MAPPING DDL appear in plaintext in pg_stat_statements. To mitigate this, a new ProcessUtility handler (RedactRestCatalogUserMappingSecrets) redacts client_id and client_secret values in the query string in-place before the statement is passed down the hook chain.

How it works

RedactRestCatalogUserMappingSecrets locates each secret option in the query string using DefElem.location (the source position recorded by the parser), finds the quoted value, and replaces every character between the quotes with *. It handles '' escape sequences. Non-secret options (like scope) are left untouched. The actual DDL execution is unaffected because PostgreSQL reads option values from the parse tree (DefElem nodes), not from queryString. RedactRestCatalogUserMappingSecrets is registered in pg_lake_table's _PG_init via RegisterUtilityStatementHandler.
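
A minimal Python sketch of the masking step is shown here. It models only the plain `'...'` literal form with `''` escapes, and the function name `redact_quoted_value` is an assumption. Python strings are immutable, so the sketch returns a new string of the same length, whereas the handler described above edits the buffer in place.

```python
def redact_quoted_value(query: str, location: int) -> str:
    """Mask the first single-quoted literal at or after `location` with '*'."""
    chars = list(query)
    i = query.find("'", location)
    if i < 0:
        return query  # no quoted value to redact
    i += 1  # step past the opening quote
    while i < len(chars):
        if chars[i] == "'":
            if i + 1 < len(chars) and chars[i + 1] == "'":
                # '' is an escaped quote inside the literal: mask both characters
                chars[i] = chars[i + 1] = "*"
                i += 2
                continue
            break  # closing quote reached
        chars[i] = "*"
        i += 1
    return "".join(chars)
```

Because every masked character is replaced one for one, the positions of later options in the string do not shift, so the remaining options can still be located by their recorded offsets.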

Validator Changes

The iceberg_catalog_validator now distinguishes between server and user mapping contexts:

  • Server options: rest_endpoint, scope, rest_auth_type, oauth_endpoint, enable_vended_credentials, location_prefix, catalog_name.
  • User mapping options: client_id, client_secret, scope.

The internal helper was refactored from is_valid_iceberg_catalog_option to is_valid_option_in_list which takes the option list as a parameter, reused for both server and user mapping validation.
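
The per-context validation can be sketched in Python as below. The option lists are copied from the bullets above; the `validate_options` wrapper, the `context` strings, and the error text are assumptions for illustration and do not reproduce the C validator.

```python
SERVER_OPTIONS = (
    "rest_endpoint", "scope", "rest_auth_type", "oauth_endpoint",
    "enable_vended_credentials", "location_prefix", "catalog_name",
)
USER_MAPPING_OPTIONS = ("client_id", "client_secret", "scope")


def is_valid_option_in_list(option, valid_options):
    """Shared helper: the caller decides which list applies."""
    return option in valid_options


def validate_options(option_names, context):
    """Raise ValueError for the first option not allowed in this context."""
    valid = SERVER_OPTIONS if context == "server" else USER_MAPPING_OPTIONS
    for name in option_names:
        if not is_valid_option_in_list(name, valid):
            hint = ", ".join(valid)
            raise ValueError(f"invalid option {name!r}; valid options are: {hint}")
```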

Test Coverage

The test file (test_iceberg_catalog_server.py) was extended with tests covering user mapping validation, credential resolution & query string redacting. A catalogs_conf pytest fixture handles creating/restoring $PGDATA/catalogs.conf during tests. Existing tests that used client_id/client_secret in CREATE SERVER OPTIONS were updated to use CREATE USER MAPPING instead.
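
The create-and-restore behavior of such a fixture might look like the following sketch. It is written as a plain context manager so that it runs without pytest; the name `catalogs_conf_file` and its signature are hypothetical and are not taken from the test file.

```python
import os
from contextlib import contextmanager


@contextmanager
def catalogs_conf_file(pgdata: str, content: str):
    """Write catalogs.conf under pgdata, then restore the previous state."""
    path = os.path.join(pgdata, "catalogs.conf")
    previous = None
    if os.path.exists(path):
        with open(path) as f:
            previous = f.read()  # remember the file that was already there
    with open(path, "w") as f:
        f.write(content)
    try:
        yield path
    finally:
        if previous is None:
            os.remove(path)  # there was no file before the test
        else:
            with open(path, "w") as f:
                f.write(previous)
```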

Fixes remaining part of #230


Checklist

  • Redacting queryString in ProcessUtility hook
  • I have tested my changes and added tests if necessary
  • I updated documentation if needed
  • I confirm that all my commits are signed off (DCO)

@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch 2 times, most recently from 789bb11 to 59242ec Compare March 9, 2026 12:28
@sfc-gh-npuka sfc-gh-npuka linked an issue Mar 9, 2026 that may be closed by this pull request
8 tasks
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch 2 times, most recently from adca49d to c575a4e Compare March 10, 2026 12:49
@sfc-gh-npuka sfc-gh-npuka changed the base branch from naisila/catalog_reconfig to naisila/backup March 12, 2026 09:00
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch from c575a4e to 34d2a8c Compare March 12, 2026 09:00
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch from 34d2a8c to f671ae1 Compare March 12, 2026 16:19
@sfc-gh-npuka sfc-gh-npuka changed the base branch from naisila/backup to naisila/catalog_reconfig March 12, 2026 16:19
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch 2 times, most recently from bec4277 to 171a037 Compare March 12, 2026 16:47
@sfc-gh-npuka sfc-gh-npuka marked this pull request as ready for review March 13, 2026 09:15
/* ProcessUtility handler: protects extension-owned catalog servers */
/* ProcessUtility handlers */
extern PGDLLEXPORT bool ProtectExtensionCatalogServersHandler(ProcessUtilityParams *processUtilityParams, void *arg);
extern PGDLLEXPORT bool ScrubIcebergUserMappingHandler(ProcessUtilityParams *processUtilityParams, void *arg);
Collaborator

nit: name it RedactRestCatalogSecretsHandler; it also needs to move to the pg_lake_table extension

Collaborator Author

RedactRestCatalogUserMappingSecrets

"""Credentials should be resolved from $PGDATA/catalogs.conf when no
user mapping exists."""
catalogs_conf(
"test_conf_srv.client_id = 'conf-id'\n"
Collaborator

We would use the conf file only for extension-owned catalogs, right? (Assuming the Snowflake UI won't enable it for user catalogs.) Then it might be good to also add rest.client_id = '' to the test.

Collaborator Author

Good point: right now the tests only use user-created server names in catalogs.conf, so they're testing a path that won't happen in practice. But the point is to exercise that path to make sure it works for the "platform-owned" catalogs. I assume both the extension-owned "rest" catalog and platform-owned catalogs will end up getting credentials from the conf file, so I will add the test you suggested.

Comment thread pg_lake_iceberg/src/rest_catalog/rest_catalog.c Outdated
Comment thread pg_lake_iceberg/src/rest_catalog/rest_catalog.c
Comment on lines +644 to +651
/* catalogs.conf overrides GUCs but not user mapping */
char *confClientId = NULL;
char *confClientSecret = NULL;
char *confScope = NULL;

if (ReadCatalogsConfCredentials(serverName,
                                &confClientId, &confClientSecret,
                                &confScope))
Collaborator

Better to move this above the user mapping lookup, even if we have null checks (the user mapping should be checked last so that it overrides everything else).

@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch from 74860c2 to 9b17dc3 Compare March 14, 2026 16:28
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_reconfig branch from baf1541 to 2c7435f Compare March 14, 2026 16:33
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch 2 times, most recently from ba29db5 to dfd7536 Compare March 14, 2026 16:55
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch from dfd7536 to 36b145f Compare March 14, 2026 16:58
}


#define CATALOGS_CONF_FILENAME "catalogs.conf"
Collaborator

it could be nice to move this out of the database directory, such that secrets don't end up in backups. Maybe we need a setting here.

Collaborator Author

Makes sense. You mean a GUC_SUPERUSER_ONLY setting that defaults to $PGDATA/catalogs.conf but can be set to any absolute path?

Collaborator Author

@sfc-gh-npuka sfc-gh-npuka Mar 16, 2026

Collaborator

@sfc-gh-abozkurt sfc-gh-abozkurt Mar 30, 2026

I am not sure the default should point to $PGDATA. I think it would be better to leave it empty by default.

@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch 3 times, most recently from b5192a9 to 3aa4348 Compare March 16, 2026 20:49
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_reconfig branch from 2c7435f to 0b63f97 Compare March 23, 2026 09:08
@sfc-gh-npuka sfc-gh-npuka changed the base branch from naisila/catalog_reconfig to naisila/backup2 March 23, 2026 09:09
if (def->location < 0)
    continue;

char *p = (char *) queryString + def->location;
Collaborator

minor: would be nice to use currentChar instead of p, much easier to search for

Collaborator

do these need to be removed?

Collaborator Author

Yes, I noted in the PR description that this branch is not yet updated with the latest changes in "Create server" branch.

Collaborator Author

PR updated

@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch from 3aa4348 to 75c3869 Compare March 25, 2026 22:17
@sfc-gh-npuka sfc-gh-npuka changed the base branch from naisila/backup2 to naisila/catalog_reconfig March 25, 2026 22:17
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch from 75c3869 to 26cf127 Compare March 25, 2026 22:18
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch 2 times, most recently from bff4115 to 06c28a2 Compare March 26, 2026 08:14
if (!IsIcebergCatalogServer(serverName))
    return false;

ScrubUserMappingSecrets(processUtilityParams->queryString, options);
Collaborator

nit: redact sounds more common than scrub in this context.

static List *
LookupUserMappingOptions(Oid serverid)
{
    HeapTuple tp;
Collaborator

PostgreSQL already has UserMapping *GetUserMapping(Oid userid, Oid serverid). Can we reuse it?

Collaborator Author

@sfc-gh-npuka sfc-gh-npuka Apr 16, 2026

GetUserMapping(userid, serverid) raises an ERROR if no mapping is found for the user or PUBLIC. LookupUserMappingOptions(serverid) allows the caller to gracefully fall through to lower-priority credential sources.

Comment on lines +458 to +459
char **clientId, char **clientSecret,
char **scope)
Collaborator

RestCatalogOptions struct can have a substruct called RestCatalogSecrets which contains user mapping options.

Collaborator Author

hmm, we could technically have multiple user mappings per server, not sure if I'd go with a substruct.
Let me know if you feel strongly about this.

@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_reconfig branch 2 times, most recently from 92f71ed to f1bb78f Compare April 16, 2026 12:48
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch 2 times, most recently from 61b714f to 507450b Compare April 16, 2026 16:14
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_reconfig branch 3 times, most recently from 5e2d2a1 to 70891ec Compare April 21, 2026 16:28
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_reconfig branch 2 times, most recently from a0e5ae4 to 87dbbc1 Compare May 26, 2026 11:10
Until now an iceberg_catalog server carried its own client_id /
client_secret on the SERVER row.  That gave every role on the
cluster the same credentials and forced operators to recreate the
server when secrets rotated.  Move credentials to per-user state
and add a platform-provided file as a middle layer.

Credential resolution (lowest to highest priority):

  1. pg_lake_iceberg.rest_catalog_* GUC defaults
  2. iceberg_catalog SERVER options (non-secret only)
  3. $PGDATA/catalogs.conf            (user-created servers only)
  4. pg_user_mapping options          (user-created servers only)

The built-in pg_lake_rest_catalog (catalog='rest') stops after step 2
on purpose: its credentials live exclusively in the GUCs so the
built-in stays a single, global, instance-wide configuration with no
hidden per-user view.  CREATE/ALTER USER MAPPING is rejected against
all three built-in long names; catalogs.conf is also ignored for the
built-in 'rest' catalog.
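
The four-step order and the built-in exception described above can be modeled as a layered merge. This is a sketch only: the function name `resolve_options`, the `is_builtin` flag, and the dict inputs are assumptions, and the real code works on a RestCatalogOptions struct rather than on dictionaries.

```python
def resolve_options(is_builtin, gucs, server_options, conf_entries, user_mapping):
    """Apply sources from lowest to highest priority; later layers win."""
    opts = dict(gucs)            # step 1: GUC defaults
    opts.update(server_options)  # step 2: non-secret SERVER options
    if is_builtin:
        # The built-in catalog stops here: its credentials come from GUCs only.
        return opts
    opts.update(conf_entries)    # step 3: catalogs.conf
    opts.update(user_mapping)    # step 4: pg_user_mapping options
    return opts
```

Writing the layers in ascending priority means no per-field null checks are needed, because each `update` only touches the keys that the layer actually provides.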

Notable pieces:

* iceberg_catalog_option_descs[] now carries a CATALOG_OPT_CTX_*
  bitmask per option.  The same table drives the validator, the
  per-context "Valid options are: ..." hint, and the option->struct
  applier.  client_id / client_secret are USER MAPPING-only; scope is
  accepted on both, with the USER MAPPING value winning because it is
  applied last during resolution.

* RestCatalogOptions gains a umid field; the token cache key is now
  (serverOid, umid) so different SET ROLEs in the same backend each
  get their own user mapping's credentials.  A USERMAPPINGOID syscache
  callback invalidates cached tokens on CREATE/ALTER/DROP USER MAPPING.

* ValidateRestCatalogOptions now performs an early auth-type-aware
  credentials check at resolution time:
    - client_secret is always required
    - client_id is required unless rest_auth_type='horizon'
  Missing credentials surface as "no credentials found for REST
  catalog ..." with ERRCODE_FDW_OPTION_NAME_NOT_FOUND.  The per-field
  checks inside FetchRestCatalogAccessToken are kept as defense in
  depth and now carry the same errcode.

* New ProcessUtility handler RedactRestCatalogUserMappingSecrets
  scrubs client_id / client_secret from queryString in place on
  CREATE/ALTER USER MAPPING for any iceberg_catalog server (built-in
  long names included).  Handles plain '', E'', and U&'' literal
  forms with their escape rules.  Registered after the DDL validator
  so it runs first (the handler list is prepend-LIFO), ensuring the
  failing built-in-server path never leaks secrets into the ereport
  context.  DDL itself reads option values from DefElem->arg, so
  pg_user_mapping still stores plaintext credentials -- only the
  query string surfaces (pg_stat_statements, log_min_duration_statement,
  ereport context) see the redacted form.

* New PGC_SIGHUP GUC pg_lake_iceberg.catalogs_conf_path lets
  operators point at an absolute path; the default is the relative
  'catalogs.conf' resolved against DataDir.
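
The (serverOid, umid) token cache key from the notes above can be illustrated with a small Python model. The class name `TokenCache`, the `fetch` callable, and the choice to drop every entry on invalidation are assumptions; the actual syscache callback may be more selective.

```python
from typing import Callable, Dict, Tuple


class TokenCache:
    """Cache access tokens per (server_oid, umid) pair."""

    def __init__(self, fetch: Callable[[int, int], str]):
        self._fetch = fetch
        self._tokens: Dict[Tuple[int, int], str] = {}

    def get(self, server_oid: int, umid: int) -> str:
        key = (server_oid, umid)
        if key not in self._tokens:
            # A different user mapping on the same server gets its own token.
            self._tokens[key] = self._fetch(server_oid, umid)
        return self._tokens[key]

    def invalidate_all(self) -> None:
        # Models the invalidation callback: forget every cached token so the
        # next lookup fetches again with current credentials.
        self._tokens.clear()
```

Keying on the server alone would let a role reuse a token obtained under another role's mapping after SET ROLE in the same backend; including umid in the key avoids that.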

Tests:

* test_iceberg_catalog_server.py picks up the user-mapping DDL
  surface (per-context option lists and hints, valid/invalid options
  on each side, FOR CURRENT_USER with all three options,
  per-role mappings on the same server, dependency-driven rejections,
  built-in long-name blocks), the redaction handler (CREATE/ALTER,
  '' escape, E'' escape, scope preservation, non-iceberg FDW skip,
  built-in-rejection-after-redaction, plaintext storage preservation),
  and credential resolution via catalogs.conf (file-only success,
  USER MAPPING wins over the file, no-credentials-anywhere error,
  scope from the file, absolute catalogs_conf_path, built-in
  ignores the file).

* test_modify_iceberg_rest_table.py: client_id / client_secret are
  dropped from the server-option-overrides-GUC parametrization (they
  no longer belong on SERVER), and a new parametrized
  test_user_mapping_credential_overrides_guc covers the same
  resolution-order direction with USER MAPPING in step 4.

* test_writable_iceberg_common.py: the shared fixture for the writable
  user-created REST server now puts credentials on a PUBLIC user
  mapping, and drops the server with CASCADE to sweep up the mapping.

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: sfc-gh-npuka <naisila.puka@snowflake.com>
@sfc-gh-npuka sfc-gh-npuka force-pushed the naisila/catalog_user_mapping branch from 507450b to 8ed0b8e Compare May 26, 2026 15:21

if (!IsBuiltinCatalogServerName(serverName))
{
    ApplyCatalogsConfOverrides(opts, serverName);
Collaborator

I'm surprised the config file overrides the server-level options. Is that intentional?

&clientId, &clientSecret, &scope))
return;

if (clientId != NULL)
Collaborator

I find it a bit unintuitive to guess which server settings are configurable via catalogs.conf. Is it useful to allow all of them?

Development

Successfully merging this pull request may close these issues.

New Catalog Configuration via CREATE SERVER

3 participants