@SangJunBak SangJunBak commented Dec 16, 2025

Rendered version: https://github.com/MaterializeInc/materialize/blob/52a83bcf8c3186c527dad2cf15876b46ce06fd5d/doc/developer/design/20251215_jwt_authentication.md

Motivation

Tips for reviewer

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

@jasonhernandez jasonhernandez left a comment

I've added a few discussion points. We can agree to leave the catalog as an open discussion / draft.

Comment on lines +68 to +78
### Solution proposal: A user should be unable to log in once de-provisioned. However, the database-level role should still exist.

When doing pgwire JWT authentication, we can accept a cleartext password of the form `access=<ACCESS_TOKEN>&refresh=<REFRESH_TOKEN>`, where `&` is a delimiter and `refresh=<REFRESH_TOKEN>` is optional. When the access token is close to expiration, the JWT authenticator will use the refresh token to fetch a new access token (using the token API URL in the spec above) and authenticate again. If no refresh token was provided, the session will be invalidated when the access token expires. The implementation will be very similar to how we refresh tokens for the Frontegg authenticator. This would require users to have their IdP client generate `refresh` tokens.
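
As a rough illustration of the password format described above, here is a minimal parsing sketch. The names `TokenPair` and `parse_token_password` are hypothetical and not taken from the design doc or the Materialize codebase. The sketch assumes token values never contain `&`; that holds for compact-serialized JWTs, but may not hold for opaque refresh tokens from every IdP.

```rust
/// Tokens extracted from the cleartext password field.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub struct TokenPair {
    pub access: String,
    pub refresh: Option<String>,
}

/// Parses `access=<ACCESS_TOKEN>&refresh=<REFRESH_TOKEN>`.
/// The `refresh` segment is optional; `access` is required.
pub fn parse_token_password(password: &str) -> Result<TokenPair, String> {
    let mut access: Option<String> = None;
    let mut refresh: Option<String> = None;

    // Assumption: `&` never appears inside a token value.
    for part in password.split('&') {
        // Split on the first `=` only, so a value may itself contain `=`.
        let (key, value) = part
            .split_once('=')
            .ok_or_else(|| format!("malformed segment: {part:?}"))?;
        if value.is_empty() {
            return Err(format!("empty value for key: {key:?}"));
        }
        let slot = match key {
            "access" => &mut access,
            "refresh" => &mut refresh,
            other => return Err(format!("unknown key: {other:?}")),
        };
        // Reject repeated keys rather than silently taking the last one.
        if slot.replace(value.to_string()).is_some() {
            return Err(format!("duplicate key: {key:?}"));
        }
    }

    Ok(TokenPair {
        access: access.ok_or_else(|| "missing access token".to_string())?,
        refresh,
    })
}
```

Rejecting unknown and duplicate keys is a design choice of this sketch, made so that a malformed password fails loudly instead of authenticating with an unexpected token.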

Recommending a short time to live for access tokens ensures that sessions are invalidated soon after a user is deprovisioned. Once admins deprovision a user, the next time the user tries to authenticate or refresh their access token, the token API will refuse to issue a token, so the user cannot log in, but the role will remain in the database.
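
The refresh-or-invalidate behavior described above could be expressed as a small decision function. This is a sketch under stated assumptions, not the proposed implementation: `SessionAction`, `next_action`, and the `refresh_margin` parameter (how close to expiry counts as "close to expiration") are all hypothetical names.

```rust
use std::time::{Duration, SystemTime};

/// What the authenticator should do with a session right now.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub enum SessionAction {
    /// The access token is still valid; nothing to do yet.
    Continue,
    /// Call the token API with the refresh token.
    Refresh,
    /// The access token has expired and cannot be renewed.
    Invalidate,
}

pub fn next_action(
    now: SystemTime,
    expires_at: SystemTime,
    refresh_margin: Duration,
    has_refresh_token: bool,
) -> SessionAction {
    // Time left until expiry; zero if the token has already expired.
    let remaining = expires_at.duration_since(now).unwrap_or(Duration::ZERO);

    if remaining > refresh_margin {
        SessionAction::Continue
    } else if has_refresh_token {
        // Close to (or past) expiry: try to get a new access token.
        // A deprovisioned user is rejected by the token API at this point.
        SessionAction::Refresh
    } else if remaining.is_zero() {
        // Expired with no refresh token: the session ends.
        SessionAction::Invalidate
    } else {
        // No refresh token, but the access token has not expired yet.
        SessionAction::Continue
    }
}
```

With a short access-token lifetime, the `Refresh` branch is reached often, which is what bounds how long a deprovisioned user's session can survive.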

**Alternative: Use SASL Authentication using the OAUTHBEARER mechanism rather than a cleartext password**

This would be the most Postgres-compatible approach and is what Postgres uses for its `oauth` authentication method. However, it may run into compatibility issues with clients. For example, in `psql`, there’s no obvious way of sending the bearer token directly without going through libpq's device-grant flow. Furthermore, assuming access tokens are short-lived, this could lead to poor UX given there’s no native way to re-authenticate a pgwire session. Finally, our HTTP endpoints wouldn’t be able to support this given they don’t support SASL auth.

OAUTHBEARER reference: [https://www.postgresql.org/docs/18/sasl-authentication.html#SASL-OAUTHBEARER](https://www.postgresql.org/docs/18/sasl-authentication.html#SASL-OAUTHBEARER)

I agree with this approach. I would roughly coin this as "bring your own JWT issuer / JWK"

Comment on lines +80 to +92
### Solution proposal: The end user is able to create a token to connect to Materialize via psql / postgres clients

Unfortunately, to provide a nice flow to generate the necessary access token and refresh token, we’d need to control the client. Thus we’ll leave the retrieval of the access token/refresh token to the user, similar to CockroachDB.

**Alternative: Revive the mz CLI**

We have an `mz` CLI that caters to Cloud and is no longer supported. We can potentially bring this back.

**Open question:** Is there anything we can do on our side to easily provide access tokens / refresh tokens to the user without controlling the client? This feels like the missing piece between JWT authentication and something like `aws sso login` in the AWS CLI.

### Solution proposal: The end user is able to visit the Materialize console, and sign in with their IdP

A generic Frontend SSO redirect flow would need to be implemented to retrieve an access token and refresh token. However, once the tokens are retrieved, the SQL HTTP / WS API endpoints can use bearer authorization like Cloud and accept the access token. The Console would be in charge of refreshing the access token. The Console work is out of scope for this design document.

I think we can exclude this from scope and let customers use existing tooling to get JWTs. I'm not sure this is necessary or particularly useful for customers. We might need something for internal testing, but I would start with just meeting that need for now.

Comment on lines +108 to +116
### Tests:

- Successful login (e2e mzcompose)
- Invalidating the session on access token expiration and no refresh token (Rust unit test)
- A token should successfully refresh if the access token and refresh token are valid (Rust unit test)
- Session should error if access token is invalid (Rust unit test)
- Session should error if refresh token is invalid (Rust unit test)
- De-provisioning a user should invalidate the refresh token (e2e mzcompose)
- Platform-check simple login check (platform-check framework)

Suggested change
### Tests:
- Successful login (e2e mzcompose)
- Invalidating the session on access token expiration and no refresh token (Rust unit test)
- A token should successfully refresh if the access token and refresh token are valid (Rust unit test)
- Session should error if access token is invalid (Rust unit test)
- Session should error if refresh token is invalid (Rust unit test)
- De-provisioning a user should invalidate the refresh token (e2e mzcompose)
- Platform-check simple login check (platform-check framework)
- JWTs should only be accepted when a valid JWK is set (we do not want to accept JWTs that are not signed with a real, cryptographically sound key)

Comment on lines +118 to +129
## Phase 2: Map the `admin` claim to a user’s superuser attribute

Based on the `admin` claim, we can set the `superuser` attribute we store in the catalog for password authentication. We do this as follows:

- First, in our authenticator, save `admin` inside the user’s `external_metadata`
- Next, in `handle_startup_inner()`, we diff the saved `admin` value against the user’s current superuser status and, if there’s a difference, apply the change with an `ALTER` operation. We can use `catalog_transact_with_context()` for this.
- On error (e.g. if the `ALTER` isn’t allowed), we’ll end the connection with a descriptive error message. This is similar to the pattern we use for initializing network policies.

This is similar to how we identify superusers in Frontegg auth, except we also treat it as an operation to update the catalog.
- We can keep using the session’s metadata as the source of truth to keep parity with Cloud, but eventually we’ll want to use the catalog as the source of truth for all. We can call this **out of scope.**
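
The diff step described above might look roughly like the following. This is only a sketch: `SuperuserChange` and `superuser_change` are hypothetical names, and it does not call `handle_startup_inner()` or `catalog_transact_with_context()`. It also assumes that a missing `admin` claim means "leave the catalog alone", which the design doc does not specify.

```rust
/// The catalog update implied by comparing the JWT `admin` claim with the
/// role's stored `superuser` attribute. (Hypothetical type, for illustration.)
#[derive(Debug, PartialEq)]
pub enum SuperuserChange {
    /// Claim and catalog agree (or the claim is absent): no `ALTER` needed.
    Unchanged,
    /// Claim says admin, catalog says not superuser.
    Grant,
    /// Claim says not admin, catalog says superuser.
    Revoke,
}

pub fn superuser_change(admin_claim: Option<bool>, catalog_superuser: bool) -> SuperuserChange {
    match admin_claim {
        // Assumption: an absent claim does not modify the catalog.
        None => SuperuserChange::Unchanged,
        Some(admin) if admin == catalog_superuser => SuperuserChange::Unchanged,
        Some(true) => SuperuserChange::Grant,
        Some(false) => SuperuserChange::Revoke,
    }
}
```

Only `Grant` and `Revoke` would lead to an `ALTER`; if that operation fails, the connection would be ended with an error, as described in the list above.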

Prototype: [https://github.com/MaterializeInc/materialize/pull/34372/commits](https://github.com/MaterializeInc/materialize/pull/34372/commits)

We need to be careful about cases like this:

  1. User authenticates with JWT as an admin.
  2. User has admin permissions revoked in the service issuing JWTs.
  3. User logs in with password.

How will we know that they had their admin permissions revoked? How will a customer confidently ensure that they're able to revoke admin access?

There are a few solutions:

  1. bind auth methods to user ids (i.e. they can't set / use a password after authenticating with a JWT)
  2. periodic sync (rate limits could be a concern)
  3. live validation to some external endpoint (rate limits could be a concern)
  4. don't support any heterogeneity in auth methods at all (i.e. if JWT is enabled, only accept JWTs, rely on the issuer / expiration time)

In general, I want to avoid any temptation or risk of confusion where we might rely on stale data in the catalog when another source of truth exists.
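
To make option 1 concrete, here is one possible shape for binding an auth method to a role on first login. It is a sketch for discussion only: `AuthMethod`, `AuthBindings`, and `check_and_bind` are hypothetical names, and the in-memory map stands in for wherever such a binding would actually be persisted.

```rust
use std::collections::HashMap;

/// The ways a role can authenticate. (Hypothetical type, for illustration.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AuthMethod {
    Password,
    Jwt,
}

/// Remembers which auth method each role first logged in with.
#[derive(Default)]
pub struct AuthBindings {
    bound: HashMap<String, AuthMethod>,
}

impl AuthBindings {
    /// Binds `role` to `method` on first use and rejects any later login
    /// that uses a different method.
    pub fn check_and_bind(&mut self, role: &str, method: AuthMethod) -> Result<(), String> {
        match self.bound.get(role).copied() {
            Some(existing) if existing != method => Err(format!(
                "role {role:?} is bound to {existing:?} and cannot use {method:?}"
            )),
            Some(_) => Ok(()),
            None => {
                self.bound.insert(role.to_string(), method);
                Ok(())
            }
        }
    }
}
```

Under this scheme, step 3 of the scenario above fails: a role that first authenticated with a JWT cannot later log in with a password, so stale superuser data in the catalog is never consulted for that role.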
