feat(bff): introduce support of multiple auth methods (internal, user_token) #918

ederign · 2025-03-29T18:45:43Z

Description

This PR introduces a pluggable Kubernetes client authentication mechanism for the Model Registry UI BFF, enabling support for two modes:

internal (default and current one) — uses the credentials of the backend service:
- In-cluster: pod service account
- Local dev: current kubeconfig user context
- Identity is inferred from kubeflow-userid and kubeflow-groups headers
user_token — uses a user-provided token from the Authorization: Bearer <your-token> header
- Used to create a SelfSubjectAccessReview
- Identity is the token itself

To support this, I introduced a KubernetesClientFactory abstraction that cleanly encapsulates all logic related to Kubernetes client creation. This avoids leaking the client instance throughout the codebase, enables per-request instantiation for token-based clients, and simplifies mocking in tests. The result is a more modular, secure, and testable architecture.
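As a rough illustration of the factory idea (a sketch under assumptions, not the code in this PR): the names KubernetesClientFactory, StaticClientFactory, TokenClientFactory, and RequestIdentity come from the PR, while the `KubernetesClient` interface, `fakeClient`, and the `GetClient(identity)` signature are hypothetical and exist only to show the shared-client versus per-request-client split.

```go
package main

import (
	"errors"
	"fmt"
)

// RequestIdentity mirrors the struct quoted later in the review thread.
type RequestIdentity struct {
	UserID string
	Groups []string
	Token  string
}

// KubernetesClient is a hypothetical stand-in for the real client interface.
type KubernetesClient interface {
	Credential() string
}

// fakeClient only records which credential it would use; it never talks to a cluster.
type fakeClient struct{ credential string }

func (c *fakeClient) Credential() string { return c.credential }

// KubernetesClientFactory hides how a client is obtained for a request.
// The signature is an assumption made for this sketch.
type KubernetesClientFactory interface {
	GetClient(identity *RequestIdentity) (KubernetesClient, error)
}

// StaticClientFactory hands out one shared client (internal mode).
type StaticClientFactory struct{ shared KubernetesClient }

func NewStaticClientFactory() *StaticClientFactory {
	return &StaticClientFactory{shared: &fakeClient{credential: "backend service account"}}
}

func (f *StaticClientFactory) GetClient(*RequestIdentity) (KubernetesClient, error) {
	return f.shared, nil
}

// TokenClientFactory builds a fresh client on every call (user_token mode).
type TokenClientFactory struct{}

func (TokenClientFactory) GetClient(identity *RequestIdentity) (KubernetesClient, error) {
	if identity == nil || identity.Token == "" {
		return nil, errors.New("request identity carries no token")
	}
	return &fakeClient{credential: "user token"}, nil
}

func main() {
	factories := []KubernetesClientFactory{NewStaticClientFactory(), TokenClientFactory{}}
	for _, f := range factories {
		client, err := f.GetClient(&RequestIdentity{UserID: "dora", Token: "FAKE_DORA_TOKEN"})
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%T -> %s\n", f, client.Credential())
	}
}
```

Handlers written against the interface do not need to know which mode is active, which is also what makes swapping in a mocked factory in tests straightforward.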

File-by-file breakdown of significant changes

Makefile

Added AUTH_METHOD ?= internal default var
Passed --auth-method flag to go run ./cmd in run target

README.md

Clarified authentication methods and updated cURL examples to reflect both kubeflow-userid and Authorization: Bearer headers
Explained the meaning of each mode to the developer

cmd/main.go

Added CLI flag --auth-method with validation (internal or user_token)
Hooked it into config.EnvConfig
Updated shutdown path to simplify logic
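A minimal sketch of what validating the --auth-method flag could look like with the standard `flag` package. The flag name and its two accepted values come from this PR; the helper `parseAuthMethod`, the flag set name, and the error wording are assumptions for illustration only.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
)

const (
	authMethodInternal  = "internal"
	authMethodUserToken = "user_token"
)

// parseAuthMethod parses args and rejects any value other than the two known modes.
func parseAuthMethod(args []string) (string, error) {
	fs := flag.NewFlagSet("bff", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output quiet in this sketch
	method := fs.String("auth-method", authMethodInternal, "authentication method: internal or user_token")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	switch *method {
	case authMethodInternal, authMethodUserToken:
		return *method, nil
	default:
		return "", fmt.Errorf("invalid --auth-method %q: must be %q or %q",
			*method, authMethodInternal, authMethodUserToken)
	}
}

func main() {
	method, err := parseAuthMethod(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("auth method:", method)
}
```

Failing fast at startup keeps an unknown mode from silently falling back to one of the two behaviors.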

internal/api/app.go

Replaced single client instance with a KubernetesClientFactory
Factory is instantiated depending on:
- Whether we are mocking K8s (envtest)
- Chosen auth-method
Adjusted App struct to store the factory and optionally the envtest for cleanup in mock mode
All route handlers now use dynamically retrieved clients
Replaced PerformSARon... middleware with generalized RequireAccessToService and RequireListServiceAccessInNamespace

internal/api/middleware.go

Introduced InjectRequestIdentity middleware to create RequestIdentity from headers based on auth-method
Refactored all SAR authorization logic to work off a KubernetesClientFactory + per-request identity
Clean separation of concerns for:
- AttachRESTClient — resolves and injects REST client
- RequireAccessToService — enforces permission on a named K8s Service
- RequireListServiceAccessInNamespace — enforces permission to list services in a namespace
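The identity-injection step can be sketched roughly as follows for the internal mode. This is an illustration under assumptions, not the middleware in this PR: the header names and the InjectRequestIdentity name come from the description, while the comma-separated group format, the 400 response for a missing header, and the `whoAmI` and `serve` helpers are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// RequestIdentity is trimmed to the fields the internal mode needs in this sketch.
type RequestIdentity struct {
	UserID string
	Groups []string
}

type identityKey struct{}

// InjectRequestIdentity builds the identity from headers and stores it in the request context.
func InjectRequestIdentity(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		userID := r.Header.Get("kubeflow-userid")
		if userID == "" {
			http.Error(w, "missing kubeflow-userid header", http.StatusBadRequest)
			return
		}
		identity := &RequestIdentity{UserID: userID}
		// Assumption: groups arrive as a comma-separated list.
		for _, g := range strings.Split(r.Header.Get("kubeflow-groups"), ",") {
			if g = strings.TrimSpace(g); g != "" {
				identity.Groups = append(identity.Groups, g)
			}
		}
		ctx := context.WithValue(r.Context(), identityKey{}, identity)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// whoAmI is a hypothetical downstream handler that reads the identity back.
func whoAmI(w http.ResponseWriter, r *http.Request) {
	identity, ok := r.Context().Value(identityKey{}).(*RequestIdentity)
	if !ok {
		http.Error(w, "identity not set", http.StatusInternalServerError)
		return
	}
	fmt.Fprintf(w, "%s %v", identity.UserID, identity.Groups)
}

// serve pushes one in-memory request through the chain and returns status and body.
func serve(headers map[string]string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/api/v1/model_registry?namespace=kubeflow", nil)
	for name, value := range headers {
		req.Header.Set(name, value)
	}
	rec := httptest.NewRecorder()
	InjectRequestIdentity(http.HandlerFunc(whoAmI)).ServeHTTP(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := serve(map[string]string{"kubeflow-userid": "alice", "kubeflow-groups": "dev, ops"})
	fmt.Println(code, body)
	code, _ = serve(nil)
	fmt.Println(code)
}
```

Later middlewares such as RequireAccessToService can then read the identity from the context instead of re-parsing headers.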

internal/api/errors.go

Updated error type references to use mrserver.HTTPError (instead of integrations)
Centralized error handling and serialization

internal/api/*.go (handler files)

Updated handlers to use new interfaces: mrserver.HTTPClientInterface
Replaced old identity extraction logic with context-based RequestIdentity
Cleaned up duplicate logic for user/group extraction

internal/api/*_test.go

Updated all tests to:
- Use kubernetesMockedStaticClientFactory instead of raw client
- Provide a RequestIdentity struct directly
- Maintain correctness for both valid and forbidden paths

internal/integrations/kubernetes

Introduced KubernetesClientFactory interface with two implementations:
- StaticClientFactory (our old client) — for internal auth (shared client, impersonation support)
- TokenClientFactory — creates new clients per token
Separated InternalKubernetesClient (our old client) and TokenKubernetesClient logic for SAR vs. SelfSAR

internal/integrations/kubernetes/k8mocks

SetupEnvTest() now supports both client modes (internal & token)
Adds ability to simulate SSAR and SAR scenarios for testing

How Has This Been Tested?

This is how I tested it. My suggestion for anyone reviewing this is to test it carefully, as it touches a crucial part of our BFF.

AuthMethodInternal = "internal" (default)

First, I made sure that I didn't break anything in the default mode, AuthMethodInternal = "internal".

Local mocked development

make run MOCK_K8S_CLIENT=true MOCK_MR_CLIENT=true
✅ curl -i -H "kubeflow-userid: [email protected]" "localhost:4000/api/v1/model_registry?namespace=kubeflow"
❌ 403: curl -i -H "kubeflow-userid: [email protected]" "localhost:4000/api/v1/model_registry?namespace=kubeflow"
✅ curl -i -H "kubeflow-userid: [email protected]" "localhost:4000/api/v1/model_registry?namespace=dora-namespace"

Run the front-end and do a quick sanity check
i.e. [email protected] should be able to see all namespaces (cluster admin on env test)
On a Kubeflow installation, change the deployment to use image: quay.io/ederignatowicz/model-registry-ui-auth:latest. I've built this image to test it. You will see a "Starting Model Registry" line in the logs.

A word of warning: if you have an old installation, make sure to update the healthcheck path to api/healthcheck instead of api/v1/healthcheck.

Do a quick sanity check there just to make sure I didn't break anything! :)

AuthMethodInternal = "user_token"

Let's try the new auth mode.

Local mocked development

make run MOCK_K8S_CLIENT=true MOCK_MR_CLIENT=true AUTH_METHOD=user_token
❌ curl -i -H "kubeflow-userid: [email protected]" "localhost:4000/api/v1/model_registry?namespace=kubeflow"
❌ curl -i -H "Authorization: Bearer $TOKEN" "localhost:4000/api/v1/model_registry?namespace=kubeflow"
The token for [email protected] is FAKE_CLUSTER_ADMIN_TOKEN
✅ curl -i -H "Authorization: Bearer FAKE_CLUSTER_ADMIN_TOKEN" "localhost:4000/api/v1/model_registry?namespace=kubeflow"
FAKE_DORA_TOKEN is the token for [email protected], who has no access to the kubeflow namespace
❌ curl -i -H "Authorization: Bearer FAKE_DORA_TOKEN" "localhost:4000/api/v1/model_registry?namespace=kubeflow"
✅ curl -i -H "Authorization: Bearer FAKE_DORA_TOKEN" "localhost:4000/api/v1/model_registry?namespace=dora-namespace"

BFF Connected to Kubeflow Cluster
This will work only on the BFF, because our front end doesn't support tokens yet. STANDALONE_MODE=true enables the namespaces endpoint.

make run MOCK_K8S_CLIENT=false MOCK_MR_CLIENT=true AUTH_METHOD=user_token STANDALONE_MODE=true

1-) Create namespaces

kubectl create ns ns-dora
kubectl create ns ns-bella

2-) Create SAs

kubectl create sa admin-dora -n ns-dora
kubectl create sa limited-bella -n ns-bella

3-) Create roles

# rbac/ns-access.yaml
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: ns-reader
  namespace: ns-dora
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get", "list"]

# rbac/service-access.yaml
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: service-reader
  namespace: ns-dora
rules:
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list"]

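Do the same for namespace ns-bella (set metadata.namespace to ns-bella before applying there; kubectl rejects a manifest whose namespace does not match the -n flag).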

4-) Apply them to the right namespaces:

kubectl apply -f ns-access.yaml -n ns-dora
kubectl apply -f ns-access.yaml -n ns-bella
kubectl apply -f service-access.yaml -n ns-bella
kubectl apply -f service-access.yaml -n ns-dora

5-) Create RoleBindings
Bind admin-dora to both namespaces

kubectl create rolebinding admin-ns-access1 \
  --role=ns-reader \
  --serviceaccount=ns-dora:admin-dora \
  -n ns-dora

kubectl create rolebinding admin-svc-access1 \
  --role=service-reader \
  --serviceaccount=ns-dora:admin-dora \
  -n ns-dora 

kubectl create rolebinding admin-ns-access-user1 \
  --role=ns-reader \
  --serviceaccount=ns-dora:admin-dora \
  -n ns-bella 
kubectl create rolebinding admin-svc-access-user1 \
  --role=service-reader \
  --serviceaccount=ns-dora:admin-dora \
  -n ns-bella

Bind limited-bella to only ns-bella

kubectl create rolebinding limited-ns-access-user2 \
  --role=ns-reader \
  --serviceaccount=ns-bella:limited-bella \
  -n ns-bella 
kubectl create rolebinding limited-svc-access-user2 \
  --role=service-reader \
  --serviceaccount=ns-bella:limited-bella \
  -n ns-bella

5-) Create Model Registry Services

kind: Service
apiVersion: v1
metadata:
  labels:
    app: model-registry-service
    app.kubernetes.io/component: model-registry
    app.kubernetes.io/instance: model-registry-service
    app.kubernetes.io/name: model-registry-service
    app.kubernetes.io/part-of: model-registry
    component: model-registry
  annotations:
    displayName: Kubeflow Model Registry
    description: An example model registry
  name: bella-user-registry
spec:
  selector:
    component: model-registry-server
  type: ClusterIP
  ports:
  - port: 8080
    protocol: TCP
    appProtocol: http
    name: http-api
  - port: 9090
    protocol: TCP
    appProtocol: grpc
    name: grpc-api

kubectl apply -f mr-service.yaml -n ns-dora

Change the namespace and apply the same manifest to ns-bella.

6-) Get Tokens

kubectl create token admin-dora -n ns-dora --duration=24h > /tmp/admin-dora.token
ADMIN_DORA=$(cat /tmp/admin-dora.token)
echo $ADMIN_DORA

✅ curl -i -H "Authorization: Bearer $ADMIN_DORA" "localhost:4000/api/v1/model_registry?namespace=ns-dora"
✅ curl -i -H "Authorization: Bearer $ADMIN_DORA" "localhost:4000/api/v1/model_registry?namespace=ns-bella"
❌ curl -i -H "Authorization: Bearer $ADMIN_DORA" "localhost:4000/api/v1/model_registry?namespace=default"

kubectl create token limited-bella -n ns-bella --duration=24h > /tmp/limited-bella.token
BELLA=$(cat /tmp/limited-bella.token)
echo $BELLA

❌ curl -i -H "Authorization: Bearer $BELLA" "localhost:4000/api/v1/model_registry?namespace=default"
❌ curl -i -H "Authorization: Bearer $BELLA" "localhost:4000/api/v1/model_registry?namespace=ns-dora"
✅ curl -i -H "Authorization: Bearer $BELLA" "localhost:4000/api/v1/model_registry?namespace=ns-bella"

7-) Custom Headers

make run MOCK_K8S_CLIENT=false MOCK_MR_CLIENT=true AUTH_METHOD=user_token STANDALONE_MODE=true  AUTH_TOKEN_HEADER=X-Forwarded-Access-Token AUTH_TOKEN_PREFIX=""
❌ curl -i -H "Authorization: Bearer $BELLA" "localhost:4000/api/v1/model_registry?namespace=ns-bella"
✅ curl -i -H "X-Forwarded-Access-Token: $BELLA" "localhost:4000/api/v1/model_registry?namespace=ns-bella"

Merge criteria:

All the commits have been signed-off (To pass the DCO check)

The commits have meaningful messages; the author will squash them after approval or in case of manual merges will ask to merge with squash.
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that the changes work.
Code changes follow the kubeflow contribution guidelines.
For first time contributors: Please reach out to the Reviewers to ensure all tests are being run, ensuring the label ok-to-test has been added to the PR.

If you have UI changes

The developer has added tests or explained why testing cannot be added.
Included any necessary screenshots or gifs if it was a UI change.
Verify that UI/UX changes conform the UX guidelines for Kubeflow.

ederign · 2025-03-29T18:56:44Z

/assign @alexcreasy

rareddy · 2025-03-29T21:17:04Z

@dhirajsb pls review

christianvogt · 2025-03-31T13:35:12Z

clients/ui/bff/internal/integrations/kubernetes/token_k8s_client.go

+	ctx, cancel := context.WithTimeout(ctx, 30*time.Second)
+	defer cancel()
+
+	for _, verb := range []string{"get", "list"} {


This is copied from the old implementation, but I don't think we need both get and list when no name attribute is supplied to a SSAR.

Christian, I've added it to my list to double-check with the MR team, but I'm almost sure that they explicitly asked for this.

christianvogt · 2025-03-31T13:37:34Z

clients/ui/bff/internal/integrations/kubernetes/token_k8s_client.go

+	if err != nil {
+		kc.Logger.Warn("user is not allowed to list namespaces or failed to list namespaces")
+		return []corev1.Namespace{}, nil
+	}


log error instead? Also, do include the error in the log to help with debugging.

Why does this function suppress the error instead of sending it back to the caller?
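One way to read the suggestion above, sketched under assumptions: log at error level with the underlying error attached, and hand a wrapped error back to the caller instead of an empty list. The `namespaceLister` type, `newLogger`, and `errForbidden` are hypothetical stand-ins so the example runs without a cluster; they are not part of the PR.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
	"os"
)

// namespaceLister is a hypothetical seam standing in for the Kubernetes API call.
type namespaceLister func() ([]string, error)

// errForbidden simulates the API server denying the list request.
var errForbidden = errors.New("namespaces is forbidden")

func newLogger() *slog.Logger {
	return slog.New(slog.NewTextHandler(os.Stderr, nil))
}

// listNamespaces logs the failure with its cause and propagates it to the caller.
func listNamespaces(logger *slog.Logger, list namespaceLister) ([]string, error) {
	namespaces, err := list()
	if err != nil {
		logger.Error("failed to list namespaces", "error", err)
		return nil, fmt.Errorf("listing namespaces: %w", err)
	}
	return namespaces, nil
}

func main() {
	_, err := listNamespaces(newLogger(), func() ([]string, error) { return nil, errForbidden })
	// The wrapped error still matches the original cause.
	fmt.Println(errors.Is(err, errForbidden))
}
```

With the error returned, the caller can decide whether a denied list maps to an empty result or to an error response.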

christianvogt · 2025-03-31T13:45:27Z

clients/ui/bff/internal/api/middleware.go

-		}
+		var identity *kubernetes.RequestIdentity
+
+		switch app.config.AuthMethod {


I don't particularly like seeing a switch on app.config.AuthMethod (here or elsewhere) aside from creating a single client. It defeats the purpose of abstracting the implementation details of the individual clients to have different behaviors.

christianvogt · 2025-03-31T13:53:22Z

clients/ui/bff/internal/integrations/kubernetes/shared_k8s_client.go

+	return services, nil
+}
+
+func (kc *SharedClientLogic) GetServiceDetailsByName(sessionCtx context.Context, namespace string, serviceName string) (ServiceDetails, error) {


It is inefficient to fetch all services only to find one by name, instead of fetching that one by name in the first place.

christianvogt · 2025-03-31T13:57:57Z

clients/ui/bff/internal/integrations/kubernetes/types.go

+type RequestIdentity struct {
+	UserID string
+	Groups []string
+	Token  string
+}


Ideally each auth method should have its own type to avoid having to mix every property together. But I can accept keeping these properties together, since they all relate to identity.

We discussed this in a call and agreed that this type will be reused on MR API calls.

ederign · 2025-03-31T19:53:42Z

@christianvogt thank you so much for the review, I believe I've addressed all your points.

clients/ui/bff/README.md

christianvogt · 2025-04-02T19:26:16Z

Tested the PR as per the curl commands in the description.
Also successfully ran the UI & BFF locally in conjunction with the central dashboard.

/lgtm

google-oss-prow · 2025-04-02T19:26:20Z

@christianvogt: changing LGTM is restricted to collaborators


…_token)
- Introduce auth-method flag (default: internal) to select auth strategy
- internal: uses pod service account (in-cluster) or current kubeconfig (local)
- user_token: uses bearer token from X-Forwarded-Access-Token header
- Implement separate SAR (internal) and SSAR (user_token) logic
- Clarify behavior and usage in README

Signed-off-by: Eder Ignatowicz <[email protected]>

Signed-off-by: Eder Ignatowicz <[email protected]>

… reuse methods) Signed-off-by: Eder Ignatowicz <[email protected]>

lucferbux

There are a couple of things I want to discuss first. I don't fully understand the PR, so I'm moving the conversation to other channels.

lucferbux · 2025-04-11T12:23:22Z

clients/ui/bff/internal/models/user.go

@@ -1,6 +1,5 @@
 package models

 type User struct {
-	UserID       string `json:"userId"`
-	ClusterAdmin bool   `json:"clusterAdmin"`
+	UserID string `json:"userId"`


Suggested change

-	UserID string `json:"userId"`
+	UserID string `json:"userId"`
+	ClusterAdmin bool `json:"clusterAdmin"`

@lucferbux fixed on 1a27427

lucferbux · 2025-04-11T12:32:57Z

clients/ui/bff/internal/repositories/user.go


+	if formattedUser == "" {


We should still get both the ClusterAdmin check and the username

Fixed on 1a27427

lucferbux · 2025-04-11T12:34:39Z

clients/ui/bff/internal/repositories/user.go

+	if formattedUser == "" {
+		//if we are using token based auth, we still need to implement how to
+		//safely get the user from the token
+		formattedUser = "unknown"


You can get the user from a token with kubectl auth whoami, this calls /apis/authentication.k8s.io/v1/selfsubjectreviews, which returns values like:

ATTRIBUTE                                           VALUE
Username                                            kubernetes-admin
Groups                                              [kubeadm:cluster-admins system:authenticated]
Extra: authentication.kubernetes.io/credential-id   [X509SHA256=230423670e4531f1d6b8b5a8a9680954b3bb95b35353019a1944b29d5ad03148]

Thanks for the pointers, Lucas! I fixed it in 1a27427

Yes, I can confirm it is working now.

…ract user name in token case Signed-off-by: Eder Ignatowicz <[email protected]>

lucferbux · 2025-04-16T09:53:01Z

Just commenting here: @alexcreasy @christianvogt, if @ederign changes the token to Bearer ..., it's OK for me. I'll be out the rest of the week. The main issue is already fixed, so if you guys can take a look, I'm happy to approve it.

- Introduced `AuthTokenHeader` and `AuthTokenPrefix` fields to EnvConfig
- Default token extraction uses `Authorization` header with `Bearer ` prefix
- Updated TokenClientFactory to dynamically parse token using configured header and prefix
- Added new CLI flags: `--auth-token-header` and `--auth-token-prefix`
- Updated Makefile to support overriding header and prefix via `AUTH_TOKEN_HEADER` and `AUTH_TOKEN_PREFIX`
- Improved error messages and testability of token parsing logic
- Added Ginkgo unit tests for TokenClientFactory.ExtractRequestIdentity with and without prefixes
- Cleaned up README:
  - Replaced all `X-Forwarded-Access-Token` references with `Authorization: Bearer`
  - Documented how to override token header and prefix via CLI, env, or Makefile

Signed-off-by: Eder Ignatowicz <[email protected]>
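The configurable header and prefix described in this commit could be parsed roughly like this. It is a sketch under assumptions rather than the PR's TokenClientFactory.ExtractRequestIdentity: the function `extractToken`, the helper `headerWith`, and the error wording are hypothetical, and only the header names and the `Bearer ` default come from the thread.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// extractToken reads the configured header and strips the configured prefix.
// An empty prefix means the header value is the token itself.
func extractToken(h http.Header, header, prefix string) (string, error) {
	raw := h.Get(header)
	if raw == "" {
		return "", fmt.Errorf("missing required header %q", header)
	}
	if prefix != "" {
		if !strings.HasPrefix(raw, prefix) {
			return "", fmt.Errorf("header %q must start with %q", header, prefix)
		}
		raw = strings.TrimPrefix(raw, prefix)
	}
	token := strings.TrimSpace(raw)
	if token == "" {
		return "", errors.New("token is empty")
	}
	return token, nil
}

// headerWith is a small helper that builds a header set with a single entry.
func headerWith(name, value string) http.Header {
	h := http.Header{}
	h.Set(name, value)
	return h
}

func main() {
	// Default configuration: Authorization header with "Bearer " prefix.
	token, err := extractToken(headerWith("Authorization", "Bearer FAKE_DORA_TOKEN"), "Authorization", "Bearer ")
	fmt.Println(token, err)

	// Custom configuration: proxy-style header with no prefix.
	token, err = extractToken(headerWith("X-Forwarded-Access-Token", "FAKE_DORA_TOKEN"), "X-Forwarded-Access-Token", "")
	fmt.Println(token, err)

	// A malformed prefix such as "Bearer:" is rejected.
	_, err = extractToken(headerWith("Authorization", "Bearer: FAKE_DORA_TOKEN"), "Authorization", "Bearer ")
	fmt.Println(err)
}
```

Keeping the parsing in one pure function is what makes it easy to unit test with and without a prefix.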

christianvogt · 2025-04-16T19:58:08Z

Re-tested following the curl commands successfully.
/lgtm

google-oss-prow · 2025-04-16T19:58:12Z

@christianvogt: changing LGTM is restricted to collaborators


alexcreasy

I've just identified one small change, otherwise this looks all good!

alexcreasy · 2025-04-17T17:13:50Z

clients/ui/bff/internal/api/middleware.go

 		}

-		allowed, err := app.kubernetesClient.PerformSARonSpecificService(user, userGroups, namespace, modelRegistryID)
+		allowed, err := client.CanAccessServiceInNamespace(r.Context(), identity, namespace, serviceName)
+
 		if err != nil {
 			app.forbiddenResponse(w, r, "failed to perform SAR: %v")


It's good security practice not to return any additional information to the user in an authentication response, as you can leak clues to the internal architecture of the system, which could lead to CWEs like a response discrepancy.

It's probably a good idea to alter the forbiddenResponse function to not write the error message to the http response and just log it, whilst returning simply 403 forbidden.
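A minimal sketch of that suggestion, assuming a standalone helper rather than the BFF's actual `app.forbiddenResponse` method (whose real signature and response envelope are not shown in this thread): the cause is logged on the server, and the client only ever sees a generic 403. The `demo` helper and the plain-text body are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
	"net/http"
	"net/http/httptest"
	"os"
)

// forbiddenResponse logs the cause server-side and returns a generic 403 to the client.
func forbiddenResponse(logger *slog.Logger, w http.ResponseWriter, r *http.Request, cause error) {
	logger.Warn("access denied", "method", r.Method, "path", r.URL.Path, "error", cause)
	http.Error(w, http.StatusText(http.StatusForbidden), http.StatusForbidden)
}

// demo runs the helper against an in-memory request and returns what a client would see.
func demo() (int, string) {
	logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
	req := httptest.NewRequest(http.MethodGet, "/api/v1/model_registry", nil)
	rec := httptest.NewRecorder()
	forbiddenResponse(logger, rec, req, errors.New("failed to perform SAR: connection refused"))
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := demo()
	fmt.Printf("%d %q\n", code, body)
}
```

The response is identical whether the access review failed or denied access, so it reveals nothing about which case occurred.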

I'll fix in a FUP PR!

Thank you for the review!

alexcreasy · 2025-04-17T17:36:29Z

/lgtm

alexcreasy · 2025-04-17T17:36:39Z

/approve

google-oss-prow · 2025-04-17T17:36:45Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alexcreasy

Needs approval from an approver in each of these files:

~~clients/ui/OWNERS~~ [alexcreasy]


Signed-off-by: Eder Ignatowicz <[email protected]>

google-oss-prow bot requested review from Al-Pragliola and andreyvelich March 29, 2025 18:45

github-actions bot added the Area/UI label Mar 29, 2025

google-oss-prow bot added the size/XXL label Mar 29, 2025

ederign force-pushed the auth-token branch from 7a58e71 to f3554c9 Compare March 29, 2025 18:55

google-oss-prow bot assigned alexcreasy Mar 29, 2025

ederign force-pushed the auth-token branch from f3554c9 to 571d761 Compare March 29, 2025 19:02

christianvogt reviewed Mar 31, 2025

ederign force-pushed the auth-token branch 2 times, most recently from 80e4980 to 0e6efb1 Compare March 31, 2025 20:25

christianvogt reviewed Apr 2, 2025

clients/ui/bff/README.md (thread resolved)

ederign added 4 commits April 11, 2025 07:19

feat(bff): looks like we don't need to override the certs

c99f7cc

Signed-off-by: Eder Ignatowicz <[email protected]>

Fixes after code review by Christian

da5e981

Signed-off-by: Eder Ignatowicz <[email protected]>

fixing mock factory creation (removing real client creation but still…

c74a659

… reuse methods) Signed-off-by: Eder Ignatowicz <[email protected]>

ederign force-pushed the auth-token branch from 38a8686 to c74a659 Compare April 11, 2025 11:20

lucferbux suggested changes Apr 11, 2025

Adding back clusterAdmin to user, implement it to token, and also ext…

1a27427

…ract user name in token case Signed-off-by: Eder Ignatowicz <[email protected]>

ederign force-pushed the auth-token branch from 4ab1ba2 to 6eb73ac Compare April 16, 2025 14:07

alexcreasy suggested changes Apr 17, 2025

alexcreasy reviewed Apr 17, 2025

google-oss-prow bot added the lgtm label Apr 17, 2025

google-oss-prow bot added the approved label Apr 17, 2025

google-oss-prow bot merged commit 340947a into kubeflow:main Apr 17, 2025
18 checks passed

Al-Pragliola mentioned this pull request Apr 18, 2025

periodic sync upstream KF to midstream ODH opendatahub-io/model-registry#192

Merged

ederign added a commit to ederign/model-registry that referenced this pull request May 7, 2025

chore(bff): cleanup as a fup to kubeflow#918

9074331

Signed-off-by: Eder Ignatowicz <[email protected]>

ederign added a commit to ederign/model-registry that referenced this pull request May 7, 2025

chore(bff): cleanup as a fup to kubeflow#918

fb0ee29

Signed-off-by: Eder Ignatowicz <[email protected]>

ederign mentioned this pull request May 7, 2025

chore(bff): cleanup as a fup to #918 #1073

Merged

4 tasks

google-oss-prow bot pushed a commit that referenced this pull request May 7, 2025

chore(bff): cleanup as a fup to #918 (#1073)

565f49e

Signed-off-by: Eder Ignatowicz <[email protected]>
