Skip to content
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -11,4 +11,8 @@ configMapGenerator:
- ISTIO_INGRESS_GATEWAY_PRINCIPAL="cluster.local/ns/istio-system/sa/istio-ingressgateway-service-account"
- NOTEBOOK_CONTROLLER_PRINCIPAL="cluster.local/ns/kubeflow/sa/notebook-controller-service-account"
- KFP_UI_PRINCIPAL="cluster.local/ns/kubeflow/sa/ml-pipeline-ui"
- SERVICE_MESH_MODE="istio-sidecar"
- WAYPOINT_NAME="waypoint"
- WAYPOINT_NAMESPACE=""
- CREATE_WAYPOINT="false"
name: config
10 changes: 6 additions & 4 deletions components/profile-controller/config/manager/manager.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -27,10 +27,12 @@ spec:
- $(WORKLOAD_IDENTITY)
- "-service-mesh-mode"
- $(SERVICE_MESH_MODE)
- "-gateway-name"
- $(GATEWAY_NAME)
- "-gateway-namespace"
- $(GATEWAY_NAMESPACE)
- "-waypoint-name"
- $(WAYPOINT_NAME)
- "-waypoint-namespace"
- $(WAYPOINT_NAMESPACE)
- "-create-waypoint"
- $(CREATE_WAYPOINT)
envFrom:
- configMapRef:
name: config
Expand Down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I 'm not too familiar with overlays but I think we should be able split the kubeflow to two flavours in order to avoid manifests duplication and only change what's needed in sidecar and ambient.

Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,8 @@ spec:
- $(USERID_HEADER)
- "-userid-prefix"
- $(USERID_PREFIX)
- "-service-mesh-mode"
- $(SERVICE_MESH_MODE)
envFrom:
- configMapRef:
name: config
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9,8 +9,10 @@ spec:
- name: manager
env:
- name: SERVICE_MESH_MODE
value: ambient
- name: GATEWAY_NAME
value: kubeflow-gateway
- name: GATEWAY_NAMESPACE
value: istio-system
value: istio-ambient
- name: WAYPOINT_NAME
value: waypoint
- name: WAYPOINT_NAMESPACE
value: ""
- name: CREATE_WAYPOINT
value: "true"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it defaults to true while here to false. LEt's keep the default to false everywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed ambient patch to consistently use "false"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need those here since we have the envFrom in the deployment and the same envvars in the configmap, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kept only SERVICE_MESH_MODE=istio-ambient override, removed redundant WAYPOINT_NAME, WAYPOINT_NAMESPACE, and CREATE_WAYPOINT

223 changes: 196 additions & 27 deletions components/profile-controller/controllers/profile_controller.go
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@madmecodes was taking a look and trying out the PR, and I think I see an issue.

Currently the Profile Controller doesn't modify at all the AuthorizationPolicy ns-owner-access-istio. This is problematic as it then doesn't get applied to the waypoint proxy (for the L7 traffic). The status also in the policy complains about the http rules.

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  annotations:
    role: admin
    user: [email protected]
  creationTimestamp: "2025-12-04T15:14:16Z"
  generation: 1
  name: ns-owner-access-istio
  namespace: profilename
  ownerReferences:
  - apiVersion: kubeflow.org/v1
    blockOwnerDeletion: true
    controller: true
    kind: Profile
    name: profilename
    uid: 1ea57e39-a414-4491-a9fb-91741b2467ed
  resourceVersion: "8791241"
  uid: 1b3e2334-cc95-4da9-b093-dc81522e0c54
spec:
  rules:
  - from:
    - source:
        principals:
        - cluster.local/ns/kubeflow/sa/istio-ingress-k8s-istio
        - cluster.local/ns/kubeflow/sa/kfp-ui
    when:
    - key: request.headers[kubeflow-userid]
      values:
      - [email protected]
  - when:
    - key: source.namespace
      values:
      - profilename
  - to:
    - operation:
        paths:
        - /healthz
        - /metrics
        - /wait-for-drain
  - from:
    - source:
        principals:
        - cluster.local/ns/kubeflow/sa/jupyter-controller
    to:
    - operation:
        methods:
        - GET
        paths:
        - '*/api/kernels'
status:
  conditions:
  - lastTransitionTime: "2025-12-04T15:14:16.091234105Z"
    message: 'ztunnel does not support HTTP attributes (found: methods, paths, request.headers[kubeflow-userid]).
      In ambient mode you must use a waypoint proxy to enforce HTTP rules. Within
      an ALLOW policy, rules matching HTTP attributes are omitted. This will be more
      restrictive than requested.'
    observedGeneration: "1"
    reason: UnsupportedValue
    status: "True"
    type: ZtunnelAccepted

We should instead add a targetRefs for the AuthorizationPolicy to target the waypoint proxy instead, if it's on ambient.

Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@ import (
"sigs.k8s.io/controller-runtime/pkg/handler"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
"sigs.k8s.io/controller-runtime/pkg/source"
gatewayv1beta1 "sigs.k8s.io/gateway-api/apis/v1beta1"
)

const AUTHZPOLICYISTIO = "ns-owner-access-istio"
Expand Down Expand Up @@ -92,8 +93,9 @@ type ProfileReconciler struct {
WorkloadIdentity string
DefaultNamespaceLabelsPath string
ServiceMeshMode string
GatewayName string
GatewayNamespace string
WaypointName string
WaypointNamespace string
CreateWaypoint bool
}

// +kubebuilder:rbac:groups=core,resources=namespaces,verbs="*"
Expand Down Expand Up @@ -134,16 +136,9 @@ func (r *ProfileReconciler) Reconcile(ctx context.Context, request ctrl.Request)
Name: instance.Name,
},
}

// Set istio-injection label based on service mesh mode
if r.ServiceMeshMode == "ambient" {
// In ambient mode, disable sidecar injection but enable ambient mesh
ns.Labels[istioInjectionLabel] = "disabled"
ns.Labels["istio.io/dataplane-mode"] = "ambient"
} else {
// In sidecar mode (default), inject istio sidecar to all pods in target namespace
ns.Labels[istioInjectionLabel] = "enabled"
}

// Set service mesh labels based on mode
r.setServiceMeshLabels(ns, instance)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a follow up to #127 (comment), can't this and setNamespaceLabels be one call? I don't see the benefit in doing two calls in the code to set namespace labels.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created setNamespaceLabelsAndServiceMesh() that handles both default labels and service mesh labels in one call

setNamespaceLabels(ns, defaultKubeflowNamespaceLabels)
logger.Info("List of labels to be added to namespace", "labels", ns.Labels)
if err := controllerutil.SetControllerReference(instance, ns, r.Scheme); err != nil {
Expand Down Expand Up @@ -188,19 +183,10 @@ func (r *ProfileReconciler) Reconcile(ctx context.Context, request ctrl.Request)
for k, v := range foundNs.Labels {
oldLabels[k] = v
}

// Apply service mesh mode labels to existing namespace
if r.ServiceMeshMode == "ambient" {
// In ambient mode, disable sidecar injection but enable ambient mesh
foundNs.Labels[istioInjectionLabel] = "disabled"
foundNs.Labels["istio.io/dataplane-mode"] = "ambient"
} else {
// In sidecar mode (default), inject istio sidecar to all pods in target namespace
foundNs.Labels[istioInjectionLabel] = "enabled"
// Remove ambient mode label if it exists
delete(foundNs.Labels, "istio.io/dataplane-mode")
}

r.setServiceMeshLabels(foundNs, instance)

setNamespaceLabels(foundNs, defaultKubeflowNamespaceLabels)
logger.Info("List of labels to be added to found namespace", "labels", foundNs.Labels)
if !reflect.DeepEqual(oldLabels, foundNs.Labels) {
Expand Down Expand Up @@ -228,6 +214,23 @@ func (r *ProfileReconciler) Reconcile(ctx context.Context, request ctrl.Request)
return reconcile.Result{}, err
}

// Create waypoint and L4 AuthorizationPolicy in ambient mode
if r.ServiceMeshMode == "istio-ambient" {
if r.CreateWaypoint {
if err = r.createWaypoint(instance); err != nil {
logger.Error(err, "error creating waypoint", "namespace", instance.Name)
IncRequestErrorCounter("error creating waypoint", SEVERITY_MAJOR)
return reconcile.Result{}, err
}
}

if err = r.updateL4AuthorizationPolicy(instance); err != nil {
logger.Error(err, "error updating L4 AuthorizationPolicy", "namespace", instance.Name)
IncRequestErrorCounter("error updating L4 AuthorizationPolicy", SEVERITY_MAJOR)
return reconcile.Result{}, err
}
}

// Update service accounts
// Create service account "default-editor" in target namespace.
// "default-editor" would have kubeflowEdit permission: edit all resources in target namespace except rbac.
Expand Down Expand Up @@ -439,7 +442,7 @@ func (r *ProfileReconciler) SetupWithManager(mgr ctrl.Manager) error {
return nil
}

func (r *ProfileReconciler) getAuthorizationPolicy(profileIns *profilev1.Profile) istioSecurity.AuthorizationPolicy {
func (r *ProfileReconciler) getAuthorizationPolicy(profileIns *profilev1.Profile) *istioSecurity.AuthorizationPolicy {
nbControllerPrincipal := GetEnvDefault(
"NOTEBOOK_CONTROLLER_PRINCIPAL",
"cluster.local/ns/kubeflow/sa/notebook-controller-service-account")
Expand All @@ -452,7 +455,7 @@ func (r *ProfileReconciler) getAuthorizationPolicy(profileIns *profilev1.Profile
"KFP_UI_PRINCIPAL",
"cluster.local/ns/kubeflow/sa/ml-pipeline-ui")

return istioSecurity.AuthorizationPolicy{
policy := istioSecurity.AuthorizationPolicy{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted to direct return since no policy modifications were actually needed

Action: istioSecurity.AuthorizationPolicy_ALLOW,
// Empty selector == match all workloads in namespace
Selector: nil,
Expand Down Expand Up @@ -524,6 +527,15 @@ func (r *ProfileReconciler) getAuthorizationPolicy(profileIns *profilev1.Profile
},
},
}

// In ambient mode, we still use selector but target the waypoint workload
// TODO: Once Istio supports targetRef in AuthorizationPolicy, update this
if r.ServiceMeshMode == "istio-ambient" {
// For now, keep the selector-based approach for ambient mode
// The waypoint will be created separately and policies will apply to namespace workloads
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Istio in ambient mode already supports the use of targetRef. That's how it can be bound to all traffic passing through the whole namespace https://github.com/orfeas-k/kubeflow-istio-ambient-demo/blob/main/use-case-c-programmatic-access-using-k8s-token/ns-owner-access-istio-ap-l7-waypoint.yaml#L38-L41.

}

return &policy
}

// updateIstioAuthorizationPolicy create or update Istio AuthorizationPolicy
Expand All @@ -538,7 +550,7 @@ func (r *ProfileReconciler) updateIstioAuthorizationPolicy(profileIns *profilev1
Name: AUTHZPOLICYISTIO,
Namespace: profileIns.Name,
},
Spec: r.getAuthorizationPolicy(profileIns),
Spec: *r.getAuthorizationPolicy(profileIns),
}

if err := controllerutil.SetControllerReference(profileIns, istioAuth, r.Scheme); err != nil {
Expand Down Expand Up @@ -774,6 +786,35 @@ func removeString(slice []string, s string) (result []string) {
return
}

// setServiceMeshLabels sets the appropriate service mesh labels based on the mode
func (r *ProfileReconciler) setServiceMeshLabels(ns *corev1.Namespace, profileIns *profilev1.Profile) {
if ns.Labels == nil {
ns.Labels = make(map[string]string)
}

if r.ServiceMeshMode == "istio-ambient" {
// In ambient mode, disable sidecar injection but enable ambient mesh
ns.Labels[istioInjectionLabel] = "disabled"
ns.Labels["istio.io/dataplane-mode"] = "ambient"
// Add waypoint labels for ambient mode
waypointNamespace := r.WaypointNamespace
if waypointNamespace == "" {
waypointNamespace = profileIns.Name
}
ns.Labels["istio.io/use-waypoint"] = r.WaypointName
ns.Labels["istio.io/use-waypoint-namespace"] = waypointNamespace
ns.Labels["istio.io/ingress-use-waypoint"] = "true"
} else {
// In sidecar mode (default), inject istio sidecar to all pods in target namespace
ns.Labels[istioInjectionLabel] = "enabled"
// Remove ambient mode labels if they exist
delete(ns.Labels, "istio.io/dataplane-mode")
delete(ns.Labels, "istio.io/use-waypoint")
delete(ns.Labels, "istio.io/use-waypoint-namespace")
delete(ns.Labels, "istio.io/ingress-use-waypoint")
Comment on lines +816 to +819
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, kudos!

}
}

func setNamespaceLabels(ns *corev1.Namespace, newLabels map[string]string) {
if ns.Labels == nil {
ns.Labels = make(map[string]string)
Expand Down Expand Up @@ -812,6 +853,134 @@ func (r *ProfileReconciler) readDefaultLabelsFromFile(path string) map[string]st
return labels
}

// createWaypoint creates a waypoint proxy in the profile namespace for ambient mode
func (r *ProfileReconciler) createWaypoint(profileIns *profilev1.Profile) error {
logger := r.Log.WithValues("profile", profileIns.Name)

waypointNamespace := r.WaypointNamespace
if waypointNamespace == "" {
waypointNamespace = profileIns.Name
}

// Create waypoint using Gateway API with waypoint gateway class
// This creates an Istio waypoint proxy that handles L7 policies in ambient mode
gatewayClassName := "istio-waypoint"

waypoint := &gatewayv1beta1.Gateway{
ObjectMeta: metav1.ObjectMeta{
Name: r.WaypointName,
Namespace: waypointNamespace,
Labels: map[string]string{
"gateway.istio.io/managed": "Istio",
},
},
Spec: gatewayv1beta1.GatewaySpec{
GatewayClassName: gatewayv1beta1.ObjectName(gatewayClassName),
Listeners: []gatewayv1beta1.Listener{
{
Name: "mesh",
Port: 15008,
Protocol: "HBONE",
},
},
},
}

if err := controllerutil.SetControllerReference(profileIns, waypoint, r.Scheme); err != nil {
return err
}

// Check if the waypoint already exists
foundWaypoint := &gatewayv1beta1.Gateway{}
err := r.Get(context.TODO(), types.NamespacedName{Name: waypoint.Name, Namespace: waypoint.Namespace}, foundWaypoint)
if err != nil {
if apierrors.IsNotFound(err) {
logger.Info("Creating waypoint", "waypoint", waypoint.Name, "namespace", waypoint.Namespace)
err = r.Create(context.TODO(), waypoint)
if err != nil {
return fmt.Errorf("failed to create waypoint: %w", err)
}
} else {
return fmt.Errorf("failed to get waypoint: %w", err)
}
} else {
// Waypoint already exists, check if update is needed
if !reflect.DeepEqual(waypoint.Spec, foundWaypoint.Spec) {
logger.Info("Updating waypoint", "waypoint", waypoint.Name, "namespace", waypoint.Namespace)
foundWaypoint.Spec = waypoint.Spec
err = r.Update(context.TODO(), foundWaypoint)
if err != nil {
return fmt.Errorf("failed to update waypoint: %w", err)
}
}
}

logger.Info("Waypoint reconciled successfully", "waypoint", r.WaypointName, "namespace", waypointNamespace)
return nil
}

// updateL4AuthorizationPolicy creates L4 AuthorizationPolicy to allow traffic from waypoint to services
func (r *ProfileReconciler) updateL4AuthorizationPolicy(profileIns *profilev1.Profile) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good job, this looks good!

logger := r.Log.WithValues("profile", profileIns.Name)

waypointNamespace := r.WaypointNamespace
if waypointNamespace == "" {
waypointNamespace = profileIns.Name
}

waypointPrincipal := fmt.Sprintf("cluster.local/ns/%s/sa/%s", waypointNamespace, r.WaypointName)

l4Policy := &istioSecurityClient.AuthorizationPolicy{
ObjectMeta: metav1.ObjectMeta{
Name: "waypoint-l4-access",
Namespace: profileIns.Name,
},
Spec: istioSecurity.AuthorizationPolicy{
Action: istioSecurity.AuthorizationPolicy_ALLOW,
Selector: nil, // Match all workloads in namespace
Rules: []*istioSecurity.Rule{
{
From: []*istioSecurity.Rule_From{
{
Source: &istioSecurity.Source{
Principals: []string{waypointPrincipal},
},
},
},
},
},
},
}

if err := controllerutil.SetControllerReference(profileIns, l4Policy, r.Scheme); err != nil {
return err
}

foundL4Policy := &istioSecurityClient.AuthorizationPolicy{}
err := r.Get(context.TODO(), types.NamespacedName{Name: l4Policy.Name, Namespace: l4Policy.Namespace}, foundL4Policy)
if err != nil {
if apierrors.IsNotFound(err) {
logger.Info("Creating L4 AuthorizationPolicy", "namespace", l4Policy.Namespace, "name", l4Policy.Name)
err = r.Create(context.TODO(), l4Policy)
if err != nil {
return err
}
} else {
return err
}
} else {
if !reflect.DeepEqual(*l4Policy.Spec.DeepCopy(), *foundL4Policy.Spec.DeepCopy()) {
foundL4Policy.Spec = *l4Policy.Spec.DeepCopy()
logger.Info("Updating L4 AuthorizationPolicy", "namespace", l4Policy.Namespace, "name", l4Policy.Name)
err = r.Update(context.TODO(), foundL4Policy)
if err != nil {
return err
}
}
}
return nil
}

func GetEnvDefault(variable string, defaultVal string) string {
envVar := os.Getenv(variable)
if len(envVar) == 0 {
Expand Down
Loading
Loading