@kimwnasptd
Member

Closes #155

@kimwnasptd
Member Author

/cc @juliusvonkohout @andyatmiami

- service-account.yaml
- service.yaml
- configmap.yaml
- network-policy.yaml
Contributor

@andyatmiami andyatmiami Oct 21, 2025

If we are starting to move network policies to within components (from kubeflow/manifests where these definitions reside today) - what should we do about the https://github.com/kubeflow/manifests/blob/master/common/networkpolicies/base/default-allow-same-namespace.yaml file?

Seems like we should also have a NetworkPolicy that allows traffic within the kubeflow namespace defined somewhere in this repo (?)

Member Author

This is a very good question!

My proposal is that every repo should own the NetworkPolicies of its components, and kubeflow/manifests should own the resources that live in the kubeflow namespace and don't target/configure a workload owned by another repo.

But this should have a dedicated issue (I'm trying to create one, but GH doesn't allow me to create it!)

So for this PR, I'd suggest we only copy the ones that are specific to components of this repo, and move the discussion about "generic" resources in kubeflow to kubeflow/manifests, as this can be generalised to other resources.

Member Author

Slightly orthogonal: we could also create a NetworkPolicy that allows only the dashboard to talk to access-management, so that we don't rely at all on the NetworkPolicy in the kubeflow namespace.

But I was thinking of not introducing new functionality beyond the existing behaviour yet, as I would treat it as a dedicated feature. If you feel strongly otherwise, let me know @andyatmiami @juliusvonkohout
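
For reference, a minimal sketch of what such a dedicated policy could look like, wrapped in a shell function so the manifest can be piped to kubectl. The policy name, the `app: access-management` and `app: dashboard` labels, and the helper name `render_access_management_policy` are all assumptions for illustration; the labels actually used by the workloads in this repo may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print a NetworkPolicy that only admits dashboard Pods
# to access-management. Names and labels below are assumptions.
render_access_management_policy() {
  local namespace="${1:-kubeflow}"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: access-management-allow-dashboard
  namespace: ${namespace}
spec:
  # Assumed label on the access-management Pods.
  podSelector:
    matchLabels:
      app: access-management
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Assumed label on the dashboard Pods; a bare podSelector only
        # matches Pods in the policy's own namespace.
        - podSelector:
            matchLabels:
              app: dashboard
EOF
}

# Usage (needs cluster access):
#   render_access_management_policy kubeflow | kubectl apply -f -
```

Because the policy selects only the access-management Pods and lists a single allowed source, it would not depend on the namespace-wide allow rule at all.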

Contributor

I am comfortable with that response - appreciate the follow up!

Contributor

yeah, for now focusing on "feature parity" I think is good ... so we can deal with adding new functionality once the general release process has been vetted (imho)

Member Author

Also created kubeflow/manifests#3261 to track the discussion about the policies

Member

"Slightly orthogonal, we could also create a NetworkPolicy that allows only the dashboard to talk to access-management, so that we don't rely at all on the NetworkPolicy in the kubeflow namespace." Yes, that should be done, and you also need to add tests to verify that the NetworkPolicy blocks other traffic.

@kimwnasptd kimwnasptd force-pushed the feat-dashboard-networkpolicy branch from 7c91ac1 to 904eff7 Compare October 22, 2025 19:57
@andyatmiami
Contributor

/ok-to-test

Comment on lines 42 to 50

# test the NetworkPolicy, by ensuring other Pods time out talking to the dashboard
OUTPUT=$(kubectl run \
netshoot-test --rm -i \
--restart=Never \
--image nicolaka/netshoot \
-- curl -sS dashboard.kubeflow.svc --connect-timeout 5 \
2>&1)
# -S keeps curl's error message visible despite -s, so grep has something to match
echo "$OUTPUT" | grep "Connection timed out after"
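
As a sketch only, the check on the captured output could be pulled into a small helper so the pass/fail condition is explicit. The function name `is_blocked_by_policy` is hypothetical. It assumes curl's error text reaches the captured output, which means running curl with `-S` or without `-s`, since `-s` on its own suppresses error messages. The matched string is the exit-code-28 message curl prints on a connect timeout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the captured output contains curl's
# connect-timeout error (exit code 28), i.e. the traffic never got through.
is_blocked_by_policy() {
  local output="$1"
  # Quote the variable so whitespace and newlines in the output survive.
  printf '%s\n' "$output" | grep -q 'curl: (28) Connection timed out after'
}

# Usage with the kind of output captured from `kubectl run ... -- curl ...`:
#   if is_blocked_by_policy "$OUTPUT"; then
#     echo "NetworkPolicy blocked the request"
#   else
#     echo "request was not blocked" >&2
#     exit 1
#   fi
```

Matching on the `(28)` code as well as the message makes it a bit harder for an unrelated "timed out" line to satisfy the check.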
Member Author

@andyatmiami I've also added a small test to ensure other Pods can't reach Access Management.

One realisation I'm having is that this test_service.sh file seems to keep aggregating tests for each component. I would be more in favour of each repo owning its testing code, and potentially splitting the test code into scripts in the component folders. We don't have to do this now, but I'm mentioning it to get some feedback.

Contributor

yeah, i think that would be a great idea..

test_service.sh is (imho) more impediment than helper...

i'd be in favor of:

  • breaking it up into respective components
  • further breaking it up to not have N different pieces of functionality in one script (i.e. a "command" in test-service means we have N "scripts" in one uber script - and i'm not sure that is really helpful)

if there is truly generic and/or cross-cutting functionality - that can be its own script to run standalone or source into a given component script - but i think it would make more sense to split the test code for sure - and i myself would be cool doing it aggressively 😈
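
To make the idea concrete, here is a rough sketch of a thin runner over per-component scripts. The `<component>/tests/*.sh` layout and the function names `list_component_tests` and `run_component_tests` are assumptions for illustration, not the current structure of this repo.

```shell
#!/usr/bin/env bash
# Hypothetical layout: <root>/<component>/tests/<check>.sh, one script per check.

# Print every per-component test script under the given root, sorted so the
# run order is stable.
list_component_tests() {
  local root="$1"
  find "$root" -path '*/tests/*.sh' -type f | sort
}

# Run each discovered script on its own and report failure if any one fails.
run_component_tests() {
  local root="$1" script failed=0
  while IFS= read -r script; do
    [ -n "$script" ] || continue
    echo "==> ${script}"
    # Redirect stdin so a test script cannot consume the list being read.
    bash "$script" </dev/null || failed=1
  done <<EOF
$(list_component_tests "$root")
EOF
  return "$failed"
}

# Usage:
#   run_component_tests components
```

Each check stays a standalone script that can be run directly, and anything genuinely cross-cutting could still live in a shared file that the component scripts source.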

Member Author

(Fixed the commit of integration tests, should be ready for a pass)

@kimwnasptd kimwnasptd force-pushed the feat-dashboard-networkpolicy branch from 0e33fb3 to 90afa57 Compare October 23, 2025 06:25
Contributor

@andyatmiami andyatmiami left a comment

/lgtm

Appreciate this contribution to getting us closer to a release @kimwnasptd !

Verified the checks running on the PR are now sufficient to validate the changes

Also manually verified this additional check invocation demonstrates the necessary behavior of the NetworkPolicy

Original behavior:

$ kubectl run  netshoot-test --rm -i  --restart=Never --image nicolaka/netshoot                -- curl dashboard.kubeflow.svc --connect-timeout 5
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    19  100    19    0     0   2079      0 --:--:-- --:--:-- --:--:--  2375
RBAC: access deniedpod "netshoot-test" deleted

With fix:

$ kubectl run \
                    netshoot-test --rm -i \
                    --restart=Never \
                    --image nicolaka/netshoot \
                    -- curl dashboard.kubeflow.svc --connect-timeout 5
If you don't see a command prompt, try pressing enter.
  0     0    0     0    0     0      0      0 --:--:--  0:00:05 --:--:--     0
curl: (28) Connection timed out after 5003 milliseconds
pod "netshoot-test" deleted
pod default/netshoot-test terminated (Error)

ℹ️ I also confirmed similar verification of the dashboard-angular NetworkPolicy, but am omitting the output for the sake of brevity.

@kimwnasptd
Member Author

Thanks for the thorough review yet again @andyatmiami!

/approve

@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kimwnasptd


@google-oss-prow google-oss-prow bot merged commit d8e5000 into kubeflow:main Oct 23, 2025
14 checks passed
@kimwnasptd kimwnasptd deleted the feat-dashboard-networkpolicy branch October 23, 2025 13:04
Development

Successfully merging this pull request may close these issues.

Move NetworkPolicies of dashboard components to this repo
