Skip to content

fix(netpol): don't add default block twice for dualstacks#5741

Merged
oilbeater merged 2 commits intokubeovn:masterfrom
SkalaNetworks:fix-netpol-block
Oct 14, 2025
Merged

fix(netpol): don't add default block twice for dualstacks#5741
oilbeater merged 2 commits intokubeovn:masterfrom
SkalaNetworks:fix-netpol-block

Conversation

@SkalaNetworks
Copy link
Copy Markdown
Member

@SkalaNetworks SkalaNetworks commented Sep 20, 2025

Pull Request

What type of this PR

Examples of user facing changes:

  • Bug fixes
  • Tests

When running dualstack, netpols add the "block" default ACL for both protocols (v6/v4), leading to error messages when functions check if there's duplicates.

Instead of adding the ACL each time, we now only do it only once for egress and ingress.

Which issue(s) this PR fixes

Fixes #5736

@SkalaNetworks SkalaNetworks force-pushed the fix-netpol-block branch 3 times, most recently from e7b1630 to d7a783a Compare September 20, 2025 08:34
@coveralls
Copy link
Copy Markdown

coveralls commented Sep 20, 2025

Pull Request Test Coverage Report for Build 18488291491

Details

  • 32 of 55 (58.18%) changed or added relevant lines in 2 files are covered.
  • 1 unchanged line in 1 file lost coverage.
  • Overall coverage decreased (-0.03%) to 21.032%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/ovs/ovn-nb-acl.go 32 37 86.49%
pkg/controller/network_policy.go 0 18 0.0%
Files with Coverage Reduction New Missed Lines %
pkg/ovs/ovn-nb-acl.go 1 79.25%
Totals Coverage Status
Change from base Build 18485416841: -0.03%
Covered Lines: 10689
Relevant Lines: 50822

💛 - Coveralls

@SkalaNetworks SkalaNetworks force-pushed the fix-netpol-block branch 3 times, most recently from 4dec04e to 36da7bf Compare September 20, 2025 10:14
@SkalaNetworks SkalaNetworks marked this pull request as ready for review September 20, 2025 12:15
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. bug Something isn't working network policy labels Sep 20, 2025
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 23, 2025
@oilbeater
Copy link
Copy Markdown
Collaborator

One IPv6 networkpolicy test case keeps failing, please take a look.

@SkalaNetworks
Copy link
Copy Markdown
Member Author

I won't be able to look into this for the next 2 weeks but I'll do it as soon as I get the chance.

It is strange that it only fails for IPv6 and on the underlay, I'll triple check everything.

@SkalaNetworks SkalaNetworks marked this pull request as draft September 23, 2025 05:20
@SkalaNetworks SkalaNetworks marked this pull request as ready for review October 12, 2025 22:12
@SkalaNetworks SkalaNetworks marked this pull request as draft October 12, 2025 22:14
@SkalaNetworks SkalaNetworks force-pushed the fix-netpol-block branch 2 times, most recently from 6da17ee to 36da7bf Compare October 12, 2025 23:48
@SkalaNetworks
Copy link
Copy Markdown
Member Author

SkalaNetworks commented Oct 13, 2025

@oilbeater I don't know what's happening there but I cannot get this to reproduce on my env. Could you try cloning the branch and executing the tests on your side please?

I did confirm that the tests on Github succeed if the policy doesn't exist, so that's definitely the cause. But when running the E2E tests on my side with IPv6 only, the tests are successful.

EDIT: I merged master into the PR and somehow the faulty test is gone, but now netpols don't seem to block anything. Very strange.

@oilbeater
Copy link
Copy Markdown
Collaborator

I will take a look.

@SkalaNetworks
Copy link
Copy Markdown
Member Author

SkalaNetworks commented Oct 13, 2025

Looks like the default drop doesn't get applied anymore if the policy of the NetworkPolicy is not both Egress/Ingress.

I may have found an issue with the check logic that doesn't generate ACL ops if something is already present. I'm trying a fix.

@SkalaNetworks
Copy link
Copy Markdown
Member Author

SkalaNetworks commented Oct 13, 2025

@oilbeater Looking at the test, I don't get how it should even pass (but somehow, locally, it does).

We're creating a network policy to allow only traffic to all IPv4s (0.0.0.0/0). But we're trying to fetch the API server with IPv6.

The description of the test mentions it should run as "hostNetwork"

should be able to access svc with backend host network pod after any other ingress network policy rules created

But the definition of the Pod doesn't have any setting.

EDIT: Looking at the behaviour of the ACLs, there's ACLs auto-added to allow communication between the nodes and the pods to conform with Kubernetes rules (and to allow healthchecks). If the API server runs on one of these nodes (which is the case with kind), the traffic will be let go.

@SkalaNetworks
Copy link
Copy Markdown
Member Author

Is there a way for me to re-launch only one of the steps of the GitHub pipeline? The issue is (for me) only present on the GH env and I'd like to try to reproduce it somehow.

@SkalaNetworks SkalaNetworks force-pushed the fix-netpol-block branch 2 times, most recently from ea5d444 to 8f978ad Compare October 13, 2025 09:22
@oilbeater
Copy link
Copy Markdown
Collaborator

Is there a way for me to re-launch only one of the steps of the GitHub pipeline? The issue is (for me) only present on the GH env and I'd like to try to reproduce it somehow.

Currently, there's no simple way to rerun just one step. You can try removing other E2E steps in /.github/workflows/build-x86-image.yaml first to isolate the issue.

@SkalaNetworks SkalaNetworks force-pushed the fix-netpol-block branch 2 times, most recently from 16acf38 to 8823126 Compare October 13, 2025 10:56
@SkalaNetworks
Copy link
Copy Markdown
Member Author

I'm out of ideas, the tests are flaking and have 0 consistent behaviour.

It's always the underlay that fails, sometimes IPv6, sometimes IPv4, sometimes DualStack. I can't get a single failure on my local env in any of these scenarios.

Could it be that those tests have always been flaky and because I'm modying this part of the code it shows up?

@SkalaNetworks SkalaNetworks force-pushed the fix-netpol-block branch 2 times, most recently from f9f6354 to 2b88733 Compare October 13, 2025 13:19
@SkalaNetworks
Copy link
Copy Markdown
Member Author

it works when hostnetwork is true like the title of the test says, but not without it...

@SkalaNetworks
Copy link
Copy Markdown
Member Author

Looking at the logs of the controller during the GH test

image

It looks like the E2E test is broken somehow when working in underlay mode. The provider network is not found?

@SkalaNetworks SkalaNetworks force-pushed the fix-netpol-block branch 2 times, most recently from ac3a2bd to 7159c72 Compare October 13, 2025 16:56
Signed-off-by: SkalaNetworks <contact@skala.network>
@SkalaNetworks SkalaNetworks force-pushed the fix-netpol-block branch 5 times, most recently from 371154e to edb1080 Compare October 14, 2025 05:29
Signed-off-by: SkalaNetworks <contact@skala.network>
@SkalaNetworks SkalaNetworks marked this pull request as ready for review October 14, 2025 08:09
@oilbeater oilbeater merged commit 7907b51 into kubeovn:master Oct 14, 2025
76 of 77 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working lgtm This PR has been approved by a maintainer network policy size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] Network Policies in DualStack subnets generate duplicate drop rules

3 participants