KEP-3866 nftables kube-proxy to GA #5044

Open: wants to merge 1 commit into master
2 changes: 2 additions & 0 deletions keps/prod-readiness/sig-network/3866.yaml
@@ -6,3 +6,5 @@ alpha:
approver: "@wojtek-t"
beta:
approver: "@wojtek-t"
stable:
approver: "@wojtek-t"
48 changes: 4 additions & 44 deletions keps/sig-network/3866-nftables-proxy/README.md
@@ -1206,7 +1206,8 @@ create their own table, and not interfere with anyone else's tables.
If we document the `priority` values we use to connect to each
nftables hook, then admins and third party developers should be able
to reliably process packets before or after kube-proxy, without
needing to modify kube-proxy's chains/rules.
needing to modify kube-proxy's chains/rules. (As of 1.33, this is now
documented.)
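
To make that concrete, a third party that wants its rules to run before kube-proxy's service DNAT can hook the same netfilter stage at an earlier priority from its own table. The sketch below is illustrative only: the `example-fw` table name, the `-10` offset, and the service CIDR are assumptions, not values taken from the kube-proxy documentation.

```
table ip example-fw {
    chain pre-kube-proxy {
        # "dstnat" is nftables' symbolic name for priority -100, the stage at
        # which DNAT is conventionally done; subtracting 10 makes this chain
        # run earlier on the prerouting hook than chains registered at dstnat.
        type filter hook prerouting priority dstnat - 10; policy accept;

        # Illustrative rule: count traffic headed for an assumed service CIDR
        # before any DNAT has rewritten the destination address.
        ip daddr 10.96.0.0/12 counter
    }
}
```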
In cases where administrators want to insert rules into the middle of
particular service or endpoint chains, we would have the same problem
@@ -1224,16 +1225,6 @@ probably makes sense to leave these out initially and see if people
actually do need them, or if creating rules in another table is
sufficient.

```
<<[UNRESOLVED external rule integration API ]>>

Tigera is currently working on implementing nftables support in
Calico, so hopefully by 1.32 we should have a good idea of what
guarantees it needs from nftables kube-proxy.

<<[/UNRESOLVED]>>
```

#### Rule monitoring

Given the constraints of the iptables API, it would be extremely
@@ -1425,18 +1416,6 @@ We will eventually need e2e tests for switching between `iptables` and
[It should recreate its iptables rules if they are deleted]: https://github.com/kubernetes/kubernetes/blob/v1.27.0/test/e2e/network/networking.go#L550
[`TestUnderTemporaryNetworkFailure`]: https://github.com/kubernetes/kubernetes/blob/v1.27.0-alpha.2/test/e2e/framework/network/utils.go#L1078

<!--
This question should be filled when targeting a release.
For Alpha, describe what tests will be added to ensure proper quality of the enhancement.

For Beta and GA, add links to added tests together with links to k8s-triage for those tests:
https://storage.googleapis.com/k8s-triage/index.html

We expect no non-infra related flakes in the last month as a GA graduation criteria.
-->

- <test>: <link to test coverage>

#### Scalability & Performance tests

We have an [nftables scalability job]. Initial performance is fine; we
@@ -1635,14 +1614,7 @@ provide more information.

###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

TBD; we plan to add an e2e job to test switching from `iptables` mode
to `nftables` mode in 1.31.

<!--
Describe manual testing that was done and the outcomes.
Longer term, we may want to require automated upgrade/rollback tests, but we
are missing a bunch of machinery and tooling and can't do that now.
-->
Tested by hand.
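
As a point of reference, the manual test amounts to flipping the `mode` field in each node's kube-proxy configuration between `iptables` and `nftables` and restarting kube-proxy. A minimal sketch of the relevant fragment of the KubeProxyConfiguration (all other fields are omitted here and would be kept from the existing config):

```
# Only the mode field changes when switching backends; everything else in the
# existing KubeProxyConfiguration is left as-is.
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "nftables"   # set back to "iptables" to test rollback
```

After each switch, kube-proxy's `--cleanup` flag can be used to explicitly remove rules left behind by the previous backend before verifying Service connectivity.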

###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?

@@ -1774,26 +1746,14 @@ processed until the apiserver is available again.

###### What are other known failure modes?

<!--
For each of them, fill in the following information by copying the below template:
- [Failure mode brief description]
- Detection: How can it be detected via metrics? Stated another way:
how can an operator troubleshoot without logging into a master or worker node?
- Mitigations: What can be done to stop the bleeding, especially for already
running user workloads?
- Diagnostics: What are the useful log messages and their required logging
levels that could help debug the issue?
Not required until feature graduated to beta.
- Testing: Are there any tests for failure mode? If not, describe why.
-->

###### What steps should be taken if SLOs are not being met to determine the problem?

## Implementation History

- Initial proposal: 2023-02-01
- Merged: 2023-10-06
- Updates for beta: 2024-05-24
- Updates for GA: 2025-01-15

## Drawbacks

6 changes: 3 additions & 3 deletions keps/sig-network/3866-nftables-proxy/kep.yaml
@@ -3,7 +3,7 @@ kep-number: 3866
authors:
- "@danwinship"
owning-sig: sig-network
status: implementable
status: implemented
creation-date: 2023-02-01
reviewers:
- "@thockin"
@@ -13,12 +13,12 @@ approvers:
- "@thockin"

# The target maturity stage in the current dev cycle for this KEP.
stage: beta
stage: stable

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v1.31"
latest-milestone: "v1.33"

# The milestone at which this feature was, or is targeted to be, at each stage.
milestone: