
Bug: Stale nftables rules are not cleaned up if a policy is deleted while the controller daemon is restarting #15

@mlguerrero12

Description


A race condition exists where nftables rules can be orphaned (left behind) on a node if the corresponding policy is deleted while the multi-networkpolicy-nftables daemonset pod on that node is down or restarting.

The replacement controller pod does not appear to have any mechanism for reconciling the existing nftables rules on the node against the current state in the API server, so it never learns that these stale rules should be deleted.

Expected Behavior / Possible Solutions

A more robust strategy is needed to ensure rule lifecycle is tied to the policy object, even across controller restarts. Two complementary solutions are:

  1. Startup Reconciliation (List-and-Sync). The controller must not rely solely on watch events. On startup, each multi-networkpolicy-nftables pod should perform a full reconciliation:

List all MultiNetworkPolicy objects from the API server.

Scan all nftables rules it manages on the node (e.g., in its own chain).

Actively delete any nftables rules that do not correspond to a MultiNetworkPolicy that currently exists in the API server.
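The list-and-sync steps above amount to a set difference between the policies that exist in the API server and the rules installed on the node. A minimal sketch in Go, assuming (hypothetically) that the controller tags each rule it owns with the owning policy's `namespace/name`; the real daemon may key its rules differently, and the listing itself would use client-go and `nft` rather than the plain slices shown here:

```go
package main

import "fmt"

// staleRules returns the installed rules whose owning policy no longer
// exists. desiredPolicies would come from listing MultiNetworkPolicy
// objects in the API server; installedRules from scanning the
// controller's own nftables chain on the node.
func staleRules(desiredPolicies, installedRules []string) []string {
	desired := make(map[string]bool, len(desiredPolicies))
	for _, p := range desiredPolicies {
		desired[p] = true
	}
	var stale []string
	for _, r := range installedRules {
		if !desired[r] {
			// No matching policy in the API server: this rule was
			// orphaned (e.g. the policy was deleted while the daemon
			// was down) and must be deleted.
			stale = append(stale, r)
		}
	}
	return stale
}

func main() {
	// Policy "default/allow-db" was deleted during the restart window,
	// but its rule is still installed on the node.
	desired := []string{"default/allow-web"}
	installed := []string{"default/allow-web", "default/allow-db"}
	fmt.Println(staleRules(desired, installed)) // [default/allow-db]
}
```

Running this sync once at startup, before entering the watch loop, closes the window in which delete events can be missed.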

  2. Implement Finalizers (Recommended). This is the most robust solution because it prevents the race condition from occurring in the first place.

Action: The controller should be modified to add a finalizer (e.g., policy.k8s.cni.cncf.io/rules-applied-on-node-A) to any MultiNetworkPolicy object it successfully processes.

How it Solves the Problem:

When a user tries to delete the policy, the API server sees the finalizer and only adds a deletionTimestamp instead of deleting the object immediately. The object enters a "terminating" state.

All running controller pods receive an UPDATE event for this "terminating" object. They can then safely clean up their local nftables rules and remove their finalizer from the object.

Crucially: If the pod on node-A is down, it cannot remove its finalizer. The MultiNetworkPolicy object will remain "stuck" in the "terminating" state.

When the controller pod on node-A restarts, it will see the "terminating" object, perform its cleanup, and remove its finalizer. Only when all finalizers are removed will the API server finally delete the object. This guarantees that cleanup is performed on every node before the policy is removed.
