feat: rework of the current POC#18
Conversation
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
|
I like the direction very much! Some additional points:
Everything else I think is a problem for our future selves, and I think this is a sweet spot to start building on top. Thanks! ❤️ |
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
kyledong-suse
left a comment
There was a problem hiding this comment.
Thank you so much for this rework! I think the direction is right, and it's really a very good start.
I tried it locally with the http-client → http-service workload and the generated proposals. I left couple of comments. For the findEgressPeer / findIngressPeer dedupe, I think we can improve it later. I think we can dedupe by peer workload and merge all ports into one rule per peer workload.
We have some overlap, so I checked out your PR, and added a couple things from my branch on top of yours locally. I'll keep playing with it.
- promotion for
NetworkPolicyProposal→networkingv1.NetworkPolicy - in enforcement_controller reconciler, create/update/delete
networkingv1.NetworkPolicy
| // lastObserved is when flows for this workload were last seen. | ||
| // +optional | ||
| LastObserved *metav1.Time `json:"lastObserved,omitempty"` |
There was a problem hiding this comment.
I'd suggest we keep this LastObserved. It's mainly for debug purpose if there is an unexpected traffic block. We can check the LastObserved timestamp to verify this proposal is learned before or after the block.
There was a problem hiding this comment.
yep that makes sense, thanks!
There was a problem hiding this comment.
on a second though, i believe this could be dangerous. if we update our resource every time we see a flow associated to it there is the risk that we will update the resource every time we scan it. I believe this is the same reason why we didn't add this field on the WorkloadPolicyProposal in the runtime-enforcer. Maybe we can come back to this when we feel it is really needed for debugging reasons
There was a problem hiding this comment.
Ok, I see you point. Yeah, I agree we remove it for now.
I can try to tackle that now
Yep i believe we need to address it but i wouldn't do it now. I would probably add a test with external traffic and implement it.
i will take a look at what we need to fix |
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
a4e497e to
3b2cf16
Compare
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
we don't rely anymore on `reflect.DeepEqual`. For now we do a manual comparsion of the inner fields, maybe in the future we can improve the solution using an hash key or similar approaches Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
Signed-off-by: Andrea Terzolo <andrea.terzolo@suse.com>
| for _, wk := range workloads { | ||
| if !topology.SupportedWorkloadTypes[wk.OwnerKind] { | ||
| continue | ||
| connections := ts.store.DrainFlows() |
There was a problem hiding this comment.
DrainFlows() clears the in-memory egress/ingress before reconcile. If reconcile fails, those workload connections will be lost. Shouldn't we consider to move the "Drain" after reconcile completed successfully?
| if err == nil { | ||
| // Policy already exists, do nothing. | ||
| return ctrl.Result{}, nil |
There was a problem hiding this comment.
Do we want to add a TODO here? So if the proposal spec changes we will update the policy.
This PR is a possible first rework to support K8s network policies syntax.
I would say we can still consider this repo as a POC since we still need to iterate a lot before reaching something production ready
The first commit allows to disable the CNI watcher for quick debugging.
At the moment the design is partial it takes only in consideration pod -> pod connections on the same node.
In the case of pod -> pod on the same node we receive 4 flow observation for each packet exchange
The initial idea is to keep only one of the egress observation (the one between Deployment -> Deployment) of the 4 flows we receive. This won't be enough to handle all cases but could be a good starting point.
To give a concrete example let's consider this workload:
These are the flows we obtain from OBI after a
kubectl exec -n default deployment/http-client -- curl http://http-service:80These are the flow we would keep after the filters
As you can see, we just keep the deployment -> deployment communication without the service in the middle. This is unfortunately only an approximation but for now it could be a good starting point.
In my local setup i obtain these policies
You can try the current implementation with