[evented pleg]: using real-time container events for pod state determination #129355
base: master
Conversation
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the appropriate triage label.
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
force-pushed from 76d5c41 to 4867c3d
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
force-pushed from 4867c3d to 9731965
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
force-pushed from 9731965 to 655bb8c
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
force-pushed from 655bb8c to d01c1e6
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
force-pushed from d01c1e6 to 2922d0d
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
that comment would be applicable everywhere except in
Thanks for the reminder!
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
@harche @haircommander Will we be able to move forward and merge this PR within the v1.33 cycle? IMO, if we can merge it earlier, it will allow users and developers to validate the reliability of this feature sooner, and we can also advance to the beta stage more quickly :)
@HirazawaUi I am trying to test these changes in CRI-O CI, cri-o/cri-o#9053, but it looks like I might be goofing up in setting it up correctly. Looking into it.
Thanks!
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
/hold
force-pushed from 16b9490 to 67e6c17
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
It seems that #130599 made some adjustments to the PLEG-related features and removed some imports. As a result, after merging the main branch into this branch, those imports were lost here as well, causing compilation failures.
force-pushed from 67e6c17 to 0bb05dd (… modify the determination logic)
The latest push removes the unrelated code-formatting changes (such as whitespace deletions) and turns the duration used to determine whether a container has started into a constant.
/test pull-kubernetes-node-crio-cgrpv1-evented-pleg-e2e
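For context, a minimal Go sketch of what that kind of change looks like; the constant name and value below are assumptions for the example, not the ones used in this PR:

```go
// Illustrative sketch only: lifting a hard-coded "considered started" duration
// into a named constant. The identifier and value are hypothetical.
package pleg

import "time"

// containerStartedDuration is the minimum time a container must have been
// running before it is treated as started.
const containerStartedDuration = 10 * time.Second

// consideredStarted reports whether a container that began running at
// startedAt should be treated as started at time now.
func consideredStarted(startedAt, now time.Time) bool {
	return now.Sub(startedAt) >= containerStartedDuration
}
```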
/retest
/hold cancel
What type of PR is this?
/kind bug
Which issue(s) this PR fixes:
Fixes #124704 #121003
Special notes for your reviewer:
The purpose of this PR is to resolve #124704.
I have described in detail the causes of this problem and the various solutions we've attempted in https://docs.google.com/document/d/1TPrY56q9MNW8r1FuzKDFkBBhOjQ0hqi7wJAbIP1O-4g/
This PR proposes that during its execution cycle, podWorkerLoop should directly use the latest events reported by the container runtime as the current pod state. If we can confirm that the events reported by the container runtime are reliable and up to date, we can stop relying on timestamps to determine whether the state is current. In both the Generic PLEG and the current Evented PLEG, the timestamps in the cache are updated (every 5 seconds and every 1 second, respectively) even when the container state does not change, which shows that timestamp-based checks are meaningless when the state remains unchanged.

The current container lifecycle code is not yet ready to accept the states reported by the container runtime directly. To address this, the Evented PLEG needs to filter out sandbox creation and deletion events, because the container runtime reports the same events for sandbox containers and regular containers, and the kubelet has never handled them separately. Consequently, an additional condition ensures that sandbox container creation events do not trigger lifecycle processing before the regular containers are created, and sandbox container deletion events do not trigger lifecycle processing after the regular containers are deleted. However, the cache is still updated to reflect the latest state.
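For illustration, here is a minimal, self-contained Go sketch of that filtering idea. The types and callback names (ContainerEvent, updateCache, notifyPodWorker) are simplified assumptions for the example, not the actual CRI or kubelet APIs:

```go
// Sketch (not the actual kubelet code): sandbox CREATED/DELETED events update
// the pod cache but do not wake the pod workers, while regular container
// events do both.
package main

import "fmt"

// ContainerEvent is a simplified stand-in for a runtime-reported event.
type ContainerEvent struct {
	PodUID      string
	ContainerID string
	IsSandbox   bool   // true if the event refers to the pod sandbox container
	Type        string // "CREATED", "STARTED", "STOPPED", "DELETED"
}

// handleEvent mirrors the proposed logic: always refresh the cache with the
// latest runtime-reported state, but only notify the pod worker loop for
// events that should drive lifecycle processing.
func handleEvent(ev ContainerEvent, updateCache func(ContainerEvent), notifyPodWorker func(string)) {
	// The cache is always updated so pod status reflects the newest event.
	updateCache(ev)

	// Sandbox creation happens before any regular container is created, and
	// sandbox deletion happens after the regular containers are deleted; in
	// both cases there is nothing for pod lifecycle processing to act on.
	if ev.IsSandbox && (ev.Type == "CREATED" || ev.Type == "DELETED") {
		return
	}
	notifyPodWorker(ev.PodUID)
}

func main() {
	cacheUpdates, wakeups := 0, 0
	update := func(ContainerEvent) { cacheUpdates++ }
	notify := func(string) { wakeups++ }

	events := []ContainerEvent{
		{PodUID: "p1", IsSandbox: true, Type: "CREATED"},  // cached, no wakeup
		{PodUID: "p1", IsSandbox: false, Type: "CREATED"}, // cached + wakeup
		{PodUID: "p1", IsSandbox: false, Type: "DELETED"}, // cached + wakeup
		{PodUID: "p1", IsSandbox: true, Type: "DELETED"},  // cached, no wakeup
	}
	for _, ev := range events {
		handleEvent(ev, update, notify)
	}
	fmt.Printf("cache updates: %d, pod worker wakeups: %d\n", cacheUpdates, wakeups) // 4, 2
}
```

The point is only the split in responsibilities: every event refreshes the cache, but sandbox creation and deletion events never wake the pod workers on their own.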
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: