-
Notifications
You must be signed in to change notification settings - Fork 128
Daemon redesign - using controller-runtime #788
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks for your PR,
To skip the vendors CIs, Maintainers can use one of:
|
9f5207e to
7d4e946
Compare
e17b584 to
ddc70f2
Compare
929886d to
9377404
Compare
79cc69b to
dc6eb11
Compare
dc6eb11 to
50ad79a
Compare
50ad79a to
f369b65
Compare
zeeke
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few minor comments. This PR hugely improves the config-daemon shape.
| c := current.Status.DeepCopy().Interfaces | ||
| d := desiredNodeState.Status.DeepCopy().Interfaces | ||
| for idx := range d { | ||
| // check if it's a new device |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // check if it's a new device | |
| if idx >= len(c) { | |
| return true, nil | |
| } | |
| // check if it's a new device |
I don't know if it can ever happen (maybe if a new device is hot-plugged between reconcile loops). But it's safer to check for slice length
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but in theory (not sure if it's possible) you can have a new device like hot-unplug and after that hot-plug so the len will be the same but the device is not no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in both case, the if idx >= len(c) { can save a nil dereferencing. right?
adrianchiris
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
partial review
| predicate.Funcs | ||
| } | ||
|
|
||
| func (DrainStateAnnotationPredicate) Create(e event.CreateEvent) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WDYT about removing the Create fn ?
the default of predicate.Funcs is true.
is object being nil a real concern here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the concern is not nil. without this when a new node is added to the cluster the reconcile will not get called without this.
and if the reconcile is not created no one adds the labels to make the drain possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without this when a new node is added to the cluster the reconcile will not get called without this
are you sure ? i thought by default it will call reconcile since predicate.Funcs contains implementation that will return true. which means to enqueue event for reconcile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I give it a try and I needed to add it. without it I was not able to get the create calls :(
I can give it another try
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so this ended up as needed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep we need this one at least from my tests
| log.Log.Error(err, "nodeStateSyncHandler(): Failed to fetch node state", "name", vars.NodeName) | ||
| return err | ||
| reqLogger.Error(err, "failed to check systemd status unexpected error") | ||
| return ctrl.Result{}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we dont want to fail here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we return an error it will get stuck in a loop as if the result for the systemd operator return an error we don't retry to not put the node into a bootloop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack.
should we update status with "unexpected error" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right nice catch!
|
once we get this in, i think we should move the node controllers under |
bindata/manifests/sriov-config-service/kubernetes/sriov-config-service.yaml
Show resolved
Hide resolved
80564fa to
00587fb
Compare
|
Hi @zeeke @adrianchiris @ykulazhenkov can you please make another round of review here? |
adrianchiris
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall looks great !
added some small comments. once we resolve them we can merge !
174365e to
8768925
Compare
e0ne
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
|
LGTM. it could be merged once CI pass |
pkg/daemon/daemon.go
Outdated
|
|
||
| loadedPlugins map[string]plugin.VendorPlugin | ||
| lastAppliedGeneration int64 | ||
| disableDrain bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it looks like we never set this value.
| return ctrl.Result{}, err | ||
| } | ||
|
|
||
| // update log level |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to make sure that all these global variables are initialized before the NodeReconciler starts.
Currently, FeatureGates are initialized, but not the DisableDrain flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right! done
adrianchiris
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM,
can be merged once CI pass and comments from @ykulazhenkov addressed.
2f979de to
d9f11df
Compare
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
It's only a one time run script so it should not be a problem to run with debug logs. Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
Signed-off-by: Sebastian Sch <[email protected]>
a697c43 to
3e6b91c
Compare
|
Thanks for all the help folks! merging |
No description provided.