Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

H4HIP: Wait with kstatus #374

Merged
merged 18 commits into from
Feb 19, 2025
Prev Previous commit
Next Next commit
updates to new architecture
Signed-off-by: Austin Abro <AustinAbro321@gmail.com>
AustinAbro321 committed Jan 6, 2025
commit c049e80c98ad6c83e9363ad337b061836bf4a704
25 changes: 23 additions & 2 deletions hips/hip-0999.md
Original file line number Diff line number Diff line change
@@ -33,9 +33,30 @@ Kstatus does not output any logs. Helm will output a log message each second sig

A [dynamic rest mapper](https://github.com/kubernetes-sigs/controller-runtime/blob/aea2e32a936584b06ae6f7992f856fe7292b0297/pkg/client/apiutil/restmapper.go#L36) will be used, this way if a chart installs a custom resource definition, kstatus will be able to pick up the custom resources during wait.

Kstatus will not be exposed by the Helm SDK, instead an adapter/interface will be created. If kstatus is ever deprecated, unmaintained, or replaced there should be no impact to the public API of the Helm SDK.
Any functions involving waits will be separated from the `kube.Interface` interface into the `kube.Waiter` interface. The client will embed the `Waiter` interface to allow calls to look like `client.Wait()` instead of `client.waiter.Wait()`. `kube.New()` will accept a wait strategy to decide the wait implementation. Options will be either `HelmWaiter` or `statusWaiter`. `HelmWaiter` is the legacy implementation. The `statusWaiter` will not be public so that is kstatus is ever deprecated or replaced a new implementation can be used without changing the public SDK.

The new client will look like below:
```go
type Client struct {
Factory Factory
Log func(string, ...interface{})
Namespace string
kubeClient *kubernetes.Clientset
Waiter
}
type WaitStrategy int
const (
StatusWaiter WaitStrategy = iota
LegacyWaiter
)
func New(getter genericclioptions.RESTClientGetter, ws WaitStrategy) (*Client, error)
```

The Helm CLI will always use the `statusWaiter` implementation, if this is found to be insufficient during Helm 4 testing a new cli flag `wait-strategy` will be introduced with options `status` and `legacy` to allow usage of the `HelmWaiter`. If the `statusWaiter` is found to be sufficient the `HelmWaiter` will be deleted from the public SDK before Helm 4 is released.

`WaitAndGetCompletedPodPhase` is an exported function that is not called anywhere within the Helm repository. It will be removed.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't speak for helm maintainers, but it looks like this method is part of their public API, and not deprecated, so I'd be careful with removing it right away.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe since this is targeted at Helm v4 breaking changes in the public API are acceptable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a PR to get rid of this immediately: helm/helm#13665


<!-- TODO design API / interface -->
The current Helm wait logic does not wait for paused deployments to be ready. This is because if `helm upgrade --wait` is run on a chart with paused deployments they will never be ready, see [#5789](https://github.com/helm/helm/pull/5789).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this seems more like motivation that specification

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not changing this in the PR, but I wanted to give the reference to the PR that explains why we wait for paused deployments. Let me make this line more clear


## Backwards compatibility
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any situation where kstatus will not return ready, but existing logic would?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the two called our here, where kstatus will wait to return ready until reconciliation is complete, and waiting for CRDs I am not thinking of any, but I am not 100% sure.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect, to be on the safe side, we will want/need to support a fallback flag to allow users to fallback to the "legacy" wait/ready code. Ideally we would deprecate that flag e.g. Helm 5 and remove the old code in a future version. Not sure what we would do if the legacy code serves some behavior that kstatus doesn't (ideally this doesn't happen of course), but I think we should allow users the fallback when upgrading to Helm 4 just in case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kstatus seems to be pretty extensible, we can implement a custom status reader and override the behavior for any resource type. I don't think there will have any long term issues