[ci] Update pulumi k8s operator to v2 #1017
Conversation
/hdm_test
2 similar comments
/hdm_test
/hdm_test
Force-pushed from 62bc31b to b8c57bf
moritzkiefer-da left a comment:
Thanks! Do you have a link to a successful HDM test?
I am slightly nervous about just replacing this. Ideally, we would get a few weeks of experience running this in internal clusters instead of forcibly rolling it out to dev/test/mainnet with the next upgrade.
How hard would it be to keep this as an option that we only enable on internal clusters for now?
# Upgrade workarounds, include a GH issue to remove them once the base version changes
# TODO(#14679): Remove
why do we have this stuff at all?
Indeed, it does not make sense; let me clean it up.
namespace: namespaceName,
},
spec: {
  image: `pulumi/pulumi:${semver.gt(pulumiVersion, minimumPulumiVersionRequired) ? pulumiVersion : minimumPulumiVersionRequired}-nonroot`,
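For context, the template expression resolves to whichever of the two versions is newer. A minimal sketch of the same selection logic, assuming the `semver` npm package and illustrative version values:

```typescript
import * as semver from 'semver';

// Illustrative values; the real ones come from the repo's configuration.
const pulumiVersion = '3.130.0';
const minimumPulumiVersionRequired = '3.134.1';

// Use the configured Pulumi version unless it is older than the minimum the
// operator requires, then append the -nonroot image variant suffix.
const resolvedVersion = semver.gt(pulumiVersion, minimumPulumiVersionRequired)
  ? pulumiVersion
  : minimumPulumiVersionRequired;

const image = `pulumi/pulumi:${resolvedVersion}-nonroot`;
// => 'pulumi/pulumi:3.134.1-nonroot'
```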
does this still go through our cache?
My understanding is yes: we basically cache everything, since it always goes through that proxy. @isegall-da, is that correct?
I don't think so; I think only the DinD is configured to use the cache, so that images we fetch with "docker pull" in tests come from there.
But I'm not sure we care. How often do we expect this to be re-pulled per cluster? (the rate limit is per IP AFAIK)
We needed the cache for CI jobs that were pulling multiple images per run and hence hit rate limits on docker.io.
> But I'm not sure we care. How often do we expect this to be re-pulled per cluster? (the rate limit is per IP AFAIK)

I guess ciperiodic might pull the most, with one pod per stack? Hopefully that's still infrequent enough.
The rate limit seems to be 100 pulls per IP address per 6 hours. I think we should be OK. Famous last words...
And I think that k8s has some internal cache too, doesn't it? (It didn't help with docker pull in DinD, but it will for a pod's image.)
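On the kubelet point: yes, each node keeps its own image cache, so a workspace pod scheduled onto a node that already holds the image will not re-pull it, provided the pull policy allows reuse. A sketch of the relevant container fragment (the name and tag are illustrative):

```typescript
// With a pinned, immutable tag, IfNotPresent lets kubelet reuse the image
// already present on the node instead of re-pulling from docker.io.
// (IfNotPresent is also the Kubernetes default for non-:latest tags.)
const workspaceContainer = {
  name: 'pulumi-workspace',               // illustrative name
  image: 'pulumi/pulumi:3.134.1-nonroot', // illustrative pinned tag
  imagePullPolicy: 'IfNotPresent',
};
```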
I have one run where the steps touched by this change succeeded, but the full workflow failed because it ran during the switch to splice.
It's actually fairly simple, as we can keep the operator deployment on an old reference if needed. That said, I wouldn't postpone it more than required, tbh. Once it's merged to main, updating CILR should give us the feedback we need within a few days of testing.
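A gate like that could be as small as an env-var switch. A rough sketch, where the flag name and the two deploy helpers are hypothetical stand-ins for the repo's real deployment code:

```typescript
// Hypothetical flag that only internal clusters would set while v2 soaks.
const useOperatorV2 = process.env.ENABLE_PULUMI_OPERATOR_V2 === 'true';

// Stand-ins for the actual code paths.
function deployOperatorV1(namespace: string): void {
  // existing v1 Deployment, pinned to the old image reference
}
function deployOperatorV2(namespace: string): void {
  // new v2 operator install
}

(useOperatorV2 ? deployOperatorV2 : deployOperatorV1)('operator');
```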
Fair, I guess we could also revert reasonably easily if needed.
moritzkiefer-da left a comment:
thanks!
/hdm_test
1 similar comment
/hdm_test
- cpu: 5,
- memory: config.optionalEnv('OPERATOR_MEMORY_LIMIT') || '20G',
+ cpu: 2,
+ memory: config.optionalEnv('OPERATOR_MEMORY_LIMIT') || '4G',
This is now per stack, which is why we need 80% less, right?
Yes. I will tune it further after we deploy to prod and see actual usage, but the main container should be fairly light.
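To make the per-stack point concrete: in v2 each stack runs in its own workspace pod, so these limits bound one stack's workspace rather than a single shared operator process. A sketch of the resulting limits block, reusing the override pattern from the diff above (the example value is illustrative):

```typescript
// Per-workspace limits: each stack's pod gets its own allocation, hence the
// much lower defaults than the old shared operator (cpu 5 / 20G).
// Set OPERATOR_MEMORY_LIMIT (e.g. '8G') per cluster to override the default.
const limits = {
  cpu: 2,
  memory: process.env.OPERATOR_MEMORY_LIMIT || '4G',
};
```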
isegall-da left a comment:
LGTM, thank you!
Pull Request Checklist

Cluster Testing
- Comment /cluster_test on this PR to request it, and ping someone with access to the DA-internal system to approve it.
- Comment /hdm_test on this PR to request it, and ping someone with access to the DA-internal system to approve it.

PR Guidelines
- Reference fixed issues with "Fixes #n", and mention issues worked on using #n.

Merge Guidelines