Prow control plane migration (k8s-prow > k8s-infra-prow) #33350
Closed
Description
Using this to track what's done/to-do, and communicate higher-traffic updates.
Broader updates should go to https://groups.google.com/a/kubernetes.io/g/dev/c/qzNYpcN5la4.
Based on the proposal doc at https://docs.google.com/document/d/1erBhuCwY26d0UfPbzt8lEj6bYT2hOUKzc2j36YHVqfM.
Pre-migration:
- List and drop prowjobs that will not be migrated (k/t-i#33272)
- Add banner warning people about migration date (Wed, August 21) (Prow + TestGrid, done)
- Add tracking issue, and communicate migration progress as it happens
- Track down additional infra in SIG K8s Infra that may be using workload identity (I believe Ben did this during migration)
- Spin down Boskos use (k/t-i#33129)
- Ban use of default/unspecified cluster (k/t-i#33272)
- Prepare a quick "scale all the old controllers to 0 and scale the new ones up" PR
- PR should update the deployment replicas; run the make target to manually deploy
- cd test-infra/config/prow
- make deploy-prow?
- (Rollback is just a revert to a previous Git commit + make target)
- Test new Prow with fake configmap (e.g. has a single job w/ "hello world")
- Ensure that some key people have access to both the Google and community projects
- Ben, Cole, and Michelle should have, or already have, access to both (k8s-prow and k8s-infra-prow)
- Prepare for switching Deck over
- (not doing) Set up a new domain (e.g. k8s-infra-prow.k8s.io) pointing at the new Deck deployment
- Create Prow certificates (k8s.io#7194)
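
The "scale all the old controllers to 0 and scale the new ones up" step above could be rehearsed with a small dry-run helper. This is only a sketch under assumptions: the context names, the `default` namespace, the component list, and the target of 1 replica are illustrative placeholders, not values from the migration PR. The plan above changes replica counts in the deployment manifests and runs the make target; this sketch instead prints equivalent `kubectl scale` commands so the ordering can be reviewed. It executes nothing.

```python
"""Dry-run planner for a control-plane cutover: print kubectl commands, run nothing."""
import shlex

# Assumption: illustrative subset of Prow components; check the real manifests.
COMPONENTS = ["deck", "hook", "tide", "sinker", "horologium", "crier"]


def scale_commands(context, namespace, components, replicas):
    """Return one `kubectl scale` argv list per deployment."""
    return [
        [
            "kubectl", "--context", context, "--namespace", namespace,
            "scale", "deployment", name, f"--replicas={replicas}",
        ]
        for name in components
    ]


def cutover_plan(old_context, new_context, namespace="default", components=COMPONENTS):
    """Scale the old control plane to 0 first, then bring the new one up to 1."""
    plan = scale_commands(old_context, namespace, components, 0)
    plan += scale_commands(new_context, namespace, components, 1)
    return plan


if __name__ == "__main__":
    # Hypothetical context names; substitute the ones from your kubeconfig.
    for argv in cutover_plan("old-prow", "new-prow"):
        print(shlex.join(argv))
```

Emitting the old-cluster commands first mirrors the order in the checklist, so two sets of controllers are never meant to act on the same jobs at once.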
Right before migration:
- Drop remaining unmigrated Prow jobs (Drop remaining trusted jobs #33352 / removing unmigrated CI jobs ahead of prow control-plane migration #33226)
- (ignore, no longer needed) Block changes to Prow and Prowjob config in test-infra (except people working on migration)
- Sync logs from buckets to new buckets
- gs://kubernetes-jenkins (there's a transfer job running for this bucket to kubernetes-ci-logs)
- Scale all the new controllers to 0.
- Sync current configmap with new Prow
During migration:
Begins ~10:30am PT, Wednesday August 21
- Scale down the old Prow
- Copy prowjobs from old to new Prow
- Trigger a final run of https://console.cloud.google.com/transfer/jobs/transferJobs%2Fkubernetes-jenkins-transfer/runs?project=k8s-infra-prow and delete it.
- Scale up the new Prow
- Switch webhooks
- Verify new Deck is working (watch jobs start and finish successfully)
- Update DNS entries (Use the new prow endpoint k8s.io#7206)
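
The issue does not say how the "Copy prowjobs from old to new Prow" step was carried out. One plausible approach, sketched here as an assumption, is to export the objects as JSON from the old cluster, strip the metadata that the API server assigns, and create the result in the new cluster. The function names and the sample object are hypothetical. The sketch leaves `spec` and `status` untouched; whether `status` survives re-creation depends on how the CRD is configured.

```python
"""Strip cluster-assigned metadata from exported ProwJob objects before re-creating them."""
import copy
import json

# Fields the API server sets itself; they are specific to the source cluster.
SERVER_SET_METADATA = (
    "resourceVersion", "uid", "creationTimestamp",
    "generation", "managedFields", "selfLink",
)


def portable_prowjob(obj):
    """Return a copy of one exported object without server-assigned metadata."""
    cleaned = copy.deepcopy(obj)
    metadata = cleaned.get("metadata", {})
    for field in SERVER_SET_METADATA:
        metadata.pop(field, None)
    return cleaned


def portable_list(exported):
    """Turn an exported List (as from `kubectl get ... -o json`) into a clean List."""
    items = [portable_prowjob(item) for item in exported.get("items", [])]
    return {"apiVersion": "v1", "kind": "List", "items": items}


if __name__ == "__main__":
    # Made-up sample standing in for real exported ProwJobs.
    sample = {
        "items": [
            {
                "apiVersion": "prow.k8s.io/v1",
                "kind": "ProwJob",
                "metadata": {"name": "example", "uid": "1234", "resourceVersion": "42"},
                "spec": {"job": "hello-world"},
            }
        ]
    }
    print(json.dumps(portable_list(sample), indent=2))
```

The input is deep-copied, so the original export stays available for comparison if something goes wrong during the copy.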
Post-migration:
- Debug and fix the external secrets instance. It seems to be getting stuck and is not syncing secrets to the cluster.
- deploy prow to k8s-infra-prow cluster k8s.io#7141
- Enable autosync for prow control plane k8s.io#7211
- delete pushgateway and update prow announcement #33359
- add the new community prow bucket to gcsweb k8s.io#7207
- Turn down old monitoring stack
- Transfer Kettle and TestGrid to use new logs buckets
- Not part of the control plane migration; see Migrate tooling off of gs://kubernetes-jenkins #33381 instead
- Start autobumping Community Prow k8s.io#7231
- Turn down old Deck and old CRs
- Delete remaining jobs on old Prow
- Ensure no repositories are registered with k8s-prow
- Delete Prow components in k8s-prow
- Handle logs buckets
- gs://kubernetes-jenkins: See Migrate tooling off of gs://kubernetes-jenkins #33381
- Remove gs://kubernetes-jenkins-pull
- Set a lifecycle policy for the new prow bucket gs://kubernetes-ci-logs
- Remove deprecated code/references (for handling k8s-prow control plane)
- Delete the public IP address used to serve prow
- Reconcile and merge "Reconcile all the infra changes for deploying Prow" (k8s.io#7205)
Other resources may need to remain in the k8s-prow project (esp. images)
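
For the lifecycle policy item above, a minimal sketch of what an age-based deletion rule could look like in the Cloud Storage lifecycle JSON format. The 90-day retention is a placeholder assumption, not a value decided in this issue, and the helper name is hypothetical.

```python
"""Build a GCS lifecycle configuration that deletes objects older than N days."""
import json


def delete_after_days(days):
    """Return a lifecycle config dict with a single age-based Delete rule."""
    if days <= 0:
        raise ValueError("days must be positive")
    return {
        "rule": [
            {"action": {"type": "Delete"}, "condition": {"age": days}}
        ]
    }


if __name__ == "__main__":
    # Assumption: 90 days is a placeholder, not the retention chosen for the bucket.
    print(json.dumps(delete_after_days(90), indent=2))
```

The printed JSON could be saved to a file and applied with, for example, `gcloud storage buckets update gs://kubernetes-ci-logs --lifecycle-file=lifecycle.json`.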