Skip to content

OSDOCS-14472: CPMS custom machine name prefixes #93119

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

skopacz1
Copy link
Contributor

@skopacz1 skopacz1 commented May 7, 2025

OSDOCS-14472

Version(s): 4.19+

This PR adds a feature to add custom prefixes to machine names in control plane machine sets.

QE review:

  • QE has approved this change.

Preview: Control plane machine set configuration options

@openshift-ci-robot
Copy link

openshift-ci-robot commented May 7, 2025

@skopacz1: This pull request references OSDOCS-14472 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set.

In response to this:

OSDOCS-14472

Version(s): 4.19+

This PR adds a feature to add custom prefixes to machine names in control plane machine sets.

QE review:

  • QE has approved this change.

Preview:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 7, 2025
@openshift-ci openshift-ci bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 7, 2025
@ocpdocs-previewbot
Copy link

ocpdocs-previewbot commented May 7, 2025

@openshift-ci-robot
Copy link

openshift-ci-robot commented May 8, 2025

@skopacz1: This pull request references OSDOCS-14472 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set.

In response to this:

OSDOCS-14472

Version(s): 4.19+

This PR adds a feature to add custom prefixes to machine names in control plane machine sets.

QE review:

  • QE has approved this change.

Preview: Control plane machine set configuration options

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a really generic (and perhaps not very useful) intro to the new section below. Please feel free to suggest any other wording.


. Save your changes to the `ControlPlaneMachineSet` CR.

. If you are using an `OnDelete` update strategy for your control plane machine set: Manually apply the prefix to the name of any existing machines.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are verification steps to add, please let me know.

For example, what command to run to check machine names, and example output of that command

Comment on lines 34 to 37
** The name contains no more than 253 characters
** The name contains only lowercase alphanumeric characters, `-`, or `.`
** The name starts and ends with an alphanumeric character

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation needs some modification. Please have a look at https://github.com/openshift/api/blob/7318813e48da641a8b2e068018a3723a5cc34d5e/machine/v1/types_controlplanemachineset.go#L45-L61 details.

	// It must be a lowercase RFC 1123 subdomain, consisting of lowercase
	// alphanumeric characters, hyphens ('-'), and periods ('.').
	// Each block, separated by periods, must start and end with an alphanumeric character.
	// Hyphens are not allowed at the start or end of a block, and consecutive periods are not permitted.
	// The prefix must be between 1 and 245 characters in length.

Copy link
Contributor Author

@skopacz1 skopacz1 May 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rewrite this to basically say exactly what this text says is a requirement


. Save your changes to the `ControlPlaneMachineSet` CR.

. If you are using an `OnDelete` update strategy for your control plane machine set: Manually apply the prefix to the name of any existing machines.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manually apply the prefix to the name of any existing machines.

I don't think there is any different way to apply the prefix for OnDelete or RollingUpdate update strategy. In both the cases, we need to modify .spec.machineNamePrefix field.

  • For OnDelete : We need to delete an existing machine; then only a new replacement machine will be created with the prefixed name.
  • For RollingUpdate: We need to force a rollout, maybe by adding some labels to the machine template metadata.

/cc @huali9, could you please help with some user-facing commands and outputs which can be added as an example? TiA!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For both OnDelete and RollingUpdate, we can delete an existing machine, then a new replacement machine will be created with the prefixed name.
For RollingUpdate, we can also add some labels to the master machine template metadata to force a rollout. for example, under a master machine's .spec.providerSpec.value.metadata, add a label like

      metadata:
        labels:
          old: old

then a new replacement machine will be created with the prefixed name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add this extra step to add labels, specifically for people using RollingUpdate? Or should we just tell users to delete their old machines no matter which strategy they're using, and assume that people who use RollingUpdate should already be familiar with forcing rollouts?

I'm happy to document whichever option you guys prefer!

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to add this extra step to add labels, specifically for people using RollingUpdate? Or should we just tell users to delete their old machines no matter which strategy they're using, and assume that people who use RollingUpdate should already be familiar with forcing rollouts?

I prefer the latter. Because even for adding labels when people using RollingUpdate, it's the same work as deleting old machines, people have to do that one by one manually, and have to wait until a new master machine get Running and the old master machine disappeared, and then do the same operation on the next old master machine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case of RollingUpdate strategy, any trigger (for example, adding a label in the providerSpec of ControlPlaneMachineSet) should replace the old machines one by one without manually removing them.

However, in the case of OnDelete strategy, replacement machines will only be created in the event of the deletion of the existing machine.

Copy link
Contributor

@jeana-redhat jeana-redhat May 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huali9 and @chiragkyal I would vote that we use the same construct I have elsewhere. We typically don't specifically tell users to delete a machine if they don't have to.

My reasoning here is that I think after you save the config, you should be prepared to see one of the following behaviors:

  • RollingUpdate (default): CMPS goes off and updates the machines. By saying this, users aren't surprised to see it doing something if they didn't know/forgot how CPMS is configured on the cluster.
  • OnDelete: Nothing happens w/o your action. Again, no surprises for the user, and tells them how to make it change the machines.
    It might also be that the user wants to make some other changes, so they might do a few things and THEN delete machines to kick it off. So again I don't wanna say "you must do this right now" since that's not true as far as I understand.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jeana-redhat Your suggestion basically LGTM.
But it should be noted that even though it is RollingUpdate, only changing machineNamePrefix will not trigger an update. Other changes are required to trigger an update. Like @chiragkyal mentioned adding a label in the providerSpec of ControlPlaneMachineSet should replace the old machines one by one without manually removing them. But if users don't want to change other values in the ControlPlaneMachineSet except machineNamePrefix, then they need to delete the old master machines one by one or add labels to the old master machines one by one to trigger the replacement.
Refer https://issues.redhat.com/browse/OAPE-22?focusedId=26363185&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-26363185

-n openshift-machine-api
----

. Add the following snippet to the `ControlPlaneMachineSet` CR:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a snippet? I am not sure if this the right wording here. The intention would be to edit .spec.machineNamePrefix field of ControlPlaneMachineSet CR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the .spec.machineNamePrefix field already be specified in all ControlPlaneMachineSet CRs for 4.19+ clusters, and users are free to change the value from the default?

Or do users need to add the field to the CR the first time they are trying to customize the prefix?

For now I will change the wording to match what you are suggesting. But if they need to add this field the first time, maybe it should be worded like "add the .spec.machineNamePrefix field to the ControlPlaneMachineSet CR."

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will the .spec.machineNamePrefix field already be specified in all ControlPlaneMachineSet CRs for 4.19+ clusters, and users are free to change the value from the default?

No, there is no .spec.machineNamePrefix field in ControlPlaneMachineSet CRs by default.

Or do users need to add the field to the CR the first time they are trying to customize the prefix?

So right, users need to add the field to the CR the first time they are trying to customize the prefix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, .spec.machineNamePrefix field is added in the API by default for 4.19+ (oc explain command can help here). Since the type of the field is string; it's defaulted to empty string and not showing up in the yaml.

So I am fine using whichever wording makes sense: add or edit

= Adding a custom prefix to control plane machine names

You can customize the prefix of machine names created by a control plane machine set.
This can be done by editing a control plane machine set custom resource (CR)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This can be done by editing a control plane machine set custom resource (CR)
This can be done by editing the control plane machine set custom resource (CR)

[id="cpmso-config-prefix_{context}"]
= Adding a custom prefix to control plane machine names

You can customize the prefix of machine names created by a control plane machine set.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
You can customize the prefix of machine names created by a control plane machine set.
You can customize the prefix of machine names created by the control plane machine set.


// Saw the step below in another procedure, let me know if it's good to keep it in here.

. Edit your control plane machine set CR by running the following command:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which one is the right way to write this?

  • control plane machine set
  • control plane machine set CR
  • control plane machine set custom resource (CR)
  • ControlPlaneMachineSet CR

I am seeing all these are being used interchangeably.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am covering this feature for @jeana-redhat so I will let them weigh in (or tag anyone else who might weigh in). I am happy to go through this PR and change any variations in my added docs to any single term.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please cross-verify within your team for the right usage here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "control plane machine set" is defined by a "control plane machine set custom resource (CR)" which has the kind value of ControlPlaneMachineSet. So, in answer to the list above:

  • control plane machine set
    • this is valid, but here we are talking about editing the YAML CR so I would probably lean towards one of the others.
  • control plane machine set CR
    • this is valid and works here. In the repo, I see it 36 times.
  • control plane machine set custom resource (CR)
    • this is just how you are always supposed to write "CR" on first appearance. Once you define an abbreviation, you don't keep doing so. This is correct for line 10 above where it appears, afterwards it should be CR.
  • ControlPlaneMachineSet CR
    • this is valid and works here. In the repo, I see it 32 times

So, apparently I have been pretty evenly inconsistent on using "control plane machine set" and "ControlPlaneMachineSet" in front of custom resource/CR. For tangentially related docs reasons, I think we should prefer to write it as ControlPlaneMachineSet to avoid confusion with a "compute machine set" which is just MachineSet. I'll leave comments on this PR with my recommendations.

I will sweep the repo to change "control plane machine set CR" to "ControlPlaneMachineSet CR" at some point.


.Procedure

// Saw the step below in another procedure, let me know if it's good to keep it in here.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think the step below is valid to keep (which I am guessing is the case), then I'll go ahead and delete this comment.

@chiragkyal
Copy link
Member

/cc @huali9

@openshift-ci openshift-ci bot requested a review from huali9 May 9, 2025 08:13
Copy link
Contributor

@jeana-redhat jeana-redhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Seb for getting to this and Chirag for asking about the terms!


// Saw the step below in another procedure, let me know if it's good to keep it in here.

. Edit your control plane machine set CR by running the following command:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "control plane machine set" is defined by a "control plane machine set custom resource (CR)" which has the kind value of ControlPlaneMachineSet. So, in answer to the list above:

  • control plane machine set
    • this is valid, but here we are talking about editing the YAML CR so I would probably lean towards one of the others.
  • control plane machine set CR
    • this is valid and works here. In the repo, I see it 36 times.
  • control plane machine set custom resource (CR)
    • this is just how you are always supposed to write "CR" on first appearance. Once you define an abbreviation, you don't keep doing so. This is correct for line 10 above where it appears, afterwards it should be CR.
  • ControlPlaneMachineSet CR
    • this is valid and works here. In the repo, I see it 32 times

So, apparently I have been pretty evenly inconsistent on using "control plane machine set" and "ControlPlaneMachineSet" in front of custom resource/CR. For tangentially related docs reasons, I think we should prefer to write it as ControlPlaneMachineSet to avoid confusion with a "compute machine set" which is just MachineSet. I'll leave comments on this PR with my recommendations.

I will sweep the repo to change "control plane machine set CR" to "ControlPlaneMachineSet CR" at some point.

Comment on lines +26 to +27
spec:
machineNamePrefix: <machine_prefix>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing some Required fields here and ... to show there's more unwritten:

Suggested change
spec:
machineNamePrefix: <machine_prefix>
apiVersion: machine.openshift.io/v1
kind: ControlPlaneMachineSet
metadata:
name: cluster
namespace: openshift-machine-api
spec:
machineNamePrefix: <machine_prefix>
# ...

Comment on lines 39 to 43
. Save your changes to the `ControlPlaneMachineSet` CR.

. Delete your existing machines and wait for replacement machines with the new prefix name to be created.

// For now I am documenting this generic step above, but I can also add a step for RollingUpdate users to force rollouts.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've used this before, probably just copy it for consistency:

Suggested change
. Save your changes to the `ControlPlaneMachineSet` CR.
. Delete your existing machines and wait for replacement machines with the new prefix name to be created.
// For now I am documenting this generic step above, but I can also add a step for RollingUpdate users to force rollouts.
. Save your changes.
.Next steps
* For clusters that use the default `RollingUpdate` update strategy, the control plane machine set propagates changes to your control plane configuration automatically.
* For clusters that are configured to use the `OnDelete` update strategy, you must replace your control plane machines manually.

Copy link

openshift-ci bot commented May 19, 2025

@skopacz1: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch/enterprise-4.19 jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants