
Improve uniqueness of ClusterControlPlane name#1386

Open
Atish-iaf wants to merge 1 commit into vmware-tanzu:main from Atish-iaf:unique-ccp-name
Conversation

@Atish-iaf
Contributor

nsx-operator generates cluster-control-plane names with the following pattern:
fmt.Sprintf("%s-%s-%s", s.NSXConfig.CoeConfig.Cluster, namespace, name)
Because "-" is also legal inside namespace and cluster names, two different namespace/cluster pairs can produce the same name:
namespace: xx, cluster: yy-zz, result: xx-yy-zz
namespace: xx-yy, cluster: zz, result: xx-yy-zz

A solution is to use an underscore instead of "-" as the delimiter, because K8s doesn't allow namespace names to include underscores (ref).
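The collision and the effect of the delimiter change can be reproduced with a minimal sketch. The helper names legacyName and underscoreName and the supervisor ID "sv" are hypothetical and exist only for illustration; they are not functions from nsx-operator.

```go
package main

import "fmt"

// legacyName mirrors the "-"-joined pattern quoted in the description.
// The function name is hypothetical.
func legacyName(cluster, namespace, name string) string {
	return fmt.Sprintf("%s-%s-%s", cluster, namespace, name)
}

// underscoreName is a hypothetical variant that joins the same three
// parts with "_", which cannot appear in a namespace name.
func underscoreName(cluster, namespace, name string) string {
	return fmt.Sprintf("%s_%s_%s", cluster, namespace, name)
}

func main() {
	// Same supervisor ID, two different namespace/name pairs.
	a := legacyName("sv", "xx", "yy-zz")
	b := legacyName("sv", "xx-yy", "zz")
	fmt.Println(a, b, a == b) // both are "sv-xx-yy-zz"

	c := underscoreName("sv", "xx", "yy-zz")
	d := underscoreName("sv", "xx-yy", "zz")
	fmt.Println(c, d, c == d) // "sv_xx_yy-zz" vs "sv_xx-yy_zz"
}
```

With "_" as the delimiter, the boundary between namespace and name stays visible in the result, so the two pairs no longer map to the same string.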

Existing NSXServiceAccount cluster-control-plane node IDs and names do not need to change. The new name pattern applies only to NSXServiceAccount CRs created after it is enabled in nsx-operator.

Test summary
status.clusterName in an existing NSXServiceAccount CR is not changed:

apiVersion: nsx.vmware.com/v1alpha1
kind: NSXServiceAccount
metadata:
  creationTimestamp: "2026-03-09T09:35:27Z"
  finalizers:
  - nsxserviceaccount.nsx.vmware.com/finalizer
  generation: 1
  name: cluster-default-antrea
  namespace: antrea-test
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta2
    kind: Cluster
    name: cluster-default
    uid: 2d90023f-db42-4d50-80cb-37016a4400bc
  resourceVersion: "11572735"
  uid: 85334875-8388-4793-94ef-fcaa4cf6c353
spec: {}
status:
  clusterID: 67e7c5b3-c593-4f79-a558-35e6c6a4d481
  clusterName: df1690ff-d89a-4079-a333-9e9ea8ae35db-antrea-test-cluster-default-antrea
  conditions:
  - lastTransitionTime: "2026-03-09T09:35:27Z"
    message: Success.
    observedGeneration: 1
    reason: RealizationSuccess
    status: "True"
    type: Realized
  nsxManagers:
  - 10.161.245.125:443
  phase: realized
  proxyEndpoints: {}
  reason: Success
  secrets:
  - name: cluster-default-antrea-nsx-cert
    namespace: antrea-test
  vpcPath: /orgs/default/projects/df1690ff-d89a-4079-a333-9e9ea8ae35db/vpcs/antrea-test-default-vpc

For a new NSXServiceAccount CR, clusterName is generated with _ instead of - as the delimiter between the cluster ID, namespace, and name, so that it is unique:
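The behaviour shown by the two test outputs (an already-recorded name is kept, a new CR gets the "_"-joined name) could be expressed roughly as follows. This is a sketch under assumptions: the nsxServiceAccount struct and clusterNameFor function are hypothetical stand-ins, not the types or code in this patch.

```go
package main

import "fmt"

// nsxServiceAccount is a trimmed-down, hypothetical stand-in for the CR.
type nsxServiceAccount struct {
	Namespace         string
	Name              string
	StatusClusterName string // mirrors status.clusterName; empty for a new CR
}

// clusterNameFor reuses a name that was already recorded in status and
// only falls back to the "_"-joined pattern when none exists yet.
func clusterNameFor(supervisorID string, sa nsxServiceAccount) string {
	if sa.StatusClusterName != "" {
		return sa.StatusClusterName
	}
	return fmt.Sprintf("%s_%s_%s", supervisorID, sa.Namespace, sa.Name)
}

func main() {
	sv := "df1690ff-d89a-4079-a333-9e9ea8ae35db"
	existing := nsxServiceAccount{
		Namespace:         "antrea-test",
		Name:              "cluster-default-antrea",
		StatusClusterName: sv + "-antrea-test-cluster-default-antrea",
	}
	fresh := nsxServiceAccount{Namespace: "antrea-test", Name: "cluster-default-antrea"}
	fmt.Println(clusterNameFor(sv, existing))
	fmt.Println(clusterNameFor(sv, fresh))
}
```

Keying the decision on the stored status value is one simple way to leave existing cluster-control-plane names untouched; the patch itself may decide differently.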

apiVersion: nsx.vmware.com/v1alpha1
kind: NSXServiceAccount
metadata:
  creationTimestamp: "2026-03-09T09:56:29Z"
  finalizers:
  - nsxserviceaccount.nsx.vmware.com/finalizer
  generation: 1
  name: cluster-default-antrea
  namespace: antrea-test
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta2
    kind: Cluster
    name: cluster-default
    uid: 2d90023f-db42-4d50-80cb-37016a4400bc
  resourceVersion: "11589772"
  uid: df4da937-eaba-49f5-9240-0c5e1d1a1744
spec: {}
status:
  clusterID: 45282b65-6320-460b-bb35-19c4233a6035
  clusterName: df1690ff-d89a-4079-a333-9e9ea8ae35db_antrea-test_cluster-default-antrea
  conditions:
  - lastTransitionTime: "2026-03-09T09:56:41Z"
    message: Success.
    observedGeneration: 1
    reason: RealizationSuccess
    status: "True"
    type: Realized
  nsxManagers:
  - 10.161.245.125:443
  phase: realized
  proxyEndpoints: {}
  reason: Success
  secrets:
  - name: cluster-default-antrea-nsx-cert
    namespace: antrea-test
  vpcPath: /orgs/default/projects/df1690ff-d89a-4079-a333-9e9ea8ae35db/vpcs/antrea-test-default-vpc

@zhengxiexie
Contributor

Can one of the admins verify this patch?

@codecov-commenter
codecov-commenter commented Mar 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.76%. Comparing base (1e42bf1) to head (33063fe).

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #1386   +/-   ##
=======================================
  Coverage   76.76%   76.76%           
=======================================
  Files         151      151           
  Lines       21313    21315    +2     
=======================================
+ Hits        16360    16362    +2     
  Misses       3784     3784           
  Partials     1169     1169           
Flag Coverage Δ
unit-tests 76.76% <100.00%> (+<0.01%) ⬆️
Files with missing lines Coverage Δ
pkg/nsx/services/nsxserviceaccount/cluster.go 80.70% <100.00%> (+0.08%) ⬆️

@Atish-iaf Atish-iaf force-pushed the unique-ccp-name branch 2 times, most recently from 0fe343c to cc434c2 Compare March 10, 2026 06:15
@Atish-iaf
Contributor Author

Hi @edwardbadboy @liu4480
Could you please review this patch?
Thanks!

NSXClient: &nsx.Client{
	NsxConfig: &config.NSXOperatorConfig{
		CoeConfig: &config.CoeConfig{
			Cluster: "k8scl-one:test",

@edwardbadboy I just recalled that we replace ":" with "_" in the normalized name, which could cause the same kind of ambiguity. Should we replace it with something else, for example "--" or "__"?

Contributor Author

@Atish-iaf Atish-iaf Mar 10, 2026


Does ":" ever appear in NSXConfig.CoeConfig.Cluster? It is the supervisor_id, so it seems not.
If so, the cluster name never contains ":" when it is normalized, and there is nothing to replace.


util.NormalizeId converts ":" to "_". In our use case, the supervisor cluster ID is currently a UUID, so it doesn't contain any ":". Namespace and cluster names cannot contain ":" either, so this is fine.

Besides, to create a conflict you have to control both parts of the name. For example, before this patch, when namespace and cluster names were joined by "-", you could do:

  • aaa-bbb + ccc -> aaa-bbb-ccc
  • aaa + bbb-ccc -> aaa-bbb-ccc

You have to construct both the namespace name and the cluster name.

The supervisor ID is auto-generated and not controlled by the customer, so this is fine. For example, you cannot do:

  • SV ID = aaa:bbb, namespace = ccc -> aaa_bbb_ccc
  • SV ID = aaa, namespace = bbb-ccc -> aaa_bbb_ccc
    This is because customers have no control over SV IDs. If one SV ID has ":", then all SV IDs should have ":" in the same place.

util.NormalizeId may convert ":" to "_" because the SV name (not ID) can look like "domain-c10:uuid". Perhaps in previous releases WCP configured the NCP ConfigMap with this name.
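To make the ":" handling concrete, here is a minimal sketch of the replacement being discussed. normalizeColon is a hypothetical stand-in that only performs the ":" to "_" substitution; the real util.NormalizeId may do more than this.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeColon is a hypothetical stand-in for the ":" -> "_" step only.
// It is not the implementation of util.NormalizeId.
func normalizeColon(id string) string {
	return strings.ReplaceAll(id, ":", "_")
}

func main() {
	// A UUID-style supervisor ID has no ":" and passes through unchanged.
	fmt.Println(normalizeColon("df1690ff-d89a-4079-a333-9e9ea8ae35db"))

	// A name of the "domain-c10:uuid" shape, or the test fixture value
	// "k8scl-one:test", gains an "_" where the ":" was.
	fmt.Println(normalizeColon("domain-c10:uuid"))
	fmt.Println(normalizeColon("k8scl-one:test"))
}
```

Since the supervisor ID is a UUID today, the substitution is a no-op for it, which is why the extra "_" it could introduce is not a practical concern in this thread.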

@Atish-iaf Atish-iaf changed the title from "Improve uniqueness of ClusterClontrolPlane" to "Improve uniqueness of ClusterClontrolPlane name" Mar 16, 2026
@edwardbadboy edwardbadboy requested a review from andrew-su March 18, 2026 07:49
func (s *NSXServiceAccountService) DeleteNSXServiceAccount(ctx context.Context, namespacedName types.NamespacedName, uid types.UID) error {
	isDeleteSecret := false
	nsxsa := &v1alpha1.NSXServiceAccount{}
	if err := s.Client.Get(ctx, namespacedName, nsxsa); err != nil {


If DeleteNSXServiceAccount is called by NSXServiceAccountReconciler.garbageCollector, the NSXServiceAccount object identified by namespacedName may not exist. In that case I expect this err != nil code path to be hit, and *nsxsa may still be a zero value (v1alpha1.NSXServiceAccount{}) after the Client.Get call.

The later s.getClusterName(nsxsa) call should be prepared for this situation.

This should not happen in the usual case, because nsx-operator adds a finalizer to the NSXSA, so the NSXSA CR should not be deleted without deregistering. It can happen when NSX is backed up and restored: the NSXSA is created, NSX is backed up, the NSXSA is deleted, and then NSX is restored. During garbage collection, the operator sees that the cluster-control-plane resource exists but the NSXSA doesn't, and DeleteNSXServiceAccount is called with a namespacedName pointing to this non-existent NSXSA.
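One possible way to prepare for the missing-CR case is to stop relying on the stored name when it is empty and try both naming patterns instead. The sketch below only illustrates that idea; candidateClusterNames is a hypothetical helper, and nothing here is taken from the updated patch.

```go
package main

import "fmt"

// candidateClusterNames returns the names worth trying when locating the
// NSX-side resource. storedName mirrors status.clusterName and is empty
// when the CR could not be fetched (zero-value object).
// This helper is hypothetical.
func candidateClusterNames(supervisorID, namespace, name, storedName string) []string {
	if storedName != "" {
		return []string{storedName}
	}
	// Without a CR there is no record of which pattern was used,
	// so consider the new "_" form first and the legacy "-" form second.
	return []string{
		fmt.Sprintf("%s_%s_%s", supervisorID, namespace, name),
		fmt.Sprintf("%s-%s-%s", supervisorID, namespace, name),
	}
}

func main() {
	// Zero-value CR: storedName is empty, so both patterns are returned.
	for _, n := range candidateClusterNames("sv", "antrea-test", "cluster-default-antrea", "") {
		fmt.Println(n)
	}
}
```

Trying both forms costs an extra lookup during garbage collection but avoids leaving an orphaned resource behind when the CR is gone.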

Contributor Author


Thanks @edwardbadboy, I updated the patch to handle this situation. Please review again.

@edwardbadboy edwardbadboy left a comment


There is a typo in PR title and commit message title: ClusterClontrolPlane -> ClusterControlPlane

Signed-off-by: Kumar Atish <kumar.atish@broadcom.com>
@Atish-iaf Atish-iaf changed the title from "Improve uniqueness of ClusterClontrolPlane name" to "Improve uniqueness of ClusterControlPlane name" Mar 20, 2026

5 participants