Sign the gateway's certificate with the Dino CA#185

Closed
emilienbev wants to merge 2 commits into master from SignGatewayDino

@emilienbev emilienbev commented Jun 4, 2026

Contributor

cbdino certificates get-gateway-ca used to return the gateway's leaf certificate, which wasn't signed by the DinoRoot and therefore broke chain validation.

This PR makes it so the SDK is presented with both the gateway's leaf certificate and its CA when connecting, so that the chain can be properly validated up to the root, and so that get-gateway-ca properly returns the gateway's CA.

@emilienbev emilienbev marked this pull request as ready for review June 8, 2026 12:32
@emilienbev emilienbev marked this pull request as draft June 8, 2026 15:29

brett19 commented Jun 8, 2026

Collaborator

I'm not sure what the motivating factor is for implementing this. Historically, this was not implemented because there is a flaw in the way the operator does automated management of certificates: whatever secret you install for CAO to use for gateway ends up being used as-is, leading to all CNG instances having the same certificate, but this doesn't correctly manage the need for gateway certificate hostnames to match the individual pod names. Instead, we've allowed the gateway services themselves to 'float' with their self-signed certificates, and then leveraged the gateway load balancers to serve a verifiable certificate.

On a related note, there is actually an issue with how the load balancers were set up, making it impossible to do full end-to-end certificate verification until the new ROSA cluster is used which enables a new form of load balancer configuration.
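
To illustrate the hostname problem described above: a certificate shared by every gateway pod can only list a fixed set of names, so a client connecting with a different pod's hostname fails name verification. The following is a minimal Go sketch, not code from cbdinocluster or CAO. The helper `makeSelfSigned` and the hostnames `gateway-0.example.test` and `gateway-1.example.test` are hypothetical.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// makeSelfSigned is a hypothetical helper that returns a parsed self-signed
// certificate whose DNS SANs are exactly dnsNames.
func makeSelfSigned(dnsNames ...string) (*x509.Certificate, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "shared-gateway-cert"},
		DNSNames:     dnsNames,
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	// Using the template as its own parent produces a self-signed certificate.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, pub, priv)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	// One certificate, reused by every pod, that only names the first pod.
	cert, err := makeSelfSigned("gateway-0.example.test")
	if err != nil {
		panic(err)
	}
	for _, host := range []string{"gateway-0.example.test", "gateway-1.example.test"} {
		// VerifyHostname returns nil on a match and an error otherwise.
		fmt.Printf("%s: %v\n", host, cert.VerifyHostname(host))
	}
}
```

Only the name listed in the certificate verifies; the second pod's hostname is rejected even though it serves the same certificate.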

emilienbev commented Jun 8, 2026

Contributor Author

My motivation was that .NET doesn't have a way to connect to CNG without bypassing certificate chain validation entirely. We do RemoteCertificateValidationCallback = (_, _, _, _) => true for the tests/performer and it makes me feel uneasy.

There is actually a bug in .NET where the intermediate certificates presented by the server are not being added to the chain when validating, which I found out about with this PR. It served at least to that extent 😄

> whatever secret you install for CAO to use for gateway ends up being used as-is, leading to all CNG instances having the same certificate, but this doesn't correctly manage the need for gateway certificate hostnames to match the individual pod names

Right, does that mean even if I were putting the SDK inside the cluster and connecting with the pod's hostname I'd still get RemoteCertificateNameMismatch errors? (not that I would but out of curiosity).

Happy to drop the PR. Is there something else I could do or is waiting for the ROSA cluster the logical option here?

brett19 commented Jun 8, 2026

Collaborator

> Right, does that mean even if I were putting the SDK inside the cluster and connecting with the pod's hostname I'd still get RemoteCertificateNameMismatch errors?

This is the case, yes, and I believe this would actually be testing the incorrect behaviour. In this case you would be testing that the SDK directly accepts the leaf certificate, but this should actually fail, as the SDK is meant to be configured with a CA that can validate the leaf certificate instead (generally, TLS libraries require you to go out of your way to directly accept a certificate, since it's not meant to be set up that way, and it's not how an SDK should handle certs). The right option is to wait for the switch to the new ROSA cluster and alter the cbdinocluster testing to leverage the new Gateway (with a capital G, as in the resource type in K8S) option that's available (see 233ae4c and the related commits around it). This configuration is the only one I was able to get to work end-to-end with the SDKs that also tests "the right things".
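
As a rough illustration of "configured with a CA" rather than accepting a leaf directly or disabling validation, here is a small Go sketch of a client TLS configuration that trusts only a supplied CA bundle. It is a Go analogue for discussion, not the .NET SDK's code. The function name `newTLSConfig` and the server name `gateway.example.test` are hypothetical.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// newTLSConfig is a hypothetical helper that builds a client TLS config
// trusting only the CA certificates found in caPEM. Chain and hostname
// validation stay enabled because InsecureSkipVerify is left at its
// default of false.
func newTLSConfig(caPEM []byte, serverName string) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificates found in PEM input")
	}
	return &tls.Config{
		RootCAs:    pool,
		ServerName: serverName,
		MinVersion: tls.VersionTLS12,
	}, nil
}

func main() {
	// Input that is not PEM is rejected instead of silently trusting nothing.
	_, err := newTLSConfig([]byte("not a pem"), "gateway.example.test")
	fmt.Println("error for invalid input:", err)
}
```

In a test setup, the `caPEM` argument would be whatever CA the tooling hands out (for example, the output of get-gateway-ca), assuming the endpoint presents a chain that leads to that CA.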

Copilot AI left a comment

Pull request overview

This PR updates the CAO deployer to provision and serve a DinoRoot-signed certificate chain for the Cloud Native Gateway (CNG), fixing TLS chain validation and making cbdino certificates get-gateway-ca return the actual Gateway CA (intermediate) instead of the gateway leaf.

Changes:

  • Provision a Dino-signed gateway TLS secret and configure CAO CNG to use it (serverSecretName).
  • Regenerate and return the deterministic Gateway CA for get-gateway-ca rather than reading the operator’s self-signed leaf cert secret.
  • Add unit tests validating leaf → gateway CA → DinoRoot chain correctness.
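
The leaf → gateway CA → root validation mentioned in the list above can be sketched with Go's standard library alone. This is an illustrative sketch and not the PR's test code: it does not use the dinocerts package, and the names `issue`, `buildChain`, `verify`, `example-root`, and `gateway.example.test` are hypothetical. It also shows why the intermediate matters: when the gateway CA is missing from the intermediates pool, the leaf cannot be linked to the root.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

type chain struct {
	root, gatewayCA, leaf *x509.Certificate
}

// issue signs tmpl with parentKey (self-signed when parent is nil) and
// returns the parsed certificate together with its new private key.
func issue(tmpl, parent *x509.Certificate, parentKey ed25519.PrivateKey) (*x509.Certificate, ed25519.PrivateKey, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl.NotBefore = time.Now().Add(-time.Hour)
	tmpl.NotAfter = time.Now().Add(24 * time.Hour)
	if parent == nil {
		parent, parentKey = tmpl, priv
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, pub, parentKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, priv, err
}

// caTemplate returns a template for a CA certificate.
func caTemplate(serial int64, cn string) *x509.Certificate {
	return &x509.Certificate{
		SerialNumber:          big.NewInt(serial),
		Subject:               pkix.Name{CommonName: cn},
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
}

// buildChain creates root -> intermediate (caName) -> leaf (host).
func buildChain(caName, host string) (*chain, error) {
	root, rootKey, err := issue(caTemplate(1, "example-root"), nil, nil)
	if err != nil {
		return nil, err
	}
	gw, gwKey, err := issue(caTemplate(2, caName), root, rootKey)
	if err != nil {
		return nil, err
	}
	leaf, _, err := issue(&x509.Certificate{
		SerialNumber: big.NewInt(3),
		Subject:      pkix.Name{CommonName: host},
		DNSNames:     []string{host},
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}, gw, gwKey)
	if err != nil {
		return nil, err
	}
	return &chain{root: root, gatewayCA: gw, leaf: leaf}, nil
}

// verify validates the leaf for host against the root, optionally
// supplying the gateway CA as an intermediate.
func verify(c *chain, host string, withIntermediate bool) error {
	roots, inter := x509.NewCertPool(), x509.NewCertPool()
	roots.AddCert(c.root)
	if withIntermediate {
		inter.AddCert(c.gatewayCA)
	}
	_, err := c.leaf.Verify(x509.VerifyOptions{DNSName: host, Roots: roots, Intermediates: inter})
	return err
}

func main() {
	c, err := buildChain("gateway-abcdef12", "gateway.example.test")
	if err != nil {
		panic(err)
	}
	fmt.Println("with intermediate:   ", verify(c, "gateway.example.test", true))
	fmt.Println("without intermediate:", verify(c, "gateway.example.test", false))
}
```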

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File | Description
deployment/caodeploy/deployer.go | Wires the Dino-signed gateway TLS secret into the cluster spec and updates GetGatewayCertificate behavior.
deployment/caodeploy/deployer_certs.go | Adds deterministic gateway CA + TLS secret generation and secret provisioning logic.
deployment/caodeploy/deployer_certs_test.go | Adds tests verifying the generated cert chain and get-gateway-ca semantics.

Comment on lines +21 to +30
func getGatewayDinoCA(clusterID string) (*dinocerts.CertAuthority, []byte, error) {
	rootCa, err := dinocerts.GetRootCertAuthority()
	if err != nil {
		return nil, nil, errors.Wrap(err, "failed to get root dino ca")
	}

	gatewayCa, err := rootCa.MakeIntermediaryCA("gateway-" + clusterID[:8])
	if err != nil {
		return nil, nil, errors.Wrap(err, "failed to make gateway dino ca")
	}

Comment on lines +1034 to +1041
	// The gateway CA is deterministic, so regenerate it instead of reading the
	// secret back from the cluster.
	gatewayCa, _, err := getGatewayDinoCA(clusterID)
	if err != nil {
		return "", err
	}

	secret, err := d.client.GetSecret(ctx, namespaceName, "couchbase-cloud-native-gateway-self-signed-secret-cluster")
	if err != nil {
		return "", errors.Wrap(err, "failed to get secret")
	}

	secretData := secret.Data["tls.crt"]
	if len(secretData) == 0 {
		return "", errors.New("secret data was unexpectedly empty")
		return "", errors.Wrap(err, "failed to get gateway CA")
	}

	return string(secretData), nil
	return string(gatewayCa.CertPem), nil
@emilienbev emilienbev closed this Jun 10, 2026