Sign the gateway's certificate with the Dino CA #185
I'm not sure what the motivating factor is for implementing this, but historically it was left unimplemented because of a flaw in how the operator automates certificate management: whatever secret you install for CAO to use for the gateway is used as-is, so every CNG instance ends up with the same certificate, and nothing handles the need for each gateway certificate's hostnames to match the individual pod names. Instead, we've allowed the gateway services themselves to 'float' with their self-signed certificates, and rely on the gateway load balancers to serve a verifiable certificate. On a related note, there is also an issue with how the load balancers were set up that makes full end-to-end certificate verification impossible until the new ROSA cluster is in use, since it enables a new form of load balancer configuration.
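To make the hostname problem concrete, here is a minimal Go sketch using only the standard library. It issues a certificate whose single SAN names one pod and then checks it against two pod hostnames. The pod names and the `newPodCert` helper are hypothetical and are not taken from CAO or cbdinocluster.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newPodCert creates a self-signed certificate whose only SAN is dnsName.
// This is a hypothetical helper for illustration, not part of CAO.
func newPodCert(dnsName string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: dnsName},
		DNSNames:     []string{dnsName},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	// One certificate shared by every instance, but issued for a single pod name.
	cert, err := newPodCert("cng-0.example.svc")
	if err != nil {
		panic(err)
	}
	// VerifyHostname returns nil on a match and an error otherwise.
	fmt.Println("cng-0:", cert.VerifyHostname("cng-0.example.svc"))
	fmt.Println("cng-1:", cert.VerifyHostname("cng-1.example.svc"))
}
```

The second check returns an error, which is what a client would hit when a certificate issued for one pod is served by another.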
My motivation was that .NET doesn't have a way to connect to CNG without bypassing certificate chain validation entirely. We do There is actually a bug in .NET where the intermediate certificates presented by the server are not being added to the chain when validating, which I found out about with this PR. It served at least to that extent 😄
Right, does that mean even if I were putting the SDK inside the cluster and connecting with the pod's hostname I'd still get Happy to drop the PR. Is there something else I could do or is waiting for the ROSA cluster the logical option here? |
This is the case, yes, and I believe this would actually be testing the incorrect behaviour: you would be testing that the SDK directly accepts the leaf certificate, which should actually fail, because the SDK is meant to be configured with a CA that can validate the leaf certificate instead. (TLS libraries generally require you to go out of your way to directly accept a certificate, since it's not meant to be set up that way, and it's not how an SDK should handle certs.) The right option is to wait for the switch to the new ROSA cluster and alter the cbdinocluster testing to leverage the new Gateway (with a capital G, as in the resource type in K8S) option that's available (see 233ae4c and the related commits around it). This configuration is the only one I was able to get to work end-to-end with the SDKs that also tests "the right things".
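For reference, a minimal Go sketch of what "configured with a CA that can validate the leaf" means in practice. It builds a throwaway root → gateway CA → leaf hierarchy with the standard library and verifies the leaf against the root only. The `chain`, `issue`, and `newChain` names and the hostname are hypothetical, and this is not how dinocerts or any SDK is implemented.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// chain holds a throwaway root -> gateway CA -> leaf hierarchy (hypothetical names).
type chain struct {
	root, gateway, leaf *x509.Certificate
}

// issue creates a certificate for name; it is self-signed when parent is nil.
func issue(name string, serial int64, isCA bool, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(serial),
		Subject:               pkix.Name{CommonName: name},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		BasicConstraintsValid: true,
		IsCA:                  isCA,
		KeyUsage:              x509.KeyUsageDigitalSignature,
	}
	if isCA {
		tmpl.KeyUsage = x509.KeyUsageCertSign
	} else {
		tmpl.DNSNames = []string{name}
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth}
	}
	if parent == nil {
		parent, parentKey = tmpl, key
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, parent, &key.PublicKey, parentKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// newChain issues a root, an intermediate gateway CA, and a leaf for host.
func newChain(host string) (*chain, error) {
	root, rootKey, err := issue("example root", 1, true, nil, nil)
	if err != nil {
		return nil, err
	}
	gw, gwKey, err := issue("example gateway CA", 2, true, root, rootKey)
	if err != nil {
		return nil, err
	}
	leaf, _, err := issue(host, 3, false, gw, gwKey)
	if err != nil {
		return nil, err
	}
	return &chain{root: root, gateway: gw, leaf: leaf}, nil
}

// verify trusts only the root; the gateway CA is supplied as an untrusted
// intermediate, as a server would present it during the handshake.
func (c *chain) verify(withIntermediate bool, host string) error {
	roots := x509.NewCertPool()
	roots.AddCert(c.root)
	intermediates := x509.NewCertPool()
	if withIntermediate {
		intermediates.AddCert(c.gateway)
	}
	_, err := c.leaf.Verify(x509.VerifyOptions{Roots: roots, Intermediates: intermediates, DNSName: host})
	return err
}

func main() {
	c, err := newChain("gateway.example.test")
	if err != nil {
		panic(err)
	}
	fmt.Println("with intermediate:   ", c.verify(true, "gateway.example.test"))
	fmt.Println("without intermediate:", c.verify(false, "gateway.example.test"))
}
```

The second call fails because the verifier cannot link the leaf to the trusted root without the intermediate, which is the same kind of failure a client sees if the intermediates presented by the server never make it into the chain being built.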
Pull request overview
This PR updates the CAO deployer to provision and serve a DinoRoot-signed certificate chain for the Cloud Native Gateway (CNG), fixing TLS chain validation and making cbdino certificates get-gateway-ca return the actual Gateway CA (intermediate) instead of the gateway leaf.
Changes:
- Provision a Dino-signed gateway TLS secret and configure CAO CNG to use it (serverSecretName).
- Regenerate and return the deterministic Gateway CA for get-gateway-ca rather than reading the operator’s self-signed leaf cert secret.
- Add unit tests validating leaf → gateway CA → DinoRoot chain correctness.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| deployment/caodeploy/deployer.go | Wires the Dino-signed gateway TLS secret into the cluster spec and updates GetGatewayCertificate behavior. |
| deployment/caodeploy/deployer_certs.go | Adds deterministic gateway CA + TLS secret generation and secret provisioning logic. |
| deployment/caodeploy/deployer_certs_test.go | Adds tests verifying the generated cert chain and get-gateway-ca semantics. |
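
The PR does not show how dinocerts makes the gateway CA deterministic. As a rough illustration of the idea only, the Go sketch below derives a CA certificate purely from a seed and a name, so regenerating it yields byte-identical output. Everything in it is an assumption made for illustration and not the dinocerts implementation: the `deriveCA` and `sameCA` helpers, the seed value, the choice of Ed25519, and the fixed validity dates.

```go
package main

import (
	"bytes"
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// deriveCA builds a self-signed CA certificate (DER) from seed and name alone.
// The key, serial number, and validity window are all fixed functions of the
// inputs, and Ed25519 signatures are deterministic, so the output is stable.
func deriveCA(seed []byte, name string) ([]byte, error) {
	sum := sha256.Sum256(append(append([]byte{}, seed...), name...))
	key := ed25519.NewKeyFromSeed(sum[:]) // a SHA-256 digest is exactly one Ed25519 seed

	tmpl := &x509.Certificate{
		SerialNumber:          new(big.Int).SetBytes(sum[:8]),
		Subject:               pkix.Name{CommonName: name},
		NotBefore:             time.Date(2020, 1, 1, 0, 0, 0, 0, time.UTC),
		NotAfter:              time.Date(2040, 1, 1, 0, 0, 0, 0, time.UTC),
		BasicConstraintsValid: true,
		IsCA:                  true,
		KeyUsage:              x509.KeyUsageCertSign,
	}
	return x509.CreateCertificate(rand.Reader, tmpl, tmpl, key.Public(), key)
}

// sameCA reports whether deriving CAs for names a and b yields identical bytes.
func sameCA(seed []byte, a, b string) (bool, error) {
	certA, err := deriveCA(seed, a)
	if err != nil {
		return false, err
	}
	certB, err := deriveCA(seed, b)
	if err != nil {
		return false, err
	}
	return bytes.Equal(certA, certB), nil
}

func main() {
	seed := []byte("example-root-seed") // placeholder value
	same, err := sameCA(seed, "gateway-abcd1234", "gateway-abcd1234")
	if err != nil {
		panic(err)
	}
	fmt.Println("regenerated CA is identical:", same)
}
```

With a scheme along these lines, a command such as get-gateway-ca can rebuild the CA from the cluster ID instead of reading a secret back from the cluster.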

    func getGatewayDinoCA(clusterID string) (*dinocerts.CertAuthority, []byte, error) {
        rootCa, err := dinocerts.GetRootCertAuthority()
        if err != nil {
            return nil, nil, errors.Wrap(err, "failed to get root dino ca")
        }

        gatewayCa, err := rootCa.MakeIntermediaryCA("gateway-" + clusterID[:8])
        if err != nil {
            return nil, nil, errors.Wrap(err, "failed to make gateway dino ca")
        }


    // The gateway CA is deterministic, so regenerate it instead of reading the
    // secret back from the cluster.
    gatewayCa, _, err := getGatewayDinoCA(clusterID)
    if err != nil {
        return "", err
    }

    secret, err := d.client.GetSecret(ctx, namespaceName, "couchbase-cloud-native-gateway-self-signed-secret-cluster")
    if err != nil {
        return "", errors.Wrap(err, "failed to get secret")
    }

    secretData := secret.Data["tls.crt"]
    if len(secretData) == 0 {
        return "", errors.New("secret data was unexpectedly empty")
        return "", errors.Wrap(err, "failed to get gateway CA")
    }

    return string(secretData), nil
    return string(gatewayCa.CertPem), nil

cbdino certificates get-gateway-ca used to return the gateway's leaf certificate, which wasn't signed by the DinoRoot, and that broke chain validation. This PR makes it so the SDK is presented with the gateway's leaf and CA when connecting, so the chain can be properly validated up to the root, and so get-gateway-ca properly returns the Gateway's CA.