Allow specifying certificates for TLS communication between KEDA and external scalers #7013
Conversation
Thank you for your contribution! 🙏 Please understand that we will do our best to review your PR and give you feedback as soon as possible, but please bear with us if it takes a little longer than expected. While you are waiting, make sure to:
Once the initial tests are successful, a KEDA member will ensure that the e2e tests are run. Once the e2e tests have been successfully completed, the PR may be merged at a later date. Please be patient. Learn more about our contribution guide.
I'm not sure we should read things from the file system as part of scaler code. It means that if someone needs to add more certs, the operator has to be restarted. I'd prefer some way where we pull the secret from the k8s API directly and use it, without going via the file system.
Yep, that's the goal: allowing cert-manager to update the secrets, which triggers their remounting.
👍 this is a valid point. Adding a new secret will require a pod restart; however, it's still possible to use the old way and provide the certs as inline strings. So the pod restart will be needed only if users care about cert rotation, which they should. I mean, the cert rotation itself doesn't trigger the operator's pod restart, but adding a new secret or deleting an old one that changes the podSpec will have this effect. Still, benefits > drawbacks, given that this is an optional feature.
Can you point me to the code? I was able to find this & this. In both cases, the cert, key and CA are read from the ScaledObject's trigger metadata -> they can't be rotated, and putting this kind of info into SOs is imho a bad practice.
Cough, cough, also the kafka scaler :) I am a KEDA noob, but isn't this code evaluated by the KEDA operator? The KEDA operator code already does this. I don't know, polling a secret or having a watcher/informer for a k8s secret just in case it gets updated seems less secure to me than mounting these into a pod and reusing the infra that you've introduced. (I am trying to avoid get/list/watch RBAC for k8s secrets if possible.) If there isn't demand for such a feature, I can close the PR, no hard feelings :) I was just setting up a demo with OTel pipelines where all the certs are rotated and this piece of the puzzle was missing, so once the cert for my external scaler was rotated, KEDA stopped talking to it. So I opened this draft PR.
/run-e2e external
That metadata struct isn't the … (keda/pkg/scalers/apache_kafka_scaler.go, lines 75 to 80 at 675be9d)
You are right, but this is because the kerberos SDK only supports reading values from the filesystem, so the scaler code saves the info from the k8s API into a local temporary file to provide it to the kerberos pkg. The scaler docs also explain that this must be enabled because the filesystem is read-only -> https://keda.sh/docs/2.17/scalers/apache-kafka/#your-kafka-cluster-turns-on-saslgssapi-auth-without-tls
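For illustration only (a hypothetical helper, not the actual kafka scaler code), that workaround boils down to writing material pulled from the Kubernetes API into a temporary file so a library that only accepts file paths can consume it:

```go
package example

import "os"

// writeToTempFile is a hypothetical helper illustrating the workaround: secret
// material read from the Kubernetes API (e.g. a kerberos keytab) is persisted
// to a temporary file so that an SDK which only accepts file paths can use it.
func writeToTempFile(data []byte, pattern string) (string, error) {
	f, err := os.CreateTemp("", pattern) // requires a writable filesystem
	if err != nil {
		return "", err
	}
	defer f.Close()
	if _, err := f.Write(data); err != nil {
		return "", err
	}
	return f.Name(), nil
}
```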
My concern with using the filesystem is that it works quite well when there are a few certificates or teams, but I'm afraid about large environments where restarting KEDA every time a new cert has to be added can be disruptive in terms of service level, and the vast majority of the scalers would be stopped just because of this.
When your cert was rotated, was the k8s secret that KEDA uses rotated too? If yes, KEDA should have failed pulling metrics, and during the scaler regeneration the new cert should have been pulled and used. Just to be clear, I'm not trying to block this: if this happened to you, it can happen again to others, and we welcome any improvement (and fixing this problem is nice!), but I'm afraid about the side effects of making KEDA modifications part of operation teams' daily routine. Maybe other @kedacore/keda-core-contributors @kedacore/keda-core-maintainers have any thoughts that they want to share 😄
I commented in the original PR that added Temporal support to KEDA. I think being able to handle certificates on the filesystem that automatically rotate is pretty important for a client using mTLS.

The problem: you can't revoke mTLS certs (CRLs aren't usually used). If you use a centrally trusted CA for mTLS, which is probably the case for people picking mTLS, and you have the server trust certs signed by that CA, it will trust all non-expired certs from that CA. Once a cert is signed by a CA and trusted, it can't be revoked like an API key; you'd have to stop trusting the CA. As a result, if you are using a central CA, the only reasonable thing to do is have short-lived certs that rotate frequently.

There are different mechanisms to do that. For example, you could have automation update a Kubernetes Secret as the cert is rotated. But how do you expose that to the application? The application could use the Kubernetes API to read the Secret, but that would be a pretty strange thing for general applications to do (it makes sense for KEDA, which is all about Kubernetes). So instead you mount the Secret as a directory. You can't expose the Secret as env vars, because those don't get live-updated, but mounted Secrets do get live-updated. We actually don't use a Secret, and instead have a sidecar container that writes updated certs/keys to a volume that is shared with the main application containers. Other mechanisms like csi-driver-spiffe also live-update certs/keys on the filesystem.

The application then needs to start using the newly rotated certs/keys without restarting, the next time it needs to connect (it doesn't need to interrupt existing connections, only pick up the new certs/keys for the next connection). That functionality is built into Go's `crypto/tls` (for example via the `tls.Config` certificate callbacks).
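As a minimal sketch of that last point (the file paths are assumptions, not KEDA's actual mount layout): a Go client can defer loading the cert/key until each handshake, so whatever mechanism rotates the files on disk is picked up on the next connection without a restart.

```go
package example

import (
	"crypto/tls"
	"crypto/x509"
	"os"
)

// newClientTLSConfig re-reads the client cert/key on every TLS handshake via the
// GetClientCertificate callback, so certs rotated on disk are used for the next
// connection without restarting the process.
func newClientTLSConfig(certFile, keyFile, caFile string) (*tls.Config, error) {
	caPEM, err := os.ReadFile(caFile)
	if err != nil {
		return nil, err
	}
	pool := x509.NewCertPool()
	pool.AppendCertsFromPEM(caPEM)

	return &tls.Config{
		// Note: RootCAs is fixed here; rotating the CA bundle would need a similar re-read.
		RootCAs: pool,
		GetClientCertificate: func(*tls.CertificateRequestInfo) (*tls.Certificate, error) {
			cert, err := tls.LoadX509KeyPair(certFile, keyFile)
			if err != nil {
				return nil, err
			}
			return &cert, nil
		},
	}, nil
}
```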
This is probably me being naive, because I don't know KEDA's internal architecture. But why would the operator have to restart? If the operator can read values from a Secret live without restarting, and those values contain API keys or certs/keys, why can't the values contain paths to where the certs/keys are on the filesystem? Then the automation that people already have for live-updating certs/keys on the filesystem would be responsible for making sure they exist and are rotated.
IDK exactly how that csi-driver works, but let's say you need to add another CA because you have a large cluster supporting multiple teams: do you need to change the pod spec for it? If yes (like it happens with a configMap or secret), the pod has to be restarted.
There is nothing CA specific in the pod spec when using csi-driver-spiffe.
Do I need to restart the pod if I need to request a new certificate with a different SAN or signed by a different CA?
Certs for triggers that use the external scaler pattern can be provided as file paths inside the ScaledObject's definition. These should be mounted and accessible by KEDA. It uses the prepared infrastructure that reloads the certs if there is a change on the FS (fsnotify).
Ideally, I'd rename the existing fields from `caCert`, `tlsClientCert` and `tlsClientKey` to `caCertPem`, `tlsClientCertPem` and `tlsClientKeyPem`, so it's the same nomenclature as in here, but this would break backward compatibility. So there are 3 new fields:

- `caCertFile`
- `tlsClientCertFile`
- `tlsClientKeyFile`

Also, if both sets of params are specified, the in-memory ones (`caCert`, …) win.

TODO:
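For illustration, here is a hypothetical sketch of that precedence (names and signatures are illustrative, not the PR's actual implementation): each value is resolved from the inline PEM field first, falling back to the new `*File` path, and the `tls.Config` is built from whatever was found.

```go
package example

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"os"
)

// resolvePEM prefers the inline PEM value (caCert, tlsClientCert, tlsClientKey)
// and falls back to reading the corresponding *File path from the mounted volume.
func resolvePEM(inline, filePath string) ([]byte, error) {
	if inline != "" {
		return []byte(inline), nil
	}
	if filePath != "" {
		return os.ReadFile(filePath)
	}
	return nil, nil
}

func buildTLSConfig(caCert, caCertFile, clientCert, clientCertFile, clientKey, clientKeyFile string) (*tls.Config, error) {
	cfg := &tls.Config{}

	caPEM, err := resolvePEM(caCert, caCertFile)
	if err != nil {
		return nil, err
	}
	if len(caPEM) > 0 {
		pool := x509.NewCertPool()
		if !pool.AppendCertsFromPEM(caPEM) {
			return nil, errors.New("failed to parse CA certificate")
		}
		cfg.RootCAs = pool
	}

	certPEM, err := resolvePEM(clientCert, clientCertFile)
	if err != nil {
		return nil, err
	}
	keyPEM, err := resolvePEM(clientKey, clientKeyFile)
	if err != nil {
		return nil, err
	}
	if len(certPEM) > 0 && len(keyPEM) > 0 {
		cert, err := tls.X509KeyPair(certPEM, keyPEM)
		if err != nil {
			return nil, err
		}
		cfg.Certificates = []tls.Certificate{cert}
	}
	return cfg, nil
}
```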