Allow sidecar server to reload TLS certificates#607
Allow sidecar server to reload TLS certificates#607pierDipi wants to merge 3 commits intollm-d:mainfrom
Conversation
Enables TLS certificates to be rotated without restarting sidecar and vLLM deployments. Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
|
/lgtm |
Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
There was a problem hiding this comment.
Pull request overview
This PR moves TLS certificate handling into the sidecar proxy server and adds support for reloading TLS certificates from disk so sidecar/vLLM deployments can rotate certs without restarts.
Changes:
- Add
Config.CertPathandConfig.SecureServing, and remove passing a*tls.CertificateintoServer.Start(). - Configure
http.Server.TLSConfigto useGetCertificate, optionally backed by a cert reloader whenCertPathis set. - Update sidecar main and proxy tests to use the new config-driven TLS setup.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| cmd/pd-sidecar/main.go | Moves TLS config inputs into proxy.Config and updates Start() call signature. |
| pkg/sidecar/proxy/proxy.go | Extends Config with TLS fields and updates Start()/data-parallel startup signatures. |
| pkg/sidecar/proxy/proxy_helpers.go | Implements TLS cert loading/self-signed fallback and hooks in cert reloading via GetCertificate. |
| pkg/sidecar/proxy/data_parallel.go | Updates data-parallel startup path to no longer pass a cert into startHTTP(). |
| pkg/sidecar/proxy/proxy_test.go | Updates tests to enable TLS via config rather than injecting a cert. |
| pkg/sidecar/proxy/data_parallel_test.go | Updates test call sites for the new startDataParallel() signature. |
| pkg/sidecar/proxy/connector_test.go | Updates Start() call sites for the new signature. |
| pkg/sidecar/proxy/connector_nixlv2_test.go | Updates Start() call sites for the new signature. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| clone.setConnector() | ||
|
|
||
| return clone.startHTTP(ctx, cert) | ||
| return clone.startHTTP(ctx) |
There was a problem hiding this comment.
Data-parallel clone servers call startHTTP(), which now derives TLS settings (and connector selection) from s.config, but Clone() does not copy the config field. As a result, cloned servers will run with zero-value config (e.g., SecureServing false / CertPath empty), so TLS cert reloading (and potentially connector behavior) won’t apply to the data-parallel proxy ports. Consider copying config (and any other required fields) in Clone(), or explicitly setting clone.config = s.config before calling clone.startHTTP().
| tempCert, err = tls.LoadX509KeyPair(s.config.CertPath+"/tls.crt", s.config.CertPath+"/tls.key") | ||
| } else { | ||
| tempCert, err = CreateSelfSignedTLSCertificate() | ||
| } | ||
| if err != nil { | ||
| return fmt.Errorf("failed to create TLS certificate: %w", err) | ||
| } |
There was a problem hiding this comment.
The error message "failed to create TLS certificate" is used for both loading a keypair from CertPath and generating a self-signed cert. It would be more actionable to distinguish these cases (e.g., include the cert/key filenames when LoadX509KeyPair fails, and a separate message for self-signed generation failure).
| tempCert, err = tls.LoadX509KeyPair(s.config.CertPath+"/tls.crt", s.config.CertPath+"/tls.key") | |
| } else { | |
| tempCert, err = CreateSelfSignedTLSCertificate() | |
| } | |
| if err != nil { | |
| return fmt.Errorf("failed to create TLS certificate: %w", err) | |
| } | |
| certFile := s.config.CertPath + "/tls.crt" | |
| keyFile := s.config.CertPath + "/tls.key" | |
| tempCert, err = tls.LoadX509KeyPair(certFile, keyFile) | |
| if err != nil { | |
| return fmt.Errorf("failed to load TLS key pair from cert %q and key %q: %w", certFile, keyFile, err) | |
| } | |
| } else { | |
| tempCert, err = CreateSelfSignedTLSCertificate() | |
| if err != nil { | |
| return fmt.Errorf("failed to generate self-signed TLS certificate: %w", err) | |
| } | |
| } |
| if s.config.CertPath != "" { | ||
| reloader, err := common.NewCertReloader(ctx, s.config.CertPath, cert) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to start reloader: %w", err) | ||
| } |
There was a problem hiding this comment.
TLS certificate reloading via common.NewCertReloader is newly introduced but doesn’t appear to be covered by tests. Adding an integration/unit test that starts the server with CertPath pointing to a temp dir, rotates tls.crt/tls.key, and verifies a new TLS handshake presents the updated cert would help prevent regressions.
Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
Enables TLS certificates to be rotated without restarting sidecar and vLLM deployments.