feat(results-cache): Add TLS support for external MongoDB/DocumentDB connections#2226
feat(results-cache): Add TLS support for external MongoDB/DocumentDB connections#2226chandle-revans wants to merge 3 commits intoy-scope:mainfrom
Conversation
WalkthroughThis pull request adds MongoDB TLS support throughout CLP, enabling secure connections to TLS-enforcing databases like AWS DocumentDB. Changes include Python configuration models with TLS validation, TypeScript URI construction logic, Helm templates for CA certificate volume mounting, and comprehensive documentation on configuring and deploying TLS-secured results caches. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~30 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Ruff (0.15.11)components/clp-py-utils/clp_py_utils/clp_config.pyUnexpected Ruff issue shape at index 94 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/clp-py-utils/clp_py_utils/clp_config.py`:
- Around line 520-529: The MongoDB URI builder in get_uri currently injects
self.tls_ca_file unencoded into the query string, which breaks for paths with
reserved URI chars; update get_uri to percent-encode the tls_ca_file value (e.g.
using urllib.parse.quote) before appending f"tlsCAFile=..." so characters like
spaces, :, \, &, ? etc. are safely encoded; ensure you import urllib.parse.quote
at top and only encode the tls_ca_file when it is truthy, leaving other params
unchanged.
In `@components/webui/server/src/plugins/external/mongo.ts`:
- Around line 12-14: The mongoDbTlsCaFile value is concatenated raw into the
connection URI which can break the query string if it contains reserved
characters; change the append logic to URL-encode mongoDbTlsCaFile (use
encodeURIComponent or equivalent) when building the url variable so the
tlsCAFile query param is safe, and apply the same encoding pattern to the other
places mentioned (e.g., ResultsCache.get_uri and the Helm helper
clp.resultsCacheUri) for consistency.
- Around line 8-9: The settings type assertions are unsafe: remove the `as`
casts on settings.MongoDbTls and settings.MongoDbTlsCaFile and instead coerce or
validate at runtime; for mongoDbTls (variable mongoDbTls) use an explicit
default/coercion from settings.MongoDbTls (e.g. treat missing/invalid as false),
and for mongoDbTlsCaFile (variable mongoDbTlsCaFile) validate that
settings.MongoDbTlsCaFile is a string and otherwise set it to null; modify the
code in the module that references settings to perform these runtime checks
rather than using `as` to silence the compiler.
In `@docs/src/user-docs/guides-external-database.md`:
- Line 280: Update the `[helm-values]` reference so it uses a relative
repository path to the values.yaml file instead of the hard-coded GitHub URL
that points to `main`; locate the `[helm-values]` link definition and replace
the absolute GitHub URL with a relative file path (so it follows the docs build
ref), and if you need to link to the directory `tools/deployment/package-helm/`
use a GitHub URL that injects the DOCS_VAR_CLP_GIT_REF variable (not `main`) to
keep the directory-level reference version-agnostic.
- Around line 224-228: Update the AWS DocumentDB CA bundle download snippet to
instruct users to verify the file integrity and authenticity: mention including
an expected checksum (e.g., SHA-256) alongside the wget URL and show how to
verify with sha256sum (compare against the published checksum), and add a short
note that HTTPS/TLS validates the server by default so users should ensure they
fetch from the official AWS URL (or verify the checksum) before trusting the CA
bundle; reference the existing wget download line so reviewers can replace or
augment that snippet with the verification guidance.
In `@tools/deployment/package-helm/templates/_helpers.tpl`:
- Around line 446-453: The helpers clp.resultsCacheTlsVolumeMount and
clp.resultsCacheTlsVolume currently assume
.Values.clpConfig.results_cache.tlsVolume is defined; update these helper
templates to either (a) use the required function to fail fast with a clear
error (e.g. required "message" .Values.clpConfig.results_cache.tlsVolume) before
dereferencing volume.* and mount.* or (b) wrap all dereferences in an if block
that returns nothing when .Values.clpConfig.results_cache.tlsVolume is nil/empty
so the helper becomes a no-op; apply the same pattern to both helpers so callers
no longer get confusing template errors.
In `@tools/deployment/package-helm/values.yaml`:
- Line 199: The example value mount.readOnly is misleading because the template
helper clp.resultsCacheTlsVolumeMount in _helpers.tpl hardcodes readOnly: true
and ignores tlsVolume.mount.readOnly; either remove mount.readOnly from the
example in values.yaml or modify the helper clp.resultsCacheTlsVolumeMount to
honor the value by reading
.Values.clpConfig.results_cache.tlsVolume.mount.readOnly (with a default true)
so user changes to mount.readOnly take effect.
- Around line 186-200: The values.yaml entry tlsVolume lacks validation and can
produce invalid Pod specs; add validation to ensure tlsVolume has the required
shape (volume.name and mount.mountPath present, mount.subPath optional) by
either adding a values.schema.json JSON Schema entry for tlsVolume (require
properties: volume object with required name, mount object with required
mountPath, optional subPath) or adding a fail check in the Helm helpers (e.g.,
in _helpers.tpl) that inspects .Values.tlsVolume and calls fail with a clear
message if volume.name or mount.mountPath are missing; reference
clp.resultsCacheTlsVolume and clp.resultsCacheTlsVolumeMount to ensure the same
validated keys are used across templates.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3d59b046-76c0-46b9-9bed-dc0c57be5b6b
📒 Files selected for processing (14)
components/clp-py-utils/clp_py_utils/clp_config.pycomponents/clp-py-utils/tests/test_results_cache.pycomponents/webui/server/settings.jsoncomponents/webui/server/src/plugins/external/mongo.tsdocs/src/user-docs/guides-external-database.mdtools/deployment/package-helm/templates/_helpers.tpltools/deployment/package-helm/templates/configmap.yamltools/deployment/package-helm/templates/garbage-collector-deployment.yamltools/deployment/package-helm/templates/query-scheduler-deployment.yamltools/deployment/package-helm/templates/query-worker-deployment.yamltools/deployment/package-helm/templates/reducer-deployment.yamltools/deployment/package-helm/templates/results-cache-indices-creator-job.yamltools/deployment/package-helm/templates/webui-deployment.yamltools/deployment/package-helm/values.yaml
| {{- define "clp.resultsCacheTlsVolumeMount" -}} | ||
| name: {{ .Values.clpConfig.results_cache.tlsVolume.volume.name | quote }} | ||
| mountPath: {{ .Values.clpConfig.results_cache.tlsVolume.mount.mountPath | quote }} | ||
| {{- if .Values.clpConfig.results_cache.tlsVolume.mount.subPath }} | ||
| subPath: {{ .Values.clpConfig.results_cache.tlsVolume.mount.subPath | quote }} | ||
| {{- end }} | ||
| readOnly: true | ||
| {{- end }} |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Helper assumes tlsVolume is non-null; add a guard or document the precondition.
clp.resultsCacheTlsVolumeMount and clp.resultsCacheTlsVolume dereference .Values.clpConfig.results_cache.tlsVolume.volume.* / .mount.* directly. If a future caller forgets the {{- if .Values.clpConfig.results_cache.tlsVolume }} guard (or tlsVolume is set to an empty map rather than null), Helm will render confusing errors.
Consider either (a) failing fast with a clear message using required, or (b) making the helpers internally no-op when tlsVolume is unset, so call sites don't all need the guard:
♻️ Option A — fail fast with a clear message
{{- define "clp.resultsCacheTlsVolumeMount" -}}
-name: {{ .Values.clpConfig.results_cache.tlsVolume.volume.name | quote }}
-mountPath: {{ .Values.clpConfig.results_cache.tlsVolume.mount.mountPath | quote }}
+{{- $tlsVolume := .Values.clpConfig.results_cache.tlsVolume | required
+ "clpConfig.results_cache.tlsVolume must be set to use this helper" -}}
+name: {{ $tlsVolume.volume.name | quote }}
+mountPath: {{ $tlsVolume.mount.mountPath | quote }}
-{{- if .Values.clpConfig.results_cache.tlsVolume.mount.subPath }}
-subPath: {{ .Values.clpConfig.results_cache.tlsVolume.mount.subPath | quote }}
+{{- with $tlsVolume.mount.subPath }}
+subPath: {{ . | quote }}
{{- end }}
readOnly: true
{{- end }}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/templates/_helpers.tpl` around lines 446 - 453,
The helpers clp.resultsCacheTlsVolumeMount and clp.resultsCacheTlsVolume
currently assume .Values.clpConfig.results_cache.tlsVolume is defined; update
these helper templates to either (a) use the required function to fail fast with
a clear error (e.g. required "message"
.Values.clpConfig.results_cache.tlsVolume) before dereferencing volume.* and
mount.* or (b) wrap all dereferences in an if block that returns nothing when
.Values.clpConfig.results_cache.tlsVolume is nil/empty so the helper becomes a
no-op; apply the same pattern to both helpers so callers no longer get confusing
template errors.
| # Optional volume and mount to inject a TLS CA certificate into pods that connect to the | ||
| # results cache. When set, the volume is added to all pods that use the results cache | ||
| # connection (query-scheduler, query-worker, garbage-collector, reducer, webui, and | ||
| # results-cache-indices-creator). | ||
| # Example: | ||
| # tlsVolume: | ||
| # volume: | ||
| # name: results-cache-ca | ||
| # secret: | ||
| # secretName: my-docdb-ca-bundle | ||
| # mount: | ||
| # mountPath: /etc/ssl/certs/results-cache-ca.pem | ||
| # subPath: global-bundle.pem | ||
| # readOnly: true | ||
| tlsVolume: null |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider documenting/validating the required shape of tlsVolume.
tlsVolume is consumed across multiple templates (clp.resultsCacheTlsVolume, clp.resultsCacheTlsVolumeMount) and must contain exactly volume.name, mount.mountPath, and optionally mount.subPath. A malformed tlsVolume (e.g., missing volume.name) will silently render an invalid Pod spec and fail at kubectl apply.
Consider adding a values.schema.json entry (or a Helm fail check in _helpers.tpl) to catch misconfiguration earlier with a clear error message. Optional — helpful for operators adopting TLS.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/values.yaml` around lines 186 - 200, The
values.yaml entry tlsVolume lacks validation and can produce invalid Pod specs;
add validation to ensure tlsVolume has the required shape (volume.name and
mount.mountPath present, mount.subPath optional) by either adding a
values.schema.json JSON Schema entry for tlsVolume (require properties: volume
object with required name, mount object with required mountPath, optional
subPath) or adding a fail check in the Helm helpers (e.g., in _helpers.tpl) that
inspects .Values.tlsVolume and calls fail with a clear message if volume.name or
mount.mountPath are missing; reference clp.resultsCacheTlsVolume and
clp.resultsCacheTlsVolumeMount to ensure the same validated keys are used across
templates.
| # mount: | ||
| # mountPath: /etc/ssl/certs/results-cache-ca.pem | ||
| # subPath: global-bundle.pem | ||
| # readOnly: true |
There was a problem hiding this comment.
mount.readOnly in the example is ignored by the helper.
clp.resultsCacheTlsVolumeMount in _helpers.tpl hard-codes readOnly: true and never reads tlsVolume.mount.readOnly. Including readOnly: true in this example is harmless today, but misleading: a user who copies the example and sets readOnly: false would incorrectly assume their value takes effect.
Either drop this key from the example, or update the helper to honor it (e.g., readOnly: {{ .Values.clpConfig.results_cache.tlsVolume.mount.readOnly | default true }}).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/deployment/package-helm/values.yaml` at line 199, The example value
mount.readOnly is misleading because the template helper
clp.resultsCacheTlsVolumeMount in _helpers.tpl hardcodes readOnly: true and
ignores tlsVolume.mount.readOnly; either remove mount.readOnly from the example
in values.yaml or modify the helper clp.resultsCacheTlsVolumeMount to honor the
value by reading .Values.clpConfig.results_cache.tlsVolume.mount.readOnly (with
a default true) so user changes to mount.readOnly take effect.
|
Hi @chandle-revans , thank you for your contribution! We are looking into this internally and will review this PR in the next two days. |
There was a problem hiding this comment.
Hi @chandle-revans, thanks for the contribution! This PR looks solid overall. I’ve left a few comments below
If you're tied up with other priorities, just let us know. We’re happy to handle these edits for you, provided you're happy with the high-level design.
There was a problem hiding this comment.
The workflow doesn't pass. Can you run npm run lint:fix under components/webui and fix lint errors?
There was a problem hiding this comment.
Can you bump version in Charts.yaml to 0.3.2-dev.1 to satisfy helm lint?
| [docdb-ca]: https://docs.aws.amazon.com/documentdb/latest/developerguide/security.encryption.ssl.html | ||
| [docdb-tls-param]: https://docs.aws.amazon.com/documentdb/latest/developerguide/security.encryption.in-transit.html | ||
| [docker-compose-orchestration]: guides-docker-compose-deployment.md | ||
| [helm-values]: https://github.com/y-scope/clp/blob/main/tools/deployment/package-helm/values.yaml |
There was a problem hiding this comment.
| [helm-values]: https://github.com/y-scope/clp/blob/main/tools/deployment/package-helm/values.yaml | |
| [helm-values]: https://github.com/y-scope/clp/blob/DOCS_VAR_CLP_GIT_REF/tools/deployment/package-helm/values.yaml |
There was a problem hiding this comment.
- Can we rename this file as
test_clp_config.py? - pytest isn't configured -- this is the first unit test in this component. Can we add
pytesttopyproject.toml?
diff --git a/components/clp-py-utils/pyproject.toml b/components/clp-py-utils/pyproject.toml
index 46bd38a9..40e06835 100644
--- a/components/clp-py-utils/pyproject.toml
+++ b/components/clp-py-utils/pyproject.toml
@@ -28,6 +28,7 @@ clp = [
]
dev = [
"mypy>=1.18.2",
+ "pytest>=9.0.2",
"ruff>=0.14.4",
]| def get_uri(self): | ||
| return f"mongodb://{self.host}:{self.port}/{self.db_name}" | ||
| uri = f"mongodb://{self.host}:{self.port}/{self.db_name}" | ||
| params = [] | ||
| if self.tls: | ||
| params.append("tls=true") | ||
| if self.tls_ca_file: | ||
| params.append(f"tlsCAFile={self.tls_ca_file}") | ||
| if params: | ||
| uri += "?" + "&".join(params) | ||
| return uri |
There was a problem hiding this comment.
- We might need to escape url parameters since
tlsCAFilecould contain spaces in the filename.urlencodecan escape reserved url chars and join params with&. - If you apply the suggestion below, you need to import quote:
from urllib.parse import quote, urlencode
| def get_uri(self): | |
| return f"mongodb://{self.host}:{self.port}/{self.db_name}" | |
| uri = f"mongodb://{self.host}:{self.port}/{self.db_name}" | |
| params = [] | |
| if self.tls: | |
| params.append("tls=true") | |
| if self.tls_ca_file: | |
| params.append(f"tlsCAFile={self.tls_ca_file}") | |
| if params: | |
| uri += "?" + "&".join(params) | |
| return uri | |
| def get_uri(self): | |
| uri = f"mongodb://{self.host}:{self.port}/{self.db_name}" | |
| query: dict[str, str] = {} | |
| if self.tls: | |
| query["tls"] = "true" | |
| if self.tls_ca_file: | |
| query["tlsCAFile"] = self.tls_ca_file | |
| if query: | |
| uri += "?" + urlencode(query, safe="", quote_via=quote) | |
| return uri |
If you agree with the suggestions above, we also need to update unit tests because urlencode escapes /
diff --git a/components/clp-py-utils/tests/test_results_cache.py b/components/clp-py-utils/tests/test_resul
ts_cache.py
index 64b586ef..262908e2 100644
--- a/components/clp-py-utils/tests/test_results_cache.py
+++ b/components/clp-py-utils/tests/test_results_cache.py
@@ -30,10 +30,26 @@ def test_get_uri_tls_with_ca_file():
assert (
cache.get_uri()
== "mongodb://localhost:27017/clp-query-results"
- "?tls=true&tlsCAFile=/etc/ssl/certs/global-bundle.pem"
+ "?tls=true&tlsCAFile=%2Fetc%2Fssl%2Fcerts%2Fglobal-bundle.pem"
)
+def test_get_uri_tls_with_ca_file_url_encoded():
+ """Special characters in tls_ca_file are percent-encoded."""
+ cache = ResultsCache(tls=True, tls_ca_file="/etc/ssl/certs/ca & bundle.pem")
+ assert (
+ cache.get_uri()
+ == "mongodb://localhost:27017/clp-query-results"
+ "?tls=true&tlsCAFile=%2Fetc%2Fssl%2Fcerts%2Fca%20%26%20bundle.pem"
+ )
+
+
+def test_get_uri_special_chars_in_host_and_db():
+ """Special characters in host and db_name are percent-encoded."""
+ cache = ResultsCache(host="my host", db_name="my/db")
+ assert cache.get_uri() == "mongodb://my%20host:27017/my%2Fdb"
+
+| export const autoConfig = () => { | ||
| let url = `mongodb://${settings.MongoDbHost}:${settings.MongoDbPort}/${settings.MongoDbName}?directConnection=true`; | ||
| const mongoDbTls = settings.MongoDbTls as boolean; | ||
| const mongoDbTlsCaFile = settings.MongoDbTlsCaFile as string | null; | ||
| if (mongoDbTls) { | ||
| url += "&tls=true"; | ||
| if (mongoDbTlsCaFile) { | ||
| url += `&tlsCAFile=${mongoDbTlsCaFile}`; | ||
| } | ||
| } | ||
| return { | ||
| forceClose: true, | ||
| url: `mongodb://${settings.MongoDbHost}:${settings.MongoDbPort}/${settings.MongoDbName}?directConnection=true`, | ||
| url, | ||
| }; |
There was a problem hiding this comment.
We probably also need to escape url parameters here. How about:
| export const autoConfig = () => { | |
| let url = `mongodb://${settings.MongoDbHost}:${settings.MongoDbPort}/${settings.MongoDbName}?directConnection=true`; | |
| const mongoDbTls = settings.MongoDbTls as boolean; | |
| const mongoDbTlsCaFile = settings.MongoDbTlsCaFile as string | null; | |
| if (mongoDbTls) { | |
| url += "&tls=true"; | |
| if (mongoDbTlsCaFile) { | |
| url += `&tlsCAFile=${mongoDbTlsCaFile}`; | |
| } | |
| } | |
| return { | |
| forceClose: true, | |
| url: `mongodb://${settings.MongoDbHost}:${settings.MongoDbPort}/${settings.MongoDbName}?directConnection=true`, | |
| url, | |
| }; | |
| export const autoConfig = () => { | |
| const params: Record<string, string> = {directConnection: "true"}; | |
| if (settings.MongoDbTls) { | |
| params["tls"] = "true"; | |
| if (settings.MongoDbTlsCaFile) { | |
| params["tlsCAFile"] = settings.MongoDbTlsCaFile; | |
| } | |
| } | |
| const authority = `${settings.MongoDbHost}:${settings.MongoDbPort}`; | |
| const path = settings.MongoDbName; | |
| const query = new URLSearchParams(params).toString(); | |
| const url = `mongodb://${authority}/${path}?${query}`; | |
| return { | |
| forceClose: true, | |
| url: url, | |
| }; | |
| }; |
| # Optional volume and mount to inject a TLS CA certificate into pods that connect to the | ||
| # results cache. When set, the volume is added to all pods that use the results cache | ||
| # connection (query-scheduler, query-worker, garbage-collector, reducer, webui, and | ||
| # results-cache-indices-creator). | ||
| # Example: | ||
| # tlsVolume: | ||
| # volume: | ||
| # name: results-cache-ca | ||
| # secret: | ||
| # secretName: my-docdb-ca-bundle | ||
| # mount: | ||
| # mountPath: /etc/ssl/certs/results-cache-ca.pem | ||
| # subPath: global-bundle.pem | ||
| # readOnly: true | ||
| tlsVolume: null |
There was a problem hiding this comment.
One high level comment on changes to helm charts:
I’d suggest avoiding user-configurable volume mounts in values.yaml to keep things simple. Additionally, since a root CA isn't sensitive data, using a Kubernetes Secret feels like overkill.
How about adding the CA file to a ConfigMap and mounting it to a fixed location across all pods. This streamlines the TLS setup for users
helm install ... \
--set clpConfig.results_cache.tls=true
--set-file clpConfig.results_cache.tls_ca_cert_content=/path/to/ca.crt
See details in the prototype diff below:
diff --git a/tools/deployment/package-helm/templates/_helpers.tpl b/tools/deployment/package-helm/templates/_helpers.tpl
index db2d5a80..0a0e71fa 100644
--- a/tools/deployment/package-helm/templates/_helpers.tpl
+++ b/tools/deployment/package-helm/templates/_helpers.tpl
@@ -436,3 +436,59 @@ topologySpreadConstraints:
{{- toYaml . | nindent 2 }}
{{- end }}
{{- end }}{{/* define "clp.createSchedulingConfigs" */}}
+
+{{/*
+Path inside the container where the results cache CA certificate is mounted.
+
+@return {string} The pod-local path for the results cache CA certificate
+*/}}
+{{- define "clp.resultsCacheTlsCaFile" -}}
+/etc/clp-results-cache-ca-cert/ca.crt
+{{- end }}
+
+{{/*
+Creates a volumeMount for the results cache TLS CA certificate.
+
+@return {string} YAML-formatted volumeMount definition
+*/}}
+{{- define "clp.resultsCacheTlsVolumeMount" -}}
+name: "results-cache-ca-cert"
+mountPath: {{ include "clp.resultsCacheTlsCaFile" . | dir | quote }}
+readOnly: true
+{{- end }}
+
+{{/*
+Creates a volume for the results cache TLS CA certificate, referencing the ConfigMap.
+
+@param {object} root Root template context
+@return {string} YAML-formatted volume definition
+*/}}
+{{- define "clp.resultsCacheTlsVolume" -}}
+name: "results-cache-ca-cert"
+configMap:
+ name: {{ include "clp.fullname" .root }}-config
+ items:
+ - key: "results-cache-ca.crt"
+ path: "ca.crt"
+{{- end }}
+
+{{/*
+Builds the full results cache MongoDB URI, including TLS query parameters when configured.
+
+@param {object} . Root template context
+@return {string} The MongoDB connection URI
+*/}}
+{{- define "clp.resultsCacheUri" -}}
+{{- $base := printf "mongodb://%s:%s/%s"
+ (include "clp.resultsCacheHost" .)
+ (include "clp.resultsCachePort" . | toString)
+ .Values.clpConfig.results_cache.db_name
+-}}
+{{- if and .Values.clpConfig.results_cache.tls .Values.clpConfig.results_cache.tls_ca_cert_content -}}
+{{- printf "%s?tls=true&tlsCAFile=%s" $base (include "clp.resultsCacheTlsCaFile" .) -}}
+{{- else if .Values.clpConfig.results_cache.tls -}}
+{{- printf "%s?tls=true" $base -}}
+{{- else -}}
+{{- $base -}}
+{{- end -}}
+{{- end }}
diff --git a/tools/deployment/package-helm/templates/api-server-deployment.yaml b/tools/deployment/package-helm/templates/api-server-deployment.yaml
index d36bb1e3..a4341b27 100644
--- a/tools/deployment/package-helm/templates/api-server-deployment.yaml
+++ b/tools/deployment/package-helm/templates/api-server-deployment.yaml
@@ -56,6 +56,9 @@ spec:
mountPath: "/etc/clp-config.yaml"
subPath: "clp-config.yaml"
readOnly: true
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolumeMount" . | nindent 14 }}
+ {{- end }}
{{- if eq .Values.clpConfig.stream_output.storage.type "fs" }}
- name: {{ include "clp.volumeName" (dict
"component_category" "shared-data"
@@ -81,6 +84,9 @@ spec:
- name: "config"
configMap:
name: {{ include "clp.fullname" . }}-config
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolume" (dict "root" .) | nindent 10 }}
+ {{- end }}
{{- if eq .Values.clpConfig.stream_output.storage.type "fs" }}
- {{- include "clp.pvcVolume" (dict
"root" .
diff --git a/tools/deployment/package-helm/templates/configmap.yaml b/tools/deployment/package-helm/templates/configmap.yaml
index a66be78d..2c4c86ae 100644
--- a/tools/deployment/package-helm/templates/configmap.yaml
+++ b/tools/deployment/package-helm/templates/configmap.yaml
@@ -164,6 +164,10 @@ data:
retention_period: null
{{- end }}
stream_collection_name: {{ .Values.clpConfig.results_cache.stream_collection_name | quote }}
+ tls: {{ .Values.clpConfig.results_cache.tls }}
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ tls_ca_file: {{ include "clp.resultsCacheTlsCaFile" . | quote }}
+ {{- end }}
{{- with .Values.clpConfig.stream_output }}
stream_output:
{{- with .storage }}
@@ -278,6 +282,10 @@ data:
"MongoDbHost": "{{ include "clp.resultsCacheHost" . }}",
"MongoDbPort": {{ include "clp.resultsCachePort" . | int }},
"MongoDbName": {{ .Values.clpConfig.results_cache.db_name | quote }},
+ "MongoDbTls": {{ .Values.clpConfig.results_cache.tls }},
+{{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ "MongoDbTlsCaFile": {{ include "clp.resultsCacheTlsCaFile" . | quote }},
+{{- end }}
"MongoDbSearchResultsMetadataCollectionName":
{{ .Values.clpConfig.webui.results_metadata_collection_name | quote }},
"MongoDbStreamFilesCollectionName":
@@ -435,3 +443,7 @@ data:
echo "node.id=$(hostname)" >> "${PRESTO_CONFIG_DIR}/node.properties"
{{- end }}{{/* with .Values.clpConfig.presto */}}
{{- end }}{{/* if has "presto" .Values.clpConfig.bundled */}}
+{{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ results-cache-ca.crt: |
+{{ .Values.clpConfig.results_cache.tls_ca_cert_content | indent 4 }}
+{{- end }}
diff --git a/tools/deployment/package-helm/templates/garbage-collector-deployment.yaml b/tools/deployment/package-helm/templates/garbage-collector-deployment.yaml
index 1b202b15..a93123e2 100644
--- a/tools/deployment/package-helm/templates/garbage-collector-deployment.yaml
+++ b/tools/deployment/package-helm/templates/garbage-collector-deployment.yaml
@@ -74,6 +74,9 @@ spec:
{{- if .Values.clpConfig.aws_config_directory }}
- {{- include "clp.awsConfigVolumeMount" . | nindent 14 }}
{{- end }}
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolumeMount" . | nindent 14 }}
+ {{- end }}
{{- if eq .Values.clpConfig.stream_output.storage.type "fs" }}
- name: {{ include "clp.volumeName" (dict
"component_category" "shared-data"
@@ -105,6 +108,9 @@ spec:
{{- if .Values.clpConfig.aws_config_directory }}
- {{- include "clp.awsConfigVolume" . | nindent 10 }}
{{- end }}
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolume" (dict "root" .) | nindent 10 }}
+ {{- end }}
{{- if eq .Values.clpConfig.stream_output.storage.type "fs" }}
- {{- include "clp.pvcVolume" (dict
"root" .
diff --git a/tools/deployment/package-helm/templates/query-scheduler-deployment.yaml b/tools/deployment/package-helm/templates/query-scheduler-deployment.yaml
index a61816f3..0145e3e7 100644
--- a/tools/deployment/package-helm/templates/query-scheduler-deployment.yaml
+++ b/tools/deployment/package-helm/templates/query-scheduler-deployment.yaml
@@ -74,6 +74,9 @@ spec:
mountPath: "/etc/clp-config.yaml"
subPath: "clp-config.yaml"
readOnly: true
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolumeMount" . | nindent 14 }}
+ {{- end }}
command: [
"python3", "-u",
"-m", "job_orchestration.scheduler.query.query_scheduler",
@@ -90,4 +93,7 @@ spec:
- name: "config"
configMap:
name: {{ include "clp.fullname" . }}-config
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolume" (dict "root" .) | nindent 10 }}
+ {{- end }}
{{- end }}
diff --git a/tools/deployment/package-helm/templates/query-worker-deployment.yaml b/tools/deployment/package-helm/templates/query-worker-deployment.yaml
index 6e6a67b1..e0a3f9b2 100644
--- a/tools/deployment/package-helm/templates/query-worker-deployment.yaml
+++ b/tools/deployment/package-helm/templates/query-worker-deployment.yaml
@@ -66,6 +66,9 @@ spec:
{{- if .Values.clpConfig.aws_config_directory }}
- {{- include "clp.awsConfigVolumeMount" . | nindent 14 }}
{{- end }}
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolumeMount" . | nindent 14 }}
+ {{- end }}
{{- if eq .Values.clpConfig.stream_output.storage.type "fs" }}
- name: {{ include "clp.volumeName" (dict
"component_category" "shared-data"
@@ -108,6 +111,9 @@ spec:
{{- if .Values.clpConfig.aws_config_directory }}
- {{- include "clp.awsConfigVolume" . | nindent 10 }}
{{- end }}
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolume" (dict "root" .) | nindent 10 }}
+ {{- end }}
{{- if eq .Values.clpConfig.stream_output.storage.type "fs" }}
- {{- include "clp.pvcVolume" (dict
"root" .
diff --git a/tools/deployment/package-helm/templates/reducer-deployment.yaml b/tools/deployment/package-helm/templates/reducer-deployment.yaml
index 1b33d03c..e6f10988 100644
--- a/tools/deployment/package-helm/templates/reducer-deployment.yaml
+++ b/tools/deployment/package-helm/templates/reducer-deployment.yaml
@@ -53,6 +53,9 @@ spec:
mountPath: "/etc/clp-config.yaml"
subPath: "clp-config.yaml"
readOnly: true
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolumeMount" . | nindent 14 }}
+ {{- end }}
command: [
"python3", "-u",
"-m", "job_orchestration.reducer.reducer",
@@ -64,4 +67,7 @@ spec:
- name: "config"
configMap:
name: {{ include "clp.fullname" . }}-config
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolume" (dict "root" .) | nindent 10 }}
+ {{- end }}
{{- end }}
diff --git a/tools/deployment/package-helm/templates/results-cache-indices-creator-job.yaml b/tools/deployment/package-helm/templates/results-cache-indices-creator-job.yaml
index c2074bf0..3a28a1eb 100644
--- a/tools/deployment/package-helm/templates/results-cache-indices-creator-job.yaml
+++ b/tools/deployment/package-helm/templates/results-cache-indices-creator-job.yaml
@@ -33,9 +33,15 @@ spec:
"python3", "-u",
"-m", "clp_py_utils.initialize-results-cache",
"--uri",
- "mongodb://{{ include "clp.resultsCacheHost" . }}:{{
- include "clp.resultsCachePort" . | int
- }}/{{ .Values.clpConfig.results_cache.db_name }}",
+ "{{ include "clp.resultsCacheUri" . }}",
"--stream-collection",
{{ .Values.clpConfig.results_cache.stream_collection_name | quote }}
]
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ volumeMounts:
+ - {{- include "clp.resultsCacheTlsVolumeMount" . | nindent 14 }}
+ {{- end }}
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ volumes:
+ - {{- include "clp.resultsCacheTlsVolume" (dict "root" .) | nindent 10 }}
+ {{- end }}
diff --git a/tools/deployment/package-helm/templates/webui-deployment.yaml b/tools/deployment/package-helm/templates/webui-deployment.yaml
index 71edbc6e..3c7081bc 100644
--- a/tools/deployment/package-helm/templates/webui-deployment.yaml
+++ b/tools/deployment/package-helm/templates/webui-deployment.yaml
@@ -79,6 +79,9 @@ spec:
{{- if .Values.clpConfig.aws_config_directory }}
- {{- include "clp.awsConfigVolumeMount" . | nindent 14 }}
{{- end }}
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolumeMount" . | nindent 14 }}
+ {{- end }}
{{- if eq .Values.clpConfig.logs_input.type "fs" }}
- {{- include "clp.logsInputVolumeMount" . | nindent 14 }}
{{- end }}
@@ -111,6 +114,9 @@ spec:
{{- if .Values.clpConfig.aws_config_directory }}
- {{- include "clp.awsConfigVolume" . | nindent 10 }}
{{- end }}
+ {{- if .Values.clpConfig.results_cache.tls_ca_cert_content }}
+ - {{- include "clp.resultsCacheTlsVolume" (dict "root" .) | nindent 10 }}
+ {{- end }}
{{- if eq .Values.clpConfig.logs_input.type "fs" }}
- {{- include "clp.logsInputVolume" . | nindent 10 }}
{{- end }}
diff --git a/tools/deployment/package-helm/values.yaml b/tools/deployment/package-helm/values.yaml
index ff4f0ba6..d4a24879 100644
--- a/tools/deployment/package-helm/values.yaml
+++ b/tools/deployment/package-helm/values.yaml
@@ -175,6 +175,15 @@ clpConfig:
# Retention period for search results, in minutes. Set to null to disable automatic deletion.
retention_period: 60
+ # Whether to use TLS for the results cache (MongoDB) connection.
+ tls: false
+
+ # Content of the results cache (MongoDB) CA certificate file.
+ # Provide using --set-file when installing, e.g.:
+ # helm install ... --set-file clpConfig.results_cache.tls_ca_cert_content=/path/to/ca.crt
+ # Only used when tls is true.
+ tls_ca_cert_content: ""
+
compression_worker:
logging_level: "INFO"There was a problem hiding this comment.
Just curious - does this PR enable CLP to run on DocumentDB?
By default, results-cache-indices-creator, webui, and reducer all use features that aren't supported by DocumentDB.
Description
Adds optional TLS support for the results cache (MongoDB) connection across the Python components, the webui server, and the Helm chart. This enables CLP to connect to externally managed MongoDB compatible services that require TLS, most notably AWS DocumentDB, which enforces TLS by default without requiring users to disable TLS on the upstream service.
Changes
clp-py-utils:
ResultsCachemodel (components/clp-py-utils/clp_py_utils/clp_config.py)tls: bool = Falsetls_ca_file: NonEmptyStr | None = Nonemodel_validatorthat rejectstls_ca_filebeing set whiletlsisFalse(prevents silent misconfiguration).get_uri()to append?tls=trueand optionally&tlsCAFile=<path>when configured.Webui server: (
components/webui/server)settings.json: addedMongoDbTls: falseandMongoDbTlsCaFile: nullplaceholders.src/plugins/external/mongo.ts: appends&tls=trueand&tlsCAFile=<path>to the connection URL when configured. Uses explicit type assertions on the JSON-derived literals to satisfy strict TypeScript settings.Helm chart: (
tools/deployment/package-helm)values.yaml: added documentedtls,tls_ca_file, andtlsVolumefields underclpConfig.results_cache. ThetlsVolumeshape mirrors Kubernetes' nativevolume/volumeMountstructures, so users can supply the CA bundle from any source (Secret, ConfigMap, hostPath, etc.) keeping the chart cloud agnostic.templates/_helpers.tpl: three new helpers:clp.resultsCacheTlsVolume/clp.resultsCacheTlsVolumeMount: render the volume/mount fromtlsVolumeconfig.clp.resultsCacheUri: builds the full MongoDB URI including TLS params; output matches Python'sget_uri().templates/configmap.yaml: renderstls/tls_ca_fileintoclp-config.yaml, andMongoDbTls/MongoDbTlsCaFileinto the webui server settings.templates/results-cache-indices-creator-job.yaml: uses theresultsCacheUrihelper, optionally mounts the TLS volume.query-scheduler,query-worker,garbage-collector,reducer,webui. All additions are guarded by{{- if .Values.clpConfig.results_cache.tlsVolume }}so they're no-ops when unused.Tests: (
components/clp-py-utils/tests/test_results_cache.py)New unit tests covering:
?tls=true)?tls=true&tlsCAFile=<path>)tls_ca_fileis set withouttls=TrueDocumentation (
docs/src/user-docs/guides-external-database.md)guides-external-database.md, plus a callout in the AWS DocumentDB/MongoDB Atlas section noting that DocumentDB enforces TLS by default.Backwards Compatibility
Fully backwards-compatible. All new fields default to TLS being disabled:
tls: falsetls_ca_file: nulltlsVolume: nullWhen none of the new fields are set, the rendered Helm output is byte-identical to the previous version aside from two new
tls: false/tls_ca_file: nulllines in theclp-config.yaml(which the updatedResultsCachemodel accepts as defaults). Existing deployments require no changes.Example Usage (AWS DocumentDB)
Checklist
breaking change.
Validation performed
Summary by CodeRabbit
Release Notes
New Features
Documentation