
feat(helm): Add Presto query engine support to the Helm chart.#2004

Open
junhaoliao wants to merge 15 commits into y-scope:main from junhaoliao:k8s-presto

Conversation


@junhaoliao commented Feb 17, 2026

Description

Add Presto coordinator and worker deployments to the CLP Helm chart, gated by clpConfig.presto (non-null enables Presto). When clpConfig.presto is configured, the chart deploys a Presto coordinator (single replica) and configurable Presto worker(s). Presto runs alongside the existing compression pipeline—the two query engines are not mutually exclusive.
The query_engine setting controls which search interface the Web UI displays (clp-s search
vs. Presto SQL queries) but does not gate any infrastructure deployments. The clp-s query
pipeline components (query scheduler, query workers, reducers) and the API server can optionally be disabled by setting their config keys to null to save resources.
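
As a sketch of the gating described above (the value paths are taken from this description; the manifest bodies are elided and illustrative, not the chart's actual templates):

```yaml
# presto-coordinator-deployment.yaml (sketch): rendered only when
# clpConfig.presto is non-null.
{{- if .Values.clpConfig.presto }}
apiVersion: apps/v1
kind: Deployment
# ... coordinator spec ...
{{- end }}

# query-scheduler-deployment.yaml (sketch): rendered only when the component's
# config key is non-null, so `--set "clpConfig.query_scheduler=null"` omits it.
{{- if .Values.clpConfig.query_scheduler }}
apiVersion: apps/v1
kind: Deployment
# ... query scheduler spec ...
{{- end }}
```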

Checklist

  • The PR satisfies the [contribution guidelines][yscope-contrib-guidelines].
  • This is a breaking change and it has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

1. Helm template renders correctly (Presto mode)

Task: Verify the Helm chart renders all Presto resources and excludes query/reducer/api-server
templates when Presto is enabled and query pipeline components are set to null.

Command:

cd tools/deployment/package-helm
helm template clp . \
  --set "clpConfig.package.query_engine=presto" \
  --set "clpConfig.presto.port=30889" \
  --set "clpConfig.presto.coordinator.logging_level=INFO" \
  --set "clpConfig.presto.coordinator.query_max_memory_gb=1" \
  --set "clpConfig.presto.coordinator.query_max_memory_per_node_gb=1" \
  --set "clpConfig.presto.worker.query_memory_gb=4" \
  --set "clpConfig.presto.worker.system_memory_gb=8" \
  --set-json 'clpConfig.presto.split_filter={}' \
  --set "clpConfig.api_server=null" \
  --set "clpConfig.query_scheduler=null" \
  --set "clpConfig.query_worker=null" \
  --set "clpConfig.reducer=null" \
  2>&1 | grep "^# Source:" | sort

Output:

# Source: clp/templates/compression-scheduler-deployment.yaml
# Source: clp/templates/compression-worker-deployment.yaml
# Source: clp/templates/configmap.yaml
# Source: clp/templates/database-secret.yaml
# Source: clp/templates/database-service.yaml
# Source: clp/templates/database-statefulset.yaml
# Source: clp/templates/db-table-creator-job.yaml
# Source: clp/templates/garbage-collector-deployment.yaml
# Source: clp/templates/job-watcher-rolebinding.yaml
# Source: clp/templates/job-watcher-role.yaml
# Source: clp/templates/job-watcher-serviceaccount.yaml
# Source: clp/templates/presto-coordinator-deployment.yaml
# Source: clp/templates/presto-coordinator-service.yaml
# Source: clp/templates/presto-worker-deployment.yaml
# Source: clp/templates/queue-secret.yaml
# Source: clp/templates/queue-service.yaml
# Source: clp/templates/queue-statefulset.yaml
# Source: clp/templates/redis-secret.yaml
# Source: clp/templates/redis-service.yaml
# Source: clp/templates/redis-statefulset.yaml
# Source: clp/templates/results-cache-indices-creator-job.yaml
# Source: clp/templates/results-cache-service.yaml
# Source: clp/templates/results-cache-statefulset.yaml
# Source: clp/templates/shared-data-archives-pvc.yaml
# Source: clp/templates/shared-data-streams-pvc.yaml
# Source: clp/templates/webui-deployment.yaml
# Source: clp/templates/webui-service.yaml

Explanation: The Presto coordinator deployment, coordinator service, and worker templates are
rendered. The api-server, query scheduler, query worker, and reducer templates are correctly
excluded.
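
The manual inspection above can be turned into a repeatable pass/fail check. The helper below is hypothetical (not part of the chart); it only assumes the "# Source:" lines have been captured to a file, e.g. via `helm template clp . ... | grep "^# Source:" > /tmp/sources.txt`:

```shell
#!/bin/sh
# assert_no_gated FILE: fail if any Presto-gated template name appears among
# the captured "# Source:" lines from `helm template`.
assert_no_gated() {
    file="$1"
    status=0
    for gated in api-server query-scheduler query-worker reducer; do
        if grep -q "$gated" "$file"; then
            echo "FAIL: $gated template rendered in Presto mode" >&2
            status=1
        fi
    done
    [ "$status" -eq 0 ] && echo "OK: no gated templates rendered"
    return "$status"
}
```

Usage: `assert_no_gated /tmp/sources.txt` exits non-zero and names the offending template if any gated manifest leaks through.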

2. Helm template renders correctly (clp-s mode)

Task: Verify the clp-s query pipeline resources are rendered when query_engine is "clp-s",
and Presto resources are excluded.

Command:

helm template clp . --set clpConfig.package.query_engine="clp-s" 2>&1 | grep "^# Source:" | sort

Output:

# Source: clp/templates/api-server-deployment.yaml
# Source: clp/templates/api-server-service.yaml
# Source: clp/templates/compression-scheduler-deployment.yaml
# Source: clp/templates/compression-worker-deployment.yaml
# Source: clp/templates/configmap.yaml
# Source: clp/templates/database-secret.yaml
# Source: clp/templates/database-service.yaml
# Source: clp/templates/database-statefulset.yaml
# Source: clp/templates/db-table-creator-job.yaml
# Source: clp/templates/garbage-collector-deployment.yaml
# Source: clp/templates/job-watcher-rolebinding.yaml
# Source: clp/templates/job-watcher-role.yaml
# Source: clp/templates/job-watcher-serviceaccount.yaml
# Source: clp/templates/query-scheduler-deployment.yaml
# Source: clp/templates/query-scheduler-service.yaml
# Source: clp/templates/query-worker-deployment.yaml
# Source: clp/templates/queue-secret.yaml
# Source: clp/templates/queue-service.yaml
# Source: clp/templates/queue-statefulset.yaml
# Source: clp/templates/redis-secret.yaml
# Source: clp/templates/redis-service.yaml
# Source: clp/templates/redis-statefulset.yaml
# Source: clp/templates/reducer-deployment.yaml
# Source: clp/templates/reducer-service.yaml
# Source: clp/templates/results-cache-indices-creator-job.yaml
# Source: clp/templates/results-cache-service.yaml
# Source: clp/templates/results-cache-statefulset.yaml
# Source: clp/templates/shared-data-archives-pvc.yaml
# Source: clp/templates/shared-data-streams-pvc.yaml
# Source: clp/templates/webui-deployment.yaml
# Source: clp/templates/webui-service.yaml

Explanation: The api-server, query scheduler, query worker, reducer, and results cache templates
are rendered. Presto templates are correctly excluded.

3. Helm template renders correctly using the docs' presto-values.yaml

Task: Verify that a presto-values.yaml file created by following the guides-k8s-deployment.md
documentation renders without errors.

presto-values.yaml (following the docs, with placeholders replaced):

image:
  prestoCoordinator:
    repository: "ghcr.io/y-scope/presto/coordinator"
    tag: "dev"
  prestoWorker:
    repository: "ghcr.io/y-scope/presto/prestissimo-worker"
    tag: "dev"

prestoWorker:
  replicas: 2

clpConfig:
  package:
    storage_engine: "clp-s"
    query_engine: "presto"

  api_server: null
  query_scheduler: null
  query_worker: null
  reducer: null

  results_cache:
    retention_period: null

  presto:
    port: 30889
    coordinator:
      logging_level: "INFO"
      query_max_memory_gb: 1
      query_max_memory_per_node_gb: 1
    worker:
      query_memory_gb: 4
      system_memory_gb: 8
    split_filter:
      clp.default.default:
        - columnName: "timestamp"
          customOptions:
            rangeMapping:
              lowerBound: "begin_timestamp"
              upperBound: "end_timestamp"
          required: false

Command:

helm template clp . -f /tmp/presto-values.yaml 2>&1 | grep "^# Source:" | sort

Output:

# Source: clp/templates/compression-scheduler-deployment.yaml
# Source: clp/templates/compression-worker-deployment.yaml
# Source: clp/templates/configmap.yaml
# Source: clp/templates/database-secret.yaml
# Source: clp/templates/database-service.yaml
# Source: clp/templates/database-statefulset.yaml
# Source: clp/templates/db-table-creator-job.yaml
# Source: clp/templates/job-watcher-rolebinding.yaml
# Source: clp/templates/job-watcher-role.yaml
# Source: clp/templates/job-watcher-serviceaccount.yaml
# Source: clp/templates/presto-coordinator-deployment.yaml
# Source: clp/templates/presto-coordinator-service.yaml
# Source: clp/templates/presto-worker-deployment.yaml
# Source: clp/templates/queue-secret.yaml
# Source: clp/templates/queue-service.yaml
# Source: clp/templates/queue-statefulset.yaml
# Source: clp/templates/redis-secret.yaml
# Source: clp/templates/redis-service.yaml
# Source: clp/templates/redis-statefulset.yaml
# Source: clp/templates/results-cache-indices-creator-job.yaml
# Source: clp/templates/results-cache-service.yaml
# Source: clp/templates/results-cache-statefulset.yaml
# Source: clp/templates/shared-data-archives-pvc.yaml
# Source: clp/templates/shared-data-streams-pvc.yaml
# Source: clp/templates/webui-deployment.yaml
# Source: clp/templates/webui-service.yaml

Explanation: The template renders without errors using the exact values file from the docs
(with placeholders filled in). The garbage-collector is correctly excluded because
results_cache.retention_period is set to null and archive_output.retention_period defaults
to null, so there is nothing to garbage-collect.
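
The retention-based gating of the garbage collector can be sketched as follows (illustrative only; the chart's actual condition may be written differently):

```yaml
# garbage-collector-deployment.yaml (sketch): rendered only when at least one
# retention period is configured; with both null there is nothing to collect.
{{- if or .Values.clpConfig.archive_output.retention_period
          .Values.clpConfig.results_cache.retention_period }}
apiVersion: apps/v1
kind: Deployment
# ... garbage collector spec ...
{{- end }}
```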

4. Multi-node cluster: clp-s mode (default)

Task: Verify that the Helm chart deploys correctly in clp-s mode on a multi-node kind cluster
with dedicated worker nodes.

Command:

cd tools/deployment/package-helm
bash set-up-multi-dedicated-test.sh

Output (final pod status):

NAME                                              READY   STATUS      RESTARTS   AGE
test-clp-api-server-bbcc6684-nsx5k                1/1     Running     0          2m43s
test-clp-compression-scheduler-5c74dfd974-vtdk8   1/1     Running     0          2m43s
test-clp-compression-worker-57c58768b-k8gs5       1/1     Running     0          2m43s
test-clp-compression-worker-57c58768b-kgx4x       1/1     Running     0          2m43s
test-clp-database-0                               1/1     Running     0          2m43s
test-clp-db-table-creator-77qv6                   0/1     Completed   0          2m43s
test-clp-garbage-collector-595f847dc-gqqsw        1/1     Running     0          2m43s
test-clp-query-scheduler-57879c8c7b-lw2fj         1/1     Running     0          2m43s
test-clp-query-worker-7fc9f65988-flmrm            1/1     Running     0          2m43s
test-clp-query-worker-7fc9f65988-ndp9x            1/1     Running     0          2m43s
test-clp-queue-0                                  1/1     Running     0          2m43s
test-clp-redis-0                                  1/1     Running     0          2m43s
test-clp-reducer-664c8545c8-dwhgm                 1/1     Running     0          2m43s
test-clp-reducer-664c8545c8-qzs9q                 1/1     Running     0          2m43s
test-clp-results-cache-0                          1/1     Running     0          2m43s
test-clp-results-cache-indices-creator-8wdmk      0/1     Completed   0          2m43s
test-clp-webui-dc49878f7-nfdpl                    1/1     Running     0          2m43s

Explanation: All clp-s pipeline pods are running (api-server, query-scheduler, 2x query-worker,
2x reducer, results-cache, etc.). No Presto pods are present.

Node scheduling verification:

=== Pod scheduling ===
test-clp-compression-worker-57c58768b-k8gs5   clp-test-worker    (compression node)
test-clp-compression-worker-57c58768b-kgx4x   clp-test-worker    (compression node)
test-clp-query-worker-7fc9f65988-flmrm        clp-test-worker4   (query node)
test-clp-query-worker-7fc9f65988-ndp9x        clp-test-worker4   (query node)

Compression workers are scheduled on compression-labeled nodes and query workers on query-labeled
nodes.
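
This placement is driven by node labels (the setup scripts label nodes with a nodeType key, per this PR) combined with per-component scheduling values. A hedged sketch of what such values could look like; the exact key names here are illustrative:

```yaml
# Illustrative values sketch: pin workers to labeled nodes. Nodes are labeled
# beforehand, e.g. `kubectl label node clp-test-worker nodeType=compression`.
compressionWorker:
  scheduling:
    nodeSelector:
      nodeType: compression
queryWorker:
  scheduling:
    nodeSelector:
      nodeType: query
```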

5. Multi-node cluster: Presto mode

Task: Verify that the Helm chart deploys correctly in Presto mode on a multi-node kind cluster.
The test script was temporarily modified to add -f /tmp/presto-values.yaml to the helm install
command and remove the query worker/reducer --set flags (since those components are not deployed in
Presto mode).

Command:

# Temporarily modified set-up-multi-dedicated-test.sh helm install to:
#   helm install test "${script_dir}" \
#       -f /tmp/presto-values.yaml \
#       --set "distributedDeployment=true" \
#       ...
bash set-up-multi-dedicated-test.sh

Output (final pod status):

NAME                                              READY   STATUS      RESTARTS   AGE
test-clp-compression-scheduler-5c74dfd974-jvlq8   1/1     Running     0          2m28s
test-clp-compression-worker-57c58768b-d26nl       1/1     Running     0          2m28s
test-clp-compression-worker-57c58768b-pqwvh       1/1     Running     0          2m28s
test-clp-database-0                               1/1     Running     0          2m28s
test-clp-db-table-creator-kz8dv                   0/1     Completed   0          2m28s
test-clp-presto-coordinator-77897595b5-jns8j      1/1     Running     0          2m28s
test-clp-presto-worker-794b756d5d-bs5hs           1/1     Running     0          2m28s
test-clp-queue-0                                  1/1     Running     0          2m28s
test-clp-redis-0                                  1/1     Running     0          2m28s
test-clp-results-cache-0                          1/1     Running     0          2m28s
test-clp-results-cache-indices-creator-dm4p8      0/1     Completed   0          2m28s
test-clp-webui-dc49878f7-nzx7h                    1/1     Running     0          2m28s
All jobs completed and services are ready.

Explanation: The Presto coordinator and worker pods are running. The api-server, query-scheduler,
query-worker, and reducer pods are correctly absent. The Presto worker init container successfully
detected the coordinator version and initialized its config.

Presto pod verification:

=== Presto pods running ===
test-clp-presto-coordinator-77897595b5-jns8j      1/1     Running
test-clp-presto-worker-794b756d5d-bs5hs           1/1     Running

=== Query/reducer pods (should be NONE) ===
(none - correct)

=== Api-server pods (should be NONE) ===
(none - correct)

Compressed the sample log file postgresql.jsonl, performed queries in the Web UI, and observed
that results were correctly returned.

Summary by CodeRabbit

  • New Features

    • Added support for deploying Presto as an alternative query engine via Helm and Docker Compose
    • Deployable Presto coordinator and worker services with configurable replicas, images, and ports
    • Presto-aware scheduling and node labelling options for dedicated/shared/compute pools
  • Documentation

    • Comprehensive Kubernetes and Docker Compose guides for Presto integration, installation steps, configuration examples (split-filter), and operational notes
  • Chores

    • Updated chart version

@junhaoliao requested a review from a team as a code owner February 17, 2026 20:40

coderabbitai bot commented Feb 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds Presto integration across CLP: new Kubernetes and Docker Compose docs, Helm values and templates for Presto coordinator and workers, extensive conditional Presto ConfigMap entries, gating of legacy query components when query_engine == "presto", and setup script updates for node labeling, ports, and worker scaling.

Changes

Cohort / File(s) Summary
Documentation
docs/src/user-docs/guides-k8s-deployment.md, docs/src/user-docs/guides-using-presto.md
Adds Presto deployment guidance (presto-values.yaml, Helm install, split-filter, worker scheduling), Docker Compose setup, cross-references, and node labeling/scheduling examples.
Helm Templates — Presto Coordinator
tools/deployment/package-helm/templates/presto-coordinator-deployment.yaml, tools/deployment/package-helm/templates/presto-coordinator-service.yaml
New Deployment and NodePort Service templates for Presto coordinator (port 8889), conditional on .Values.clpConfig.presto, with config volume mounts and health probes.
Helm Templates — Presto Worker
tools/deployment/package-helm/templates/presto-worker-deployment.yaml
New Deployment for presto-worker (configurable replicas), init containers, config mounts, optional archives PVC when storage.type == "fs", probes, and scheduling hooks; gated by presto flag.
Helm Templates — Legacy Component Gating
tools/deployment/package-helm/templates/query-scheduler-deployment.yaml, .../query-scheduler-service.yaml, .../query-worker-deployment.yaml, .../reducer-deployment.yaml, .../reducer-service.yaml
Wrapped query-scheduler, query-worker, and reducer resources in conditionals to omit them when query_engine == "presto"; no other field changes.
Helm ConfigMap
tools/deployment/package-helm/templates/configmap.yaml
Inserted extensive conditional Presto configuration blocks (coordinator/worker catalogs, JVM/log/node properties, split-filter, setup scripts); added PrestoHost/PrestoPort to webui/mcp_server JSON when enabled and adjusted include context usage.
Helm Values & Chart
tools/deployment/package-helm/values.yaml, tools/deployment/package-helm/Chart.yaml
Added prestoCoordinator/prestoWorker image entries, prestoWorker replicas and scheduling block, clpConfig.presto top-level block and moved reducer under clpConfig; bumped chart version.
Setup Scripts
tools/deployment/package-helm/.set-up-common.sh, .../set-up-multi-dedicated-test.sh, .../set-up-multi-shared-test.sh
Exposed Kind port 30889; added NUM_PRESTO_NODES / PRESTO_WORKER_REPLICAS, labeled Presto nodes (nodeType=presto), adjusted worker counts and Helm install args to pass prestoWorker replicas and nodeType.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Helm
    participant KubeAPI as Kubernetes API
    participant PCoord as Presto Coordinator
    participant PWorker as Presto Worker
    participant CLP as CLP Components

    Client->>Helm: Install chart with clpConfig.presto
    Helm->>KubeAPI: Create ConfigMaps, Deployments, Services
    KubeAPI->>PCoord: Start coordinator pod (port 8889)
    KubeAPI->>PWorker: Start worker pods
    PWorker->>PCoord: Register with coordinator
    Client->>PCoord: Submit query (NodePort 8889)
    PCoord->>PWorker: Distribute query tasks
    PWorker->>CLP: Read catalog / split-filter
    PWorker-->>PCoord: Return results
    PCoord-->>Client: Return aggregated results

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks (3 passed)

  • Description Check: ✅ Passed. Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title Check: ✅ Passed. The PR title 'feat(helm): Add Presto query engine support to the Helm chart.' clearly and specifically summarizes the main change—adding Presto as a query engine option to the Helm deployment chart, which aligns with the extensive additions across templates, documentation, and configuration files.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.




@coderabbitai bot left a comment


Actionable comments posted: 12

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/src/user-docs/guides-k8s-deployment.md`:
- Around line 359-362: Update the note text in the docs to correct the list of
components replaced when `query_engine` = "presto": remove "results cache" from
the excluded list and add "api-server"; reference the Helm gating logic when
editing the sentence so it reads that Presto replaces the "query scheduler,
query workers, reducers, and api-server" (since `query_scheduler`,
`query_workers`, and `reducers` are gated by `{{- if ne
.Values.clpConfig.package.query_engine \"presto\" }}` and the `api-server`
deployment is gated by `{{- if and .Values.clpConfig.api_server (ne
.Values.clpConfig.package.query_engine \"presto\") }}`), while `results-cache-*`
manifests are not conditionally removed.
- Around line 311-315: The prestoCoordinator and prestoWorker image definitions
currently use mutable tag values (tag: "dev"); update these to pinned/stable
release tags (e.g., tag: "vX.Y.Z") for both prestoCoordinator and prestoWorker
so deployments pull a fixed image; locate the tag fields under the
prestoCoordinator and prestoWorker blocks and replace "dev" with the chosen
semantic version string (or a chart value reference like {{
.Values.image.prestoCoordinator.tag }} and {{ .Values.image.prestoWorker.tag }}
if you prefer templating) to ensure reproducible deployments.

In `@docs/src/user-docs/guides-using-presto.md`:
- Around line 66-70: The docs show only a helm install command which will fail
when CLP is already deployed; update the step under the Presto install to
include the alternative helm upgrade invocation so users can perform an upgrade
when a release exists—specifically add guidance to run `helm upgrade clp clp/clp
DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml` as the alternative to `helm
install` (referencing the existing `helm install clp clp/clp
DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml` and the `presto-values.yaml`
values file).

In `@tools/deployment/package-helm/templates/configmap.yaml`:
- Around line 369-378: The JVM Xmx in presto-coordinator-config-jvm.config is
hardcoded as "-Xmx4G"; change it to render a Helm value so operators can
configure it: replace the literal "-Xmx4G" with a template that reads
.Values.clpConfig.presto.coordinator.jvm_max_heap (with a sensible default like
"4G"), ensure consumers pass the unit (e.g., "4G") or document it, and add the
new key plus a descriptive default and comment to values.yaml so the chart is
configurable and documented.
- Around line 347-348: The ConfigMap is embedding sensitive values
(clp.metadata-db-user, clp.metadata-db-password, clp.s3-access-key-id,
clp.s3-secret-access-key); remove these keys from templates/configmap.yaml and
instead create a Helm Secret template that stores those values (e.g.,
templates/secret.yaml with .Values.credentials.database and
.Values.credentials.s3 entries), then update consumers to read them from the
Secret (use env valueFrom.secretKeyRef, secret projection into a volume, or
template lookup/include to inject non-plain text values) so credentials are
stored as Kubernetes Secrets rather than literal ConfigMap entries.
- Line 397: The ConfigMap currently hardcodes AWS hostnames for the Presto S3
path via the clp.s3-end-point entry using {{ .s3_config.bucket }} and {{
.s3_config.region_code }}, which breaks S3-compatible stores; change the
template to prefer the configured endpoint_url on .s3_config (e.g.,
archive_output.storage.s3_config.endpoint_url / .s3_config.endpoint_url) when
present and fall back to the existing AWS-format URL only if endpoint_url is
empty, so Presto uses the explicit endpoint for MinIO/other S3-compatible
services.

In `@tools/deployment/package-helm/templates/presto-coordinator-deployment.yaml`:
- Around line 33-63: The presto-coordinator container lacks Kubernetes resource
requests/limits; add a configurable resources block for the container named
"presto-coordinator" (e.g., read values from a new or existing Helm key such as
.Values.prestoCoordinator.resources or .Values.resources.prestoCoordinator) and
populate both requests and limits for cpu and memory so they can be tuned
per-environment; update the chart values.yaml to include sensible defaults and
reference those values in the presto-coordinator deployment template.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml`:
- Around line 62-93: Add Kubernetes resource requests and limits to the
presto-worker container (container name "presto-worker") so CPU/memory are
enforced; update the deployment template to include a resources: block under
that container using Helm values (e.g. reference values like
.Values.resources.prestoWorker.requests.cpu,
.Values.resources.prestoWorker.requests.memory,
.Values.resources.prestoWorker.limits.cpu,
.Values.resources.prestoWorker.limits.memory or similar keys used elsewhere) and
ensure the memory request/limit aligns reasonably with the Presto config
(system_memory_gb/query_memory_gb) to avoid OOMKills; keep the block next to
ports/volumeMounts in the "presto-worker" container spec.
- Around line 38-55: The startup script that sets DISCOVERY_URI, calls wget into
VERSION and extracts PRESTO_VERSION can produce an empty PRESTO_VERSION (e.g.,
coordinator not ready) and currently would write a blank presto.version; update
the boot script (the block using set -e, DISCOVERY_URI, VERSION, PRESTO_VERSION)
to validate the coordinator response: implement a short retry loop with a
timeout that retries wget on ${DISCOVERY_URI}/v1/info a few times (or at least
reject empty output), extract PRESTO_VERSION only after non-empty VERSION, and
if PRESTO_VERSION is still empty log an error and exit non-zero (do not append a
blank presto.version); ensure the script does not proceed to echo
"presto.version=${PRESTO_VERSION}" unless PRESTO_VERSION is non-empty.
- Around line 69-86: The volume mount ordering for the Presto worker container
is subtle: the emptyDir mount named "presto-etc" at /opt/presto-server/etc must
be mounted before the child mounts ("presto-catalog" at
/opt/presto-server/etc/catalog and "presto-config" via subPath at
/opt/presto-server/etc/velox.properties) so init-container writes to the
emptyDir (config.properties, node.properties) remain visible; add a clear inline
comment near the volumeMounts block explaining this dependency and that
Kubernetes mounts parents before children (so do not reorder or replace
"presto-etc" with a deeper mount), referencing the mount names presto-etc,
presto-catalog, presto-config and the target path /opt/presto-server/etc.
- Around line 33-55: The init container "setup-configs" is using an unpinned
busybox image and a brittle sed-based JSON parse for PRESTO_VERSION; change the
image (in the "setup-configs" container spec) from busybox:latest to a pinned
image that includes jq (or make the tag configurable via values.yaml), and
replace the sed pipeline that sets PRESTO_VERSION (the commands that assign
VERSION and PRESTO_VERSION) to use a robust JSON parser like jq (e.g., fetch
DISCOVERY_URI "/v1/info" and pipe to jq -r .version) and add a simple check to
fail or default if PRESTO_VERSION is empty so the echo "presto.version=..." step
is reliable.

In `@tools/deployment/package-helm/values.yaml`:
- Around line 20-27: The Helm values set mutable dev tags for Presto images:
update prestoCoordinator.tag and prestoWorker.tag to use a stable release tag
convention (e.g., match clpPackage.tag which uses "main" or a semantic version)
so defaults are deterministic; modify the values for prestoCoordinator and
prestoWorker to a stable tag string and document in values.yaml comment that
operators should override tag for development workflows, ensuring consistency
across clpPackage, prestoCoordinator, and prestoWorker.
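
Several of the findings above concern the worker init script's version detection. A minimal sketch of the suggested hardening, with retries and an empty-output guard; fetch_info stands in for the real wget call, and the JSON parsing here is an assumption, not the chart's actual script:

```shell
#!/bin/sh
# Fetch the coordinator version with retries; never emit an empty version.
# DISCOVERY_URI is assumed to be set by the chart, as in the init container.
fetch_info() {
    wget -qO- "${DISCOVERY_URI}/v1/info"
}

get_presto_version() {
    attempt=0
    while [ "$attempt" -lt 5 ]; do
        # Extract the innermost "version" string from the JSON response.
        version=$(fetch_info 2>/dev/null \
            | sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
        if [ -n "$version" ]; then
            echo "$version"
            return 0
        fi
        attempt=$((attempt + 1))
        sleep 2
    done
    echo "ERROR: coordinator gave no version at ${DISCOVERY_URI}/v1/info" >&2
    return 1
}

# Only append presto.version once we know it is non-empty, e.g.:
#   PRESTO_VERSION=$(get_presto_version) || exit 1
#   echo "presto.version=${PRESTO_VERSION}" >> "${ETC_DIR}/config.properties"
```

With this shape, the init container exits non-zero instead of writing a blank presto.version when the coordinator is not yet reachable.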

Comment on lines +359 to +362
:::{note}
When `query_engine` is set to `"presto"`, the chart deploys a Presto coordinator and Presto
worker(s) instead of the query scheduler, query workers, reducers, and results cache.
:::

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for conditional logic around results_cache, api-server, and other components
# in the Helm templates to determine what's excluded when query_engine="presto"

echo "=== Searching for results_cache conditional logic ==="
rg -n 'results.cache|resultsCache' tools/deployment/package-helm/templates/ \
  -A 5 -B 5 --type yaml

echo -e "\n=== Searching for api-server conditional logic ==="
rg -n 'api.server|apiServer|api_server' tools/deployment/package-helm/templates/ \
  -A 5 -B 5 --type yaml

echo -e "\n=== Searching for query_engine conditional logic ==="
rg -n 'query_engine|queryEngine' tools/deployment/package-helm/templates/ \
  -A 5 -B 5 --type yaml

Repository: y-scope/clp

Length of output: 50367


Note is inaccurate: results_cache should not be listed as excluded, and api-server is missing.

The note states Presto replaces "the query scheduler, query workers, reducers, and results cache", but:

  1. The Helm templates show no conditional gating on query_engine for any results-cache-* manifests (service, statefulset, indices-creator job), confirming it remains deployed when Presto is enabled — only with retention disabled.
  2. The api-server deployment is actually gated by {{- if and .Values.clpConfig.api_server (ne .Values.clpConfig.package.query_engine "presto") }}, so it is excluded and should be mentioned.
  3. The query scheduler, query workers, and reducers are all gated by {{- if ne .Values.clpConfig.package.query_engine "presto" }}, confirming they are excluded.

Update the note to list: query scheduler, query workers, reducers, and api-server—not results cache.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/user-docs/guides-k8s-deployment.md` around lines 359 - 362, Update
the note text in the docs to correct the list of components replaced when
`query_engine` = "presto": remove "results cache" from the excluded list and add
"api-server"; reference the Helm gating logic when editing the sentence so it
reads that Presto replaces the "query scheduler, query workers, reducers, and
api-server" (since `query_scheduler`, `query_workers`, and `reducers` are gated
by `{{- if ne .Values.clpConfig.package.query_engine \"presto\" }}` and the
`api-server` deployment is gated by `{{- if and .Values.clpConfig.api_server (ne
.Values.clpConfig.package.query_engine \"presto\") }}`), while `results-cache-*`
manifests are not conditionally removed.

Comment on lines +66 to +70
2. Install (or upgrade) the Helm chart with the Presto values:

```bash
helm install clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
```

⚠️ Potential issue | 🟡 Minor

helm install will fail for users with an existing CLP deployment.

The requirement at line 33 explicitly states the user already has "a running CLP Kubernetes deployment," yet step 2 only shows helm install, which would error out with release already exists. The parenthetical "(or upgrade)" acknowledges the upgrade path but never shows the corresponding command.

💡 Suggested fix
-2. Install (or upgrade) the Helm chart with the Presto values:
+2. If you haven't installed CLP yet, install it with Presto enabled:
 
    ```bash
    helm install clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
    ```
+
+   If CLP is already installed, upgrade the release instead:
+
+   ```bash
+   helm upgrade clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f presto-values.yaml
+   ```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/user-docs/guides-using-presto.md` around lines 66 - 70, The docs
show only a helm install command which will fail when CLP is already deployed;
update the step under the Presto install to include the alternative helm upgrade
invocation so users can perform an upgrade when a release exists—specifically
add guidance to run `helm upgrade clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f
presto-values.yaml` as the alternative to `helm install` (referencing the
existing `helm install clp clp/clp DOCS_VAR_HELM_VERSION_FLAG -f
presto-values.yaml` and the `presto-values.yaml` values file).

Comment on lines +347 to +348
clp.metadata-db-user={{ $.Values.credentials.database.username }}
clp.metadata-db-password={{ $.Values.credentials.database.password }}
Contributor

⚠️ Potential issue | 🟠 Major

Credentials stored in plaintext in a ConfigMap — use a Secret instead.

clp.metadata-db-password (and clp.metadata-db-user) are written directly into the ConfigMap. ConfigMaps are stored unencrypted in etcd by default, are readable by any principal with get/list on ConfigMaps in the namespace, and are fully visible in kubectl get configmap -o yaml. Kubernetes Secrets are the appropriate resource for sensitive values: they can be encrypted at rest, carry separate RBAC controls, and are omitted from audit logs by default.

The same concern applies to the S3 credentials at lines 396/398 (clp.s3-access-key-id, clp.s3-secret-access-key).

Consider rendering these files from a Kubernetes Secret (or multiple projected volume mounts) rather than embedding credentials in the ConfigMap.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/configmap.yaml` around lines 347 -
348, The ConfigMap is embedding sensitive values (clp.metadata-db-user,
clp.metadata-db-password, clp.s3-access-key-id, clp.s3-secret-access-key);
remove these keys from templates/configmap.yaml and instead create a Helm Secret
template that stores those values (e.g., templates/secret.yaml with
.Values.credentials.database and .Values.credentials.s3 entries), then update
consumers to read them from the Secret (use env valueFrom.secretKeyRef, secret
projection into a volume, or template lookup/include to inject non-plain text
values) so credentials are stored as Kubernetes Secrets rather than literal
ConfigMap entries.
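A minimal sketch of the Secret-based approach, assuming the existing `.Values.credentials.database` keys (the file name, Secret name, and key names below are illustrative):

```yaml
# templates/secret.yaml (hypothetical file)
apiVersion: v1
kind: Secret
metadata:
  name: {{ include "clp.fullname" . }}-db-credentials
type: Opaque
stringData:
  metadata-db-user: {{ .Values.credentials.database.username | quote }}
  metadata-db-password: {{ .Values.credentials.database.password | quote }}
```

Consumers would then read these via `valueFrom.secretKeyRef` or mount the Secret as a projected volume instead of rendering the values into the ConfigMap; the same pattern applies to the S3 keys.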

Comment on lines +369 to +378
presto-coordinator-config-jvm.config: |
-server
-Xmx4G
-XX:+UseG1GC
-XX:G1HeapRegionSize=32M
-XX:+UseGCOverheadLimit
-XX:+ExplicitGCInvokesConcurrent
-XX:+HeapDumpOnOutOfMemoryError
-XX:+ExitOnOutOfMemoryError
-Djdk.attach.allowAttachSelf=true
Contributor

🧹 Nitpick | 🔵 Trivial

Hardcoded -Xmx4G should be a configurable value.

The coordinator JVM heap is fixed at 4 GB regardless of the resource limits/requests set on the coordinator pod. If the pod is scheduled with less memory, it will OOM-kill; if scheduled with more, the heap cannot grow to use it. Expose this as a value (e.g., .Values.clpConfig.presto.coordinator.jvm_max_heap) so operators can tune it alongside query.max-memory and query.max-memory-per-node.

♻️ Suggested approach
  presto-coordinator-config-jvm.config: |
    -server
-   -Xmx4G
+   -Xmx{{ .coordinator.jvm_max_heap_gb }}G
    -XX:+UseG1GC
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/configmap.yaml` around lines 369 -
378, The JVM Xmx in presto-coordinator-config-jvm.config is hardcoded as
"-Xmx4G"; change it to render a Helm value so operators can configure it:
replace the literal "-Xmx4G" with a template that reads
.Values.clpConfig.presto.coordinator.jvm_max_heap (with a sensible default like
"4G"), ensure consumers pass the unit (e.g., "4G") or document it, and add the
new key plus a descriptive default and comment to values.yaml so the chart is
configurable and documented.

Comment on lines +33 to +55
- name: "setup-configs"
image: "busybox:latest"
command:
- "/bin/sh"
- "-c"
- |
set -e
cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

# Query coordinator for version and add it to config
DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
PRESTO_VERSION=$(echo "$VERSION" \
| sed 's/.*"version":"//' \
| sed 's/".*//')
echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties

# Set node identity from hostname
HOSTNAME=$(hostname)
HOST_IP=$(hostname -i)
echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties
Contributor

⚠️ Potential issue | 🟠 Major

Fragile JSON parsing with sed and unpinned busybox:latest image.

Two concerns in this init container:

  1. busybox:latest — Using the latest tag hurts reproducibility and can introduce unexpected breakages. Pin to a specific version (e.g., busybox:1.37), or make it configurable via values.

  2. JSON parsing via sed (Lines 46–48) — Extracting version from a JSON response using sed 's/.*"version":"//' | sed 's/".*//' is brittle. If the JSON structure changes (e.g., field ordering, nested version keys, or whitespace variations), parsing will silently produce incorrect results. Since busybox doesn't include jq, consider switching to an image that has jq, or using a more robust awk/grep pattern.

Proposed fix for image tag pinning and safer parsing
         - name: "setup-configs"
-          image: "busybox:latest"
+          image: "busybox:1.37"
           command:
             - "/bin/sh"
             - "-c"
             - |
               set -e
               cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
               cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

               # Query coordinator for version and add it to config
               DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
               VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
-              PRESTO_VERSION=$(echo "$VERSION" \
-                | sed 's/.*"version":"//' \
-                | sed 's/".*//')
+              PRESTO_VERSION=$(echo "$VERSION" \
+                | grep -o '"version":"[^"]*"' \
+                | head -1 \
+                | sed 's/"version":"//;s/"//')
               echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 33 - 55, The init container "setup-configs" is using an unpinned busybox
image and a brittle sed-based JSON parse for PRESTO_VERSION; change the image
(in the "setup-configs" container spec) from busybox:latest to a pinned image
that includes jq (or make the tag configurable via values.yaml), and replace the
sed pipeline that sets PRESTO_VERSION (the commands that assign VERSION and
PRESTO_VERSION) to use a robust JSON parser like jq (e.g., fetch DISCOVERY_URI
"/v1/info" and pipe to jq -r .version) and add a simple check to fail or default
if PRESTO_VERSION is empty so the echo "presto.version=..." step is reliable.
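For illustration, the key-anchored extraction behaves like this on a sample payload (the payload shape is an assumption about Presto's `/v1/info` response, not taken from the chart):

```shell
# Sample /v1/info-style payload; field order deliberately differs from the
# happy path the sed pipeline assumes.
VERSION='{"environment":"production","nodeVersion":{"version":"0.293"},"coordinator":true}'

# Anchor on the quoted "version" key instead of stripping everything around it,
# so reordered or extra fields cannot corrupt the result.
PRESTO_VERSION=$(echo "$VERSION" \
  | grep -o '"version":"[^"]*"' \
  | head -n 1 \
  | sed 's/"version":"//;s/"$//')

echo "$PRESTO_VERSION"   # prints 0.293
```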

Comment on lines +38 to +55
- |
set -e
cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

# Query coordinator for version and add it to config
DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
PRESTO_VERSION=$(echo "$VERSION" \
| sed 's/.*"version":"//' \
| sed 's/".*//')
echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties

# Set node identity from hostname
HOSTNAME=$(hostname)
HOST_IP=$(hostname -i)
echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties
Contributor

⚠️ Potential issue | 🟡 Minor

No error handling if the coordinator version query fails.

If wget on Line 45 fails (e.g., coordinator not yet ready, network issue), $VERSION will be empty and $PRESTO_VERSION will be an empty string. This would write presto.version= to the config, likely causing the worker to fail with a confusing error. While set -e is present, wget returning an empty body with HTTP 200 would not trigger it.

Consider adding a validation check:

Proposed validation
               VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
               PRESTO_VERSION=$(echo "$VERSION" \
                 | sed 's/.*"version":"//' \
                 | sed 's/".*//')
+              if [ -z "$PRESTO_VERSION" ]; then
+                echo "ERROR: Failed to retrieve Presto version from coordinator" >&2
+                exit 1
+              fi
               echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 38 - 55, The startup script that sets DISCOVERY_URI, calls wget into
VERSION and extracts PRESTO_VERSION can produce an empty PRESTO_VERSION (e.g.,
coordinator not ready) and currently would write a blank presto.version; update
the boot script (the block using set -e, DISCOVERY_URI, VERSION, PRESTO_VERSION)
to validate the coordinator response: implement a short retry loop with a
timeout that retries wget on ${DISCOVERY_URI}/v1/info a few times (or at least
reject empty output), extract PRESTO_VERSION only after non-empty VERSION, and
if PRESTO_VERSION is still empty log an error and exit non-zero (do not append a
blank presto.version); ensure the script does not proceed to echo
"presto.version=${PRESTO_VERSION}" unless PRESTO_VERSION is non-empty.
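A generic retry helper along the lines the comment describes might look like this (the function name is illustrative, and `echo` stands in for the real `wget` call so the sketch is self-contained):

```shell
# Run a command until it succeeds with non-empty stdout, up to a fixed number
# of attempts with a fixed delay between them; fail if all attempts exhaust.
retry_nonempty() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    out=$("$@" 2>/dev/null) && [ -n "$out" ] && { echo "$out"; return 0; }
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# In the init container this would wrap the coordinator query, e.g.:
#   VERSION=$(retry_nonempty 10 5 wget -q -O - "${DISCOVERY_URI}/v1/info") || exit 1
# A stand-in command demonstrates the success path:
VERSION=$(retry_nonempty 3 1 echo '{"version":"0.293"}') || exit 1
echo "$VERSION"
```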

Comment on lines +62 to +93
containers:
- name: "presto-worker"
image: "{{ .Values.image.prestoWorker.repository }}:{{ .Values.image.prestoWorker.tag }}"
imagePullPolicy: "{{ .Values.image.prestoWorker.pullPolicy }}"
ports:
- name: "presto-worker"
containerPort: 8080
volumeMounts:
- name: "presto-catalog"
mountPath: "/opt/presto-server/etc/catalog"
readOnly: true
- name: "presto-config"
mountPath: "/opt/presto-server/etc/velox.properties"
subPath: "presto-worker-config-velox.properties"
readOnly: true
- name: "presto-etc"
mountPath: "/opt/presto-server/etc"
{{- if eq .Values.clpConfig.archive_output.storage.type "fs" }}
- name: {{ include "clp.volumeName" (dict
"component_category" "shared-data"
"name" "archives"
) | quote }}
mountPath: "/var/data/archives"
readOnly: true
{{- end }}
readinessProbe:
{{- include "clp.readinessProbeTimings" . | nindent 12 }}
tcpSocket: &presto-worker-health-check
port: "presto-worker"
livenessProbe:
{{- include "clp.livenessProbeTimings" . | nindent 12 }}
tcpSocket: *presto-worker-health-check
Contributor

🧹 Nitpick | 🔵 Trivial

No resources (requests/limits) defined on the worker container.

Same concern as the coordinator: the worker container has no CPU/memory resource requests or limits. This is especially important for Presto workers, which can be memory-intensive. The system_memory_gb and query_memory_gb values are set in the Presto config but not enforced at the Kubernetes level, which could lead to OOMKills or overcommitment.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 62 - 93, Add Kubernetes resource requests and limits to the presto-worker
container (container name "presto-worker") so CPU/memory are enforced; update
the deployment template to include a resources: block under that container using
Helm values (e.g. reference values like
.Values.resources.prestoWorker.requests.cpu,
.Values.resources.prestoWorker.requests.memory,
.Values.resources.prestoWorker.limits.cpu,
.Values.resources.prestoWorker.limits.memory or similar keys used elsewhere) and
ensure the memory request/limit aligns reasonably with the Presto config
(system_memory_gb/query_memory_gb) to avoid OOMKills; keep the block next to
ports/volumeMounts in the "presto-worker" container spec.
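As a sketch, the resources block could be driven from chart values like these (the `prestoWorker` key names are illustrative, mirroring the pattern the prompt suggests):

```yaml
# In the "presto-worker" container spec:
resources:
  requests:
    cpu: {{ .Values.resources.prestoWorker.requests.cpu | quote }}
    memory: {{ .Values.resources.prestoWorker.requests.memory | quote }}
  limits:
    cpu: {{ .Values.resources.prestoWorker.limits.cpu | quote }}
    memory: {{ .Values.resources.prestoWorker.limits.memory | quote }}
```

The memory limit should sit above `system_memory_gb` so the Presto memory pools fit inside what Kubernetes enforces.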

Comment on lines +69 to +86
volumeMounts:
- name: "presto-catalog"
mountPath: "/opt/presto-server/etc/catalog"
readOnly: true
- name: "presto-config"
mountPath: "/opt/presto-server/etc/velox.properties"
subPath: "presto-worker-config-velox.properties"
readOnly: true
- name: "presto-etc"
mountPath: "/opt/presto-server/etc"
{{- if eq .Values.clpConfig.archive_output.storage.type "fs" }}
- name: {{ include "clp.volumeName" (dict
"component_category" "shared-data"
"name" "archives"
) | quote }}
mountPath: "/var/data/archives"
readOnly: true
{{- end }}
Contributor

🧹 Nitpick | 🔵 Trivial

Volume mount layering at /opt/presto-server/etc — verify correctness.

The main container mounts:

  1. presto-catalog (configMap) at /opt/presto-server/etc/catalog
  2. presto-config (configMap, subPath) at /opt/presto-server/etc/velox.properties
  3. presto-etc (emptyDir) at /opt/presto-server/etc

Kubernetes sorts mounts by path length (parent before child), so the emptyDir is mounted first, then the catalog and velox.properties are overlaid. The init container's writes (config.properties, node.properties) live in the emptyDir and should remain visible. This layering should work, but it's subtle and easy to break during future refactors.

Consider adding a comment in the template explaining the mount ordering dependency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 69 - 86, The volume mount ordering for the Presto worker container is
subtle: the emptyDir mount named "presto-etc" at /opt/presto-server/etc must be
mounted before the child mounts ("presto-catalog" at
/opt/presto-server/etc/catalog and "presto-config" via subPath at
/opt/presto-server/etc/velox.properties) so init-container writes to the
emptyDir (config.properties, node.properties) remain visible; add a clear inline
comment near the volumeMounts block explaining this dependency and that
Kubernetes mounts parents before children (so do not reorder or replace
"presto-etc" with a deeper mount), referencing the mount names presto-etc,
presto-catalog, presto-config and the target path /opt/presto-server/etc.
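The suggested inline comment could look like this in the template (wording illustrative):

```yaml
volumeMounts:
  # NOTE: "presto-etc" (emptyDir) must remain the parent mount at
  # /opt/presto-server/etc. Kubernetes mounts parents before children, so the
  # init container's config.properties and node.properties written into the
  # emptyDir stay visible beneath the "presto-catalog" and "presto-config"
  # overlays. Do not reorder or replace this mount with a deeper path.
  - name: "presto-etc"
    mountPath: "/opt/presto-server/etc"
```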

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tools/deployment/package-helm/Chart.yaml`:
- Line 3: The Chart.yaml pre-release version was mistakenly advanced to
"0.1.4-dev.4", skipping "0.1.4-dev.3"; update the version field in Chart.yaml
from "0.1.4-dev.4" to "0.1.4-dev.3" so the pre-release sequence remains
contiguous (look for the existing version: "0.1.4-dev.4" and replace it with
"0.1.4-dev.3").

apiVersion: "v2"
name: "clp"
-version: "0.1.4-dev.2"
+version: "0.1.4-dev.4"
Contributor

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check whether 0.1.4-dev.3 is used in any branch or tag
git grep "0.1.4-dev.3" $(git for-each-ref --format='%(refname)' refs/heads refs/remotes refs/tags) -- "tools/deployment/package-helm/Chart.yaml" 2>/dev/null || echo "Not found on any ref"

Repository: y-scope/clp

Length of output: 75


Change version to 0.1.4-dev.3 to maintain sequential pre-release versioning.

The version skips from 0.1.4-dev.2 directly to 0.1.4-dev.4, but 0.1.4-dev.3 does not exist on any branch or tag in the repository. This skip appears unintentional and breaks the contiguous pre-release sequence.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/Chart.yaml` at line 3, The Chart.yaml
pre-release version was mistakenly advanced to "0.1.4-dev.4", skipping
"0.1.4-dev.3"; update the version field in Chart.yaml from "0.1.4-dev.4" to
"0.1.4-dev.3" so the pre-release sequence remains contiguous (look for the
existing version: "0.1.4-dev.4" and replace it with "0.1.4-dev.3").

@@ -1,4 +1,4 @@
-{{- if .Values.clpConfig.api_server }}
+{{- if and .Values.clpConfig.api_server (ne .Values.clpConfig.package.query_engine "presto") }}
Contributor

why do we disable api server? Shouldn't we also disable query scheduler and worker if clp search is not needed?

Member Author

currently, the API server does not work with Presto.

we do have this specified in tools/deployment/package-helm/templates/query-scheduler-deployment.yaml as well

{{- if ne .Values.clpConfig.package.query_engine "presto" }}

though, given some users may want to use both the API Server and Presto, i think we may discard .Values.clpConfig.package.query_engine values and simply check whether query_scheduler / query_worker / API is null before launching the services. what do you think?

cc @kirkrodrigues for discussion since this is interface related

@@ -17,6 +17,14 @@ image:
repository: "ghcr.io/y-scope/clp/clp-package"
Contributor

I'm getting a Presto internal error, possibly due to the recent timestamp changes in f82e611. Can you reproduce the error?

Member Author

🤦 right. i can reproduce once i merge main into the current branch.

@coderabbitai note #1788 is a BREAKING change that broke CLP Presto, as in the CLP Presto Connector cannot handle the new Timestamp column type at the moment. submit an issue to track

Contributor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Get details about PR `#1788` to understand the breaking change
gh pr view 1788 --json title,url,mergedAt,body --repo y-scope/clp

# Create an issue to track the breaking change
gh issue create \
  --repo y-scope/clp \
  --title "Presto Connector cannot handle new Timestamp column type from PR `#1788`" \
  --body "## Issue Description

The CLP Presto Connector currently cannot handle the new Timestamp column type introduced in PR `#1788`, causing internal errors when querying with Presto.

## Context

- **Breaking PR**: `#1788` (commit f82e6114160a6addd4727259906bcf621ac9912c)
- **Affected Component**: CLP Presto Connector
- **Impact**: Presto queries fail with internal errors

The issue was discovered during testing of PR `#2004` (Presto Helm chart support). The error can be reproduced after merging main into the Presto feature branch.

## References

- PR `#1788`: https://github.com/y-scope/clp/pull/1788
- PR `#2004`: https://github.com/y-scope/clp/pull/2004
- Comment: https://github.com/y-scope/clp/pull/2004#discussion_r2821243815

Reported by: `@junhaoliao`" \
  --assignee junhaoliao

Length of output: 5337


@junhaoliao, I've created issue #2008 to track this breaking change. The issue documents that PR #1788 (which introduced the new Timestamp column type to replace DateString and bumped the archive version to 0.5.0) broke the CLP Presto Connector's ability to handle the new column type. The issue has been assigned to you and includes references to both PRs and the relevant commit.



🧠 Learnings used
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1818
File: tools/deployment/package-helm/templates/configmap.yaml:12-12
Timestamp: 2025-12-19T05:03:42.629Z
Learning: In the y-scope/clp Helm chart, the `api_server.host` configuration field in the ConfigMap (tools/deployment/package-helm/templates/configmap.yaml) is unused because no other k8s internal services need to reach the API server—it's only accessed from outside the cluster via NodePort.


Comment on lines +36 to +55
- "/bin/sh"
- "-c"
- |
set -e
cp /etc/presto-config/presto-worker-config-config.properties /opt/presto-server/etc/config.properties
cp /etc/presto-config/presto-worker-config-node.properties /opt/presto-server/etc/node.properties

# Query coordinator for version and add it to config
DISCOVERY_URI="http://{{ include "clp.fullname" . }}-presto-coordinator:8080"
VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info")
PRESTO_VERSION=$(echo "$VERSION" \
| sed 's/.*"version":"//' \
| sed 's/".*//')
echo "presto.version=${PRESTO_VERSION}" >> /opt/presto-server/etc/config.properties

# Set node identity from hostname
HOSTNAME=$(hostname)
HOST_IP=$(hostname -i)
echo "node.internal-address=${HOST_IP}" >> /opt/presto-server/etc/node.properties
echo "node.id=${HOSTNAME}" >> /opt/presto-server/etc/node.properties
Contributor

@hoophalab hoophalab Feb 18, 2026

This script looks scary. Does it mimic the functionality of tools/deployment/presto-clp/worker/scripts/generate-configs.sh? I guess we don't use the script directly because of missing dependencies?

Member Author

we can't reuse generate-configs.sh directly because it also does apt-get, env-var substitution, and catalog management, all of which are unnecessary in Helm since configmap/volume mounts handle that

extracting the shared bits is possible but the indirection cost (.Files.Get, symlinks into the chart dir) outweighs the benefit for such a small surface

- Replace `query_engine` conditions with modular flags in Helm templates.
- Refactor Presto configurations to use dedicated flags and scripts.
- Enhance worker and reducer node scheduling logic.
- Update documentation to reflect Presto integration and better configuration flexibility.
@junhaoliao
Member Author

TODO: address #1957 since we're updating the Presto setup docs in this PR.

@junhaoliao junhaoliao added this to the March 2026 milestone Mar 7, 2026
junhaoliao and others added 3 commits March 11, 2026 01:00
Resolve conflicts:
- Chart.yaml: Accept main's version (0.2.1-dev.0)
- values.yaml: Keep prestoWorker section, drop storage.storageClassName
  (host-path PVs removed on main)
- set-up-multi-{dedicated,shared}-test.sh: Combine Presto node support with
  main's get_image_helm_args
- guides-k8s-deployment.md: Keep Presto link references
- Accept deletion of all host-path PV/PVC files (query-scheduler-logs,
  query-worker-logs, query-worker-staged-streams, reducer-logs)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

♻️ Duplicate comments (2)
docs/src/user-docs/guides-using-presto.md (1)

33-36: ⚠️ Potential issue | 🟡 Minor

The Helm flow still contradicts itself for existing releases.

This section says the Kubernetes path requires a running CLP deployment, but the only command shown here is helm install, which will fail once that release already exists. Either drop the prerequisite for fresh installs, or show the matching helm upgrade command beside this step.

Also applies to: 83-87

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/src/user-docs/guides-using-presto.md` around lines 33 - 36, The
"Requirements" section currently states a prerequisite of "A running CLP
Kubernetes deployment" but the docs only show a `helm install` command which
fails for existing releases; update the documentation under the "Requirements"
heading (and the similar block at lines referenced for the second occurrence) to
either remove the running-deployment prerequisite for fresh installs or,
preferably, replace/augment the shown `helm install` with the idempotent Helm
command and an explicit upgrade example — e.g., show `helm upgrade --install
<release> <chart> ...` (or show both `helm install ...` and the corresponding
`helm upgrade` example) next to the existing `helm install` so the guide works
for both fresh and existing releases.
tools/deployment/package-helm/templates/presto-worker-deployment.yaml (1)

33-35: ⚠️ Potential issue | 🟠 Major

Pin the setup image and stop relying on :latest for the shell toolchain.

setup-configs no longer uses kubectl, but it does depend on whatever bitnami/kubectl:latest happens to ship for sh, cp, wget, and jq. That makes worker start-up non-reproducible and easy to break on upstream image changes. Please move this image/tag into values and pin it to a known build that includes the exact tools the script needs.

Does the `bitnami/kubectl` container image include `jq` and `wget`, and which pinned tags are currently published so Helm charts can avoid using `:latest`?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml` around
lines 33 - 35, Replace the hardcoded image "bitnami/kubectl:latest" for the
"setup-configs" container with a chart value (e.g. .Values.setupConfig.image and
.Values.setupConfig.tag or a single .Values.setupConfig.imageWithTag) and set
the values.yaml default to a pinned tag/digest that you have verified contains
/bin/sh, cp, wget and jq; keep the command as ["/bin/sh",
"/scripts/presto-worker-setup-configs.sh"]. Update chart README/values
description to note the image must include jq and wget and add a comment
reminding future updaters to pin to a specific tag or digest rather than
:latest. Ensure the template uses the new values symbol names so upgrades can
change the image via values.yaml.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/src/user-docs/guides-k8s-deployment.md`:
- Line 664: Remove the unused markdown link reference "[logging-infra-issue]"
from the document: locate the link definition line that reads
"[logging-infra-issue]: https://github.com/y-scope/clp/issues/1760" and delete
it (and verify there are no remaining usages of the identifier in the file such
as "[logging-infra-issue]" inline links); this resolves the MD053 markdownlint
warning.

In `@tools/deployment/package-helm/set-up-multi-dedicated-test.sh`:
- Around line 13-17: NUM_PRESTO_NODES is defaulting to 2 and is counted in
total_workers even when Presto mode is not enabled, leaving unused Presto nodes;
change the default to 0 (NUM_PRESTO_NODES="${NUM_PRESTO_NODES:-0}") or,
alternatively, when NUM_PRESTO_NODES or PRESTO_WORKER_REPLICAS is set >0 switch
the Helm install/values into Presto mode so pods can be scheduled; update the
same defaults/logic for the other occurrences (lines referenced around 37-38 and
58-78) and ensure total_workers calculation and kind node pool labeling only
include Presto nodes when Presto mode is enabled.

In `@tools/deployment/package-helm/set-up-multi-shared-test.sh`:
- Line 15: The PRESTO_WORKER_REPLICAS variable is defined but never applied to
the Helm values, so setting it only changes output; update
set-up-multi-shared-test.sh to pass Presto-specific values when
PRESTO_WORKER_REPLICAS is set by explicitly setting
clpConfig.package.query_engine=presto and ensuring clpConfig.presto is non-null
and prestoWorker.replicas uses PRESTO_WORKER_REPLICAS in the Helm
install/upgrade values block (the install command that currently relies on chart
defaults). Locate references to PRESTO_WORKER_REPLICAS, prestoWorker.replicas,
clpConfig.package.query_engine, and clpConfig.presto in the script and add the
corresponding Helm --set entries (or values YAML injection) so the chart
actually deploys Presto with the requested replica count.

In `@tools/deployment/package-helm/templates/configmap.yaml`:
- Around line 129-130: The config mix allows mismatched flags, causing
split-brain: introduce a single predicate variable (e.g., $prestoEnabled := eq
.Values.clpConfig.package.query_engine "presto") at the top of the Helm template
and use that variable everywhere that currently mixes
.Values.clpConfig.package.query_engine, .Values.clpConfig.query_*, and
.Values.clpConfig.presto; replace the existing conditionals in the CLP-S legacy
blocks (the sections referencing ClpQueryEngine, PrestoHost, presto/PrestoHost)
and the Presto blocks with checks against $prestoEnabled so both the configmap
entries and the Presto deployment templates are gated by the same boolean.
Ensure every place noted in the review (around package.query_engine and the
blocks spanning lines ~132-158, ~227-233, ~326-334, and the single-line block
~337) uses that single $prestoEnabled predicate.
- Around line 388-395: The S3 block under the with
$.Values.clpConfig.archive_output.storage blindly dereferences
.s3_config.aws_authentication.credentials.{access_key_id,secret_access_key}
which breaks for non-static auth shapes; update the template around the eq .type
"s3" branch to validate that .s3_config.aws_authentication is the
static/credentials form (e.g. credentials exist and no session_token/profile
type) and call the Helm fail function with a clear message if it is not, or
alternatively only render the clp.s3-access-key-id/clp.s3-secret-access-key and
clp.s3-auth-provider=clp_package properties when the credentials fields are
present; reference the $.Values.clpConfig.archive_output.storage /
.s3_config.aws_authentication and the clp.s3-access-key-id /
clp.s3-secret-access-key symbols when making the change.

In `@tools/deployment/package-helm/templates/query-scheduler-deployment.yaml`:
- Line 1: The deployment/service templates currently only check
.Values.clpConfig.query_scheduler which stays truthy when package.query_engine
is changed to "presto"; update the top-level conditional in both
query-scheduler-deployment.yaml and query-scheduler-service.yaml to also ensure
.Values.package.query_engine is not "presto" (e.g., add an AND check that
.Values.package.query_engine != "presto" using Helm's and/ne helpers) so the
scheduler is not rendered when query_engine is set to "presto".

---

Duplicate comments:
In `@docs/src/user-docs/guides-using-presto.md`:
- Around line 33-36: The "Requirements" section currently states a prerequisite
of "A running CLP Kubernetes deployment" but the docs only show a `helm install`
command which fails for existing releases; update the documentation under the
"Requirements" heading (and the similar block at lines referenced for the second
occurrence) to either remove the running-deployment prerequisite for fresh
installs or, preferably, replace/augment the shown `helm install` with the
idempotent Helm command and an explicit upgrade example — e.g., show `helm
upgrade --install <release> <chart> ...` (or show both `helm install ...` and
the corresponding `helm upgrade` example) next to the existing `helm install` so
the guide works for both fresh and existing releases.
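A minimal sketch of the idempotent command the comment suggests. The release name is a placeholder and the `--set` flag is one example value; the `echo` keeps the sketch runnable without a cluster:

```shell
RELEASE="clp"                            # placeholder release name
CHART="tools/deployment/package-helm"    # chart path from this repo

# `helm upgrade --install` installs when the release is absent and upgrades
# it otherwise, so one command covers both fresh and existing deployments.
CMD="helm upgrade --install ${RELEASE} ${CHART} --set clpConfig.package.query_engine=presto"
echo "${CMD}"
```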

In `@tools/deployment/package-helm/templates/presto-worker-deployment.yaml`:
- Around line 33-35: Replace the hardcoded image "bitnami/kubectl:latest" for
the "setup-configs" container with a chart value (e.g. .Values.setupConfig.image
and .Values.setupConfig.tag or a single .Values.setupConfig.imageWithTag) and
set the values.yaml default to a pinned tag/digest that you have verified
contains /bin/sh, cp, wget and jq; keep the command as ["/bin/sh",
"/scripts/presto-worker-setup-configs.sh"]. Update chart README/values
description to note the image must include jq and wget and add a comment
reminding future updaters to pin to a specific tag or digest rather than
:latest. Ensure the template uses the new values symbol names so upgrades can
change the image via values.yaml.
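A possible values.yaml shape for the suggested change. The key names are a suggestion and the tag below is an example only — verify that any pinned tag or digest actually ships /bin/sh, cp, wget and jq before adopting it:

```yaml
# Illustrative only; pin to a tag or digest verified to contain
# /bin/sh, cp, wget and jq — never :latest.
setupConfig:
  image: "bitnami/kubectl"
  tag: "1.31.2"   # example tag, not a verified one
```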

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 480ad11d-e1af-46f0-a62d-35394755466a

📥 Commits

Reviewing files that changed from the base of the PR and between 8089254 and 5fc55fc.

📒 Files selected for processing (11)
  • docs/src/user-docs/guides-k8s-deployment.md
  • docs/src/user-docs/guides-using-presto.md
  • tools/deployment/package-helm/.set-up-common.sh
  • tools/deployment/package-helm/Chart.yaml
  • tools/deployment/package-helm/set-up-multi-dedicated-test.sh
  • tools/deployment/package-helm/set-up-multi-shared-test.sh
  • tools/deployment/package-helm/templates/configmap.yaml
  • tools/deployment/package-helm/templates/presto-coordinator-deployment.yaml
  • tools/deployment/package-helm/templates/presto-coordinator-service.yaml
  • tools/deployment/package-helm/templates/presto-worker-deployment.yaml
  • tools/deployment/package-helm/templates/query-scheduler-deployment.yaml

[kind]: https://kind.sigs.k8s.io/
[kubeadm]: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
[kubectl]: https://kubernetes.io/docs/tasks/tools/
[logging-infra-issue]: https://github.com/y-scope/clp/issues/1760
⚠️ Potential issue | 🟡 Minor

Remove the stale link reference.

[logging-infra-issue] is not referenced anywhere in this document, so markdownlint will keep flagging MD053 until it is deleted.

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 664-664: Link and image reference definitions should be needed
Unused link or image reference definition: "logging-infra-issue"

(MD053, link-image-reference-definitions)


Comment on lines +13 to +17
NUM_PRESTO_NODES="${NUM_PRESTO_NODES:-2}"
COMPRESSION_WORKER_REPLICAS="${COMPRESSION_WORKER_REPLICAS:-2}"
QUERY_WORKER_REPLICAS="${QUERY_WORKER_REPLICAS:-2}"
REDUCER_REPLICAS="${REDUCER_REPLICAS:-2}"
PRESTO_WORKER_REPLICAS="${PRESTO_WORKER_REPLICAS:-2}"
⚠️ Potential issue | 🟠 Major

The default dedicated test now allocates unused Presto nodes.

NUM_PRESTO_NODES is included in total_workers, but this script still installs the chart without enabling Presto mode. In the default path, this grows the kind cluster and labels a Presto node pool that no pod can use. Default the Presto node count to 0, or switch the Helm install into Presto mode when these settings are present.

Also applies to: 37-38, 58-78


COMPRESSION_WORKER_REPLICAS="${COMPRESSION_WORKER_REPLICAS:-2}"
QUERY_WORKER_REPLICAS="${QUERY_WORKER_REPLICAS:-2}"
REDUCER_REPLICAS="${REDUCER_REPLICAS:-2}"
PRESTO_WORKER_REPLICAS="${PRESTO_WORKER_REPLICAS:-2}"
⚠️ Potential issue | 🟠 Major

This Presto knob is inert unless the script also enables Presto mode.

The install command still uses the chart defaults, so prestoWorker.replicas has no effect unless this script also sets clpConfig.package.query_engine=presto and a non-null clpConfig.presto. As written, the new variable only changes the banner output, not the workload this script actually exercises.

Also applies to: 28-28, 41-46

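One way to wire the variable through, sketched under the assumption that the chart keys named in the prompt (clpConfig.package.query_engine, prestoWorker.replicas) exist as described:

```shell
PRESTO_WORKER_REPLICAS=3   # demo value; the script reads this from the environment

# Build extra --set flags, flipping the chart into Presto mode only when
# Presto workers were actually requested.
HELM_EXTRA_ARGS=""
if [ "${PRESTO_WORKER_REPLICAS}" -gt 0 ]; then
    HELM_EXTRA_ARGS="--set clpConfig.package.query_engine=presto --set prestoWorker.replicas=${PRESTO_WORKER_REPLICAS}"
fi
echo "${HELM_EXTRA_ARGS}"
```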

Comment on lines 129 to 130
package:
query_engine: {{ .Values.clpConfig.package.query_engine | quote }}
⚠️ Potential issue | 🟠 Major

Gate both engine branches off the same flag.

This file now mixes package.query_engine, .Values.clpConfig.query_*, and if .Values.clpConfig.presto. That can render split-brain config: ClpQueryEngine: "presto" while presto/PrestoHost are null, or the CLP-S sections still rendered because their values are present. Please derive a single flag, $prestoEnabled := eq .Values.clpConfig.package.query_engine "presto", and use it consistently for both the legacy query blocks and the Presto blocks. The same predicate should also gate the Presto deployment templates.

Possible template shape
+{{- $prestoEnabled := eq .Values.clpConfig.package.query_engine "presto" }}
 ...
-    {{- with .Values.clpConfig.query_scheduler }}
+    {{- if not $prestoEnabled }}
+    {{- with .Values.clpConfig.query_scheduler }}
     query_scheduler:
       ...
     {{- end }}
+    {{- end }}
 ...
-    {{- if .Values.clpConfig.presto }}
+    {{- if $prestoEnabled }}
     presto:
       host: "{{ include "clp.fullname" . }}-presto-coordinator"
       port: 8889
     {{- else }}
     presto: null
     {{- end }}
 ...
-{{- with .Values.clpConfig.presto }}
+{{- if $prestoEnabled }}
+{{- with .Values.clpConfig.presto }}
   presto-coordinator-catalog-clp.properties: |
     ...
+{{- end }}
+{{- end }}

Also applies to: 132-158, 227-233, 326-334, 337-337


Comment on lines +388 to +395
{{- with $.Values.clpConfig.archive_output.storage }}
{{- if eq .type "s3" }}
clp.storage-type=s3
clp.s3-auth-provider=clp_package
clp.s3-access-key-id={{ .s3_config.aws_authentication.credentials.access_key_id }}
clp.s3-end-point=https://{{ .s3_config.bucket }}.s3.{{ .s3_config.region_code }}.amazonaws.com/
clp.s3-secret-access-key={{ .s3_config.aws_authentication.credentials.secret_access_key }}
{{- end }}{{/* if eq .type "s3" */}}
⚠️ Potential issue | 🟠 Major

Fail fast when Presto S3 is given a non-static auth config.

This block always dereferences aws_authentication.credentials.access_key_id/secret_access_key, but the same chart still allows other auth shapes elsewhere (type: profile, optional session_token). In Presto mode those inputs will either render broken connector config or silently drop required auth data. Either reject non-static auth here with fail, or plumb the supported auth mode explicitly into the connector properties.

Possible guard
   {{- with $.Values.clpConfig.archive_output.storage }}
   {{- if eq .type "s3" }}
+  {{- $auth := .s3_config.aws_authentication | default dict }}
+  {{- if not $auth.credentials }}
+  {{- fail "Presto S3 mode currently requires archive_output.storage.s3_config.aws_authentication.credentials" }}
+  {{- end }}
     clp.storage-type=s3
     clp.s3-auth-provider=clp_package
-    clp.s3-access-key-id={{ .s3_config.aws_authentication.credentials.access_key_id }}
+    clp.s3-access-key-id={{ $auth.credentials.access_key_id }}
     clp.s3-end-point=https://{{ .s3_config.bucket }}.s3.{{ .s3_config.region_code }}.amazonaws.com/
-    clp.s3-secret-access-key={{ .s3_config.aws_authentication.credentials.secret_access_key }}
+    clp.s3-secret-access-key={{ $auth.credentials.secret_access_key }}
   {{- end }}{{/* if eq .type "s3" */}}
   {{- end }}{{/* with $.Values.clpConfig.archive_output.storage */}}

@@ -1,3 +1,4 @@
{{- if .Values.clpConfig.query_scheduler }}
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find and inspect the values.yaml to confirm clpConfig.query_scheduler default
fd -t f "values.yaml" tools/deployment/package-helm/

Repository: y-scope/clp

Length of output: 96


🏁 Script executed:

# Read the values.yaml file to check query_scheduler and query_engine defaults
cat -n tools/deployment/package-helm/values.yaml | grep -A 10 -B 2 "query_scheduler"

Repository: y-scope/clp

Length of output: 602


🏁 Script executed:

# Also check the query_engine setting location
cat -n tools/deployment/package-helm/values.yaml | grep -A 5 -B 5 "query_engine"

Repository: y-scope/clp

Length of output: 326


🏁 Script executed:

# Read the actual query-scheduler-deployment.yaml to see current gating
cat -n tools/deployment/package-helm/templates/query-scheduler-deployment.yaml | head -20

Repository: y-scope/clp

Length of output: 764


🏁 Script executed:

# Read the query-scheduler-service.yaml to check its gating
cat -n tools/deployment/package-helm/templates/query-scheduler-service.yaml | head -20

Repository: y-scope/clp

Length of output: 638


Also gate the scheduler off when query_engine is set to "presto".

clpConfig.query_scheduler is a non-null map by default, so this condition stays truthy when a user only flips package.query_engine to "presto". The scheduler still renders in Presto mode, breaking the query-engine switch. The same fix is needed in query-scheduler-service.yaml.

Suggested fix
-{{- if .Values.clpConfig.query_scheduler }}
+{{- if and .Values.clpConfig.query_scheduler (ne .Values.clpConfig.package.query_engine "presto") }}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
{{- if .Values.clpConfig.query_scheduler }}
{{- if and .Values.clpConfig.query_scheduler (ne .Values.clpConfig.package.query_engine "presto") }}
🧰 Tools
🪛 YAMLlint (1.38.0)

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tools/deployment/package-helm/values.yaml (1)

68-75: ⚠️ Potential issue | 🟠 Major

Keep the reducer toleration aligned with the new query node selector.

Line 70 moves reducers onto yscope.io/nodeType: "query", but Line 74 still shows yscope.io/dedicated: "reducer". If operators copy this example onto tainted query nodes, reducer pods will stay Pending because they will not tolerate the taint.

Proposed fix
  #   tolerations:
  #     - key: "yscope.io/dedicated"
  #       operator: "Equal"
- #       value: "reducer"
+ #       value: "query"
  #       effect: "NoSchedule"
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tools/deployment/package-helm/values.yaml`:
- Around line 196-209: The Presto values block is missing documentation that
users must also set clpConfig.package.query_engine to "presto" (instead of the
default "clp-s") to switch the CLP search flow; update the Presto section in
values.yaml to add a short note instructing users to set
clpConfig.package.query_engine: "presto" (or to change the existing
clpConfig.package.query_engine value) when enabling presto: <...> so that
clpConfig.presto is not treated as self-contained and the Web UI uses the Presto
search flow rather than the default "clp-s".

---

Outside diff comments:
In `@tools/deployment/package-helm/values.yaml`:
- Around line 68-75: The toleration block is mismatched with the nodeSelector:
nodeSelector uses yscope.io/nodeType: "query" but tolerations still reference
yscope.io/dedicated: "reducer", causing reducer pods to stay Pending; update the
tolerations under scheduling to match the taint key/value used on query nodes
(e.g., change the toleration key to "yscope.io/nodeType" and value to "query",
keep operator "Equal" and effect "NoSchedule"), or alternatively align the
nodeSelector back to reducer — modify the keys/values in the tolerations stanza
(tolerations, key, value, operator, effect) to be consistent with the
nodeSelector.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4b9dff88-375b-4629-a788-f3f9dc195653

📥 Commits

Reviewing files that changed from the base of the PR and between 6611a46 and 4ac6a71.

📒 Files selected for processing (1)
  • tools/deployment/package-helm/values.yaml

@hoophalab hoophalab left a comment

some comments

presto-coordinator-catalog-clp.properties: |
connector.name=clp
clp.metadata-provider-type=mysql
clp.metadata-db-url=jdbc:mysql://{{ include "clp.fullname" $ }}-database:3306
should we use {{ include "clp.databaseHost" . }} to support un-bundled database?
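i.e., assuming a `clp.databaseHost` helper exists (or is added) in _helpers.tpl and resolves to the external host when the bundled database is disabled, the property would become:

```
clp.metadata-db-url=jdbc:mysql://{{ include "clp.databaseHost" . }}:3306
```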

{{- include "clp.labels" . | nindent 8 }}
app.kubernetes.io/component: "presto-worker"
spec:
serviceAccountName: {{ include "clp.fullname" . }}-job-watcher
job-watcher is renamed


::::

### Using Presto as the query engine
this section duplicates guides-using-presto.md. Shall we use a reference link?

[kind]: https://kind.sigs.k8s.io/
[kubeadm]: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
[kubectl]: https://kubernetes.io/docs/tasks/tools/
[logging-infra-issue]: https://github.com/y-scope/clp/issues/1760
not used

Comment on lines +426 to +427
PRESTO_VERSION=$(wget -q -O - "${DISCOVERY_URI}/v1/info" | jq -r '.version')
echo "presto.version=${PRESTO_VERSION}" >> "${PRESTO_CONFIG_DIR}/config.properties"
bitnami/kubectl:latest doesn't have wget, so this command fails on my machine. As a result, the worker doesn't boot correctly. Not sure if I misconfigured something.
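A defensive sketch for the setup script: probe for an available HTTP client instead of assuming wget is present. Only the probing pattern is shown; the actual script layout may differ:

```shell
# Prefer wget, fall back to curl, and fail loudly if neither exists —
# bitnami/kubectl images may ship without wget.
if command -v wget >/dev/null 2>&1; then
    http_get() { wget -q -O - "$1"; }
elif command -v curl >/dev/null 2>&1; then
    http_get() { curl -fsSL "$1"; }
else
    echo "error: need wget or curl to reach the coordinator" >&2
    exit 1
fi
```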
