feat(clp-package): Add anonymous usage telemetry framework and opt-out consent model#2079
feat(clp-package): Add anonymous usage telemetry framework and opt-out consent model#2079Nathan903 wants to merge 31 commits intoy-scope:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds an anonymous telemetry system: new api-server telemetry module spawned as a detached background loop; config models and templates for telemetry consent/disable; startup scripts and controller propagate telemetry env and first-run prompt; Helm values/templates and docs updated. Changes
Sequence Diagram(s)sequenceDiagram
participant Startup as ApiServer Start
participant Config as Config / Env
participant Telemetry as Telemetry Task
participant Endpoint as Telemetry Endpoint
Startup->>Config: read clp-config + env vars
Startup->>Telemetry: spawn run_telemetry_loop(config) (detached)
Telemetry->>Config: is_telemetry_disabled?
alt telemetry enabled
Telemetry->>Endpoint: POST deployment_start event
loop every 24h
Telemetry->>Endpoint: POST heartbeat event
end
else telemetry disabled
Telemetry-->>Telemetry: exit
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…server and client-side integration.
53df39a to
1f9a3eb
Compare
There was a problem hiding this comment.
im not sure if this change is correct.
Without this addition of ,ignore, running cargo test --workspace would fail
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
components/clp-package-utils/clp_package_utils/controller.py (1)
1182-1190:⚠️ Potential issue | 🔴 CriticalIntegration test validation will fail with the new 36-character UUID format.
The instance-id file format has changed from a 4-character hexadecimal string to a full 36-character UUID. The integration test at
integration-tests/tests/utils/config.py:268has a strict regex validationr"[0-9a-fA-F]{4}"that expects exactly 4 characters, which will now fail when reading the file containing the full UUID.Additionally, the Presto script at
tools/deployment/presto-clp/scripts/init.py:414-438reads and uses the instance-id file directly without validation, so it will now receive a 36-character UUID instead of the previously expected 4-character ID. Verify whether the Presto network configuration logic can handle the full UUID format.The truncation at
controller.py:995(instance_id[-4:]) will continue to work by extracting the last 4 characters, but other consumers expecting the original 4-character format will break.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/clp-package-utils/clp_package_utils/controller.py` around lines 1182 - 1190, The instance-id file now contains a 36-char UUID which breaks consumers expecting a 4-hex ID; modify the creation logic around resolved_instance_id_file_path so the file continues to store the original 4-character hex ID (e.g., generate uuid.uuid4().hex[-4:] or format a 4-char random hex) and ensure reading strips newlines when assigning instance_id, so existing truncation code that uses instance_id[-4:] and the integration-test regex r"[0-9a-fA-F]{4}" keep working; additionally, audit the Presto consumer (the init.py reading the instance-id) and either validate/accept 4-char IDs or update its logic to handle the longer format if you intend to migrate, but do not change the file format here without updating tests and Presto.components/api-server/Cargo.toml (1)
4-4:⚠️ Potential issue | 🟡 MinorAdd explicit
rust-versionfield to enforce Rust 2024 edition compatibility.The
edition = "2024"requires Rust 1.85+. Addrust-version = "1.85"to each Cargo.toml (root, api-server, clp-rust-utils, log-ingestor) to explicitly declare the minimum supported Rust version. This communicates the requirement to tooling and contributors. The Cargo.lock sync verification is already in place viatask deps:lock:check-rust.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/api-server/Cargo.toml` at line 4, Add an explicit rust-version = "1.85" field to the Cargo.toml files that declare edition = "2024" so the minimum Rust toolchain is enforced; update the root Cargo.toml and the crate Cargo.toml files for api-server, clp-rust-utils, and log-ingestor to include rust-version = "1.85" alongside edition = "2024" to communicate the required Rust 1.85+ target to tooling and contributors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/api-server/src/bin/api_server.rs`:
- Around line 73-74: Telemetry is being spawned too early
(tokio::spawn(api_server::telemetry::run_telemetry_loop(config))) and may emit
deployment_start before router construction succeeds; move the telemetry startup
so it runs only after api_server::routes::from_client(client)? completes
successfully — i.e., construct the router first, handle/propagate any error from
from_client, then spawn run_telemetry_loop(config) (cloning or moving config as
needed) so telemetry only starts when server initialization has succeeded.
In `@components/api-server/src/telemetry.rs`:
- Around line 107-143: The telemetry loop in run_telemetry_loop currently blocks
forever; change its signature to accept a cancellation token (e.g.,
tokio_util::sync::CancellationToken) and use tokio::select! inside the periodic
loop to race between cancel.cancelled() and
tokio::time::sleep(TELEMETRY_SEND_INTERVAL) so you break out and return when
cancelled and only call build_event/send_event on the sleep branch. Update any
callers that spawn run_telemetry_loop to pass a CancellationToken (or provide an
overload/adapter) and keep the existing client creation and start_event logic
unchanged.
- Line 8: The constant TELEMETRY_SEND_INTERVAL uses Duration::from_hours which
requires Rust 1.91.0 and thus breaks MSRV; either set rust-version = "1.91.0"
under the [package] section in components/api-server/Cargo.toml to raise the
declared MSRV, or replace Duration::from_hours(24) with an MSRV-safe expression
(e.g., Duration::from_secs(24 * 60 * 60) or Duration::from_secs(86_400)) in
telemetry.rs so the crate remains compatible with older compilers; update
whichever location (telemetry.rs's TELEMETRY_SEND_INTERVAL or Cargo.toml)
accordingly.
In `@components/clp-package-utils/clp_package_utils/controller.py`:
- Around line 655-660: The code incorrectly derives the VERSION file location
from self._clp_config.logs_directory.parent which breaks when logs_directory is
configured elsewhere; change the lookup to use the CLP installation/home path
from the config (e.g., self._clp_config.clp_home or self._clp_config.home) when
resolving the VERSION file with resolve_host_path_in_container, and fall back to
the existing logs_directory.parent only if no explicit CLP home attribute
exists; update the code that sets env_vars["CLP_VERSION"] accordingly (refer to
resolve_host_path_in_container, self._clp_config.logs_directory, CLP_VERSION and
the VERSION file check).
- Line 995: The controller currently sets self._project_name =
f"clp-package-{instance_id[-4:]}" which truncates the instance ID and causes
Docker network name mismatch with the Presto init.py that expects the full
instance ID; update the assignment of self._project_name in the controller to
use the full instance_id (e.g., f"clp-package-{instance_id}") so Docker Compose
network names match Presto's network construction, keeping the same project-name
prefix.
In `@components/package-template/src/sbin/start-clp.sh`:
- Around line 62-68: Check for an existing "telemetry:" key in the file
referenced by clp_config_path and if present replace or update the block rather
than blindly appending; if absent append a single telemetry block. Implement
this by searching the file for a line matching "^telemetry:" (or the YAML block
start) and using a conditional path that either replaces the existing telemetry
block (e.g., with sed/updating the block identified by "telemetry:" and its
indented lines) or appends the block. When writing/creating clp_config_path use
a grouped redirect (here referenced by clp_config_path and the "telemetry:"
block) to avoid SC2129 instead of repeating >> for each echo. Ensure the logic
handles non‑existent file by creating it with the telemetry block using a single
redirect.
- Around line 31-33: The grep is too permissive and matches commented or
indented keys; update the check that currently uses grep -q "telemetry:" to only
match an uncommented top-level key by using a start-of-line anchored pattern
(e.g. grep -q -E '^telemetry:' "$clp_config_path" 2>/dev/null) and keep setting
telemetry_prompt_needed=false; replace the existing grep invocation with this
anchored regex so it ignores lines like "# telemetry:" or nested/indented keys.
- Line 83: Split the combined export-and-command-assignment to avoid masking
uname's exit status: replace the single line export CLP_HOST_ARCH="$(uname -m)"
with an explicit declaration/export followed by the assignment, e.g. export
CLP_HOST_ARCH then CLP_HOST_ARCH="$(uname -m)"; keep the same variable name
CLP_HOST_ARCH and use uname -m for the assignment so the shell can detect
failures (addresses SC2155).
- Around line 27-30: The shell check for CLP_DISABLE_TELEMETRY is case-sensitive
and only matches "true", causing mismatch with the Rust API which accepts any
case; update the conditional in start-clp.sh that sets telemetry_prompt_needed
(and the related branch that checks DO_NOT_TRACK) to perform a case-insensitive
comparison for CLP_DISABLE_TELEMETRY (e.g., normalize CLP_DISABLE_TELEMETRY to
lowercase via parameter expansion or explicitly compare against common casings)
so that values like "TRUE" or "True" are treated the same as "true" and the
telemetry_prompt_needed logic aligns with the Rust eq_ignore_ascii_case
behavior.
In `@docs/src/user-docs/reference-telemetry.md`:
- Around line 96-103: Replace the plain path references in the "Source code"
list with clickable links: convert each individual file path
(components/api-server/src/telemetry.rs,
components/package-template/src/sbin/start-clp.sh, clp-config.yaml entry for
telemetry.disable) to relative repository links so they open the file in the
docs view, and convert the directory reference (clp-telemetry-server/) into a
versioned GitHub URL that includes DOCS_VAR_CLP_GIT_REF so it stays
reference-agnostic; update the markdown in reference-telemetry.md to use these
relative file links and the DOCS_VAR_CLP_GIT_REF-based GitHub URL for the
directory.
---
Outside diff comments:
In `@components/api-server/Cargo.toml`:
- Line 4: Add an explicit rust-version = "1.85" field to the Cargo.toml files
that declare edition = "2024" so the minimum Rust toolchain is enforced; update
the root Cargo.toml and the crate Cargo.toml files for api-server,
clp-rust-utils, and log-ingestor to include rust-version = "1.85" alongside
edition = "2024" to communicate the required Rust 1.85+ target to tooling and
contributors.
In `@components/clp-package-utils/clp_package_utils/controller.py`:
- Around line 1182-1190: The instance-id file now contains a 36-char UUID which
breaks consumers expecting a 4-hex ID; modify the creation logic around
resolved_instance_id_file_path so the file continues to store the original
4-character hex ID (e.g., generate uuid.uuid4().hex[-4:] or format a 4-char
random hex) and ensure reading strips newlines when assigning instance_id, so
existing truncation code that uses instance_id[-4:] and the integration-test
regex r"[0-9a-fA-F]{4}" keep working; additionally, audit the Presto consumer
(the init.py reading the instance-id) and either validate/accept 4-char IDs or
update its logic to handle the longer format if you intend to migrate, but do
not change the file format here without updating tests and Presto.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c6746284-f115-425c-ae17-d66c3be1e062
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (18)
components/api-server/Cargo.tomlcomponents/api-server/src/bin/api_server.rscomponents/api-server/src/lib.rscomponents/api-server/src/telemetry.rscomponents/clp-package-utils/clp_package_utils/controller.pycomponents/clp-py-utils/clp_py_utils/clp_config.pycomponents/clp-rust-utils/src/clp_config/package/config.rscomponents/clp-rust-utils/src/database/mysql.rscomponents/package-template/src/etc/clp-config.template.json.yamlcomponents/package-template/src/etc/clp-config.template.text.yamlcomponents/package-template/src/sbin/start-clp.shdocs/src/user-docs/index.mddocs/src/user-docs/reference-telemetry.mdtools/deployment/package-helm/Chart.yamltools/deployment/package-helm/templates/NOTES.txttools/deployment/package-helm/templates/api-server-deployment.yamltools/deployment/package-helm/templates/configmap.yamltools/deployment/package-helm/values.yaml
…sing `CancellationToken`.
…d task using `CancellationToken`." This reverts commit 03dca64.
…_directory.parent`.
…ne when disabling telemetry.
…er, clp-rust-utils, and log-ingestor.
…acter hex IDs for new instances.
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
docs/src/user-docs/reference-telemetry.md (1)
100-103:⚠️ Potential issue | 🟡 MinorUse repo-relative links for these in-repo source references.
These file links still point at hard-coded GitHub URLs, so the rendered docs will not stay aligned with the checked-out revision. Please switch the in-repo file references here to relative links.
Based on learnings: In Sphinx-based project docs, use relative links for individual files within the repository, but for directory links (e.g., links to a directory like tools/deployment/package-helm/), use GitHub URLs that include DOCS_VAR_CLP_GIT_REF to keep references version-agnostic.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/src/user-docs/reference-telemetry.md` around lines 100 - 103, Update the two hard-coded GitHub links in docs/src/user-docs/reference-telemetry.md so they use repo-relative file paths instead of full GitHub URLs: replace the link target pointing to components/api-server/src/telemetry.rs and the link target pointing to components/package-template/src/sbin/start-clp.sh with relative links (e.g., ./components/api-server/src/telemetry.rs and ./components/package-template/src/sbin/start-clp.sh or simply components/...) so the rendered docs reference the checked-out revision; leave any directory-level links that intentionally require version-agnostic GitHub URLs (using DOCS_VAR_CLP_GIT_REF) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/clp-package-utils/clp_package_utils/controller.py`:
- Around line 662-669: The pass-through environment variable tuple that includes
"CLP_HOST_OS", "CLP_HOST_OS_VERSION", "CLP_HOST_ARCH", "CLP_DISABLE_TELEMETRY",
and "CLP_TELEMETRY_DEBUG" must also forward the "DO_NOT_TRACK" variable into the
API server container so telemetry respects the user's preference; update the
tuple/list in the loop that iterates over those vars (the block containing the
for var in (...) loop) to include "DO_NOT_TRACK" so the API server sees it at
startup.
In `@components/package-template/src/sbin/start-clp.sh`:
- Around line 36-38: The startup script start-clp.sh currently checks only
${CLP_HOME}/var/log/instance-id, which misses when users override
clp_config.logs_directory used by get_or_create_instance_id in
components/clp-package-utils/clp_package_utils/controller.py; update the guard
to check the configured logs directory (read from the same config or environment
variable used by the package utilities, e.g. a CLP_LOGS_DIR or sourcing the
package config) and look for <logs_directory>/instance-id instead of the
hardcoded ${CLP_HOME}/var/log/instance-id so the telemetry_prompt_needed flag is
correctly suppressed when the marker exists.
- Line 59: The read prompt for telemetry (the read -r -p ... populating
telemetry_response) can cause the script to exit under set -o errexit when it
receives EOF; change the read handling so that if read fails (EOF) you set
telemetry_response="" instead of letting the script exit, so the subsequent
regex check that treats empty/blank as acceptance still applies; locate the read
invocation and add conditional handling around it (or use a construct that
assigns empty on read failure) referencing the telemetry_response variable and
the following acceptance regex check.
---
Duplicate comments:
In `@docs/src/user-docs/reference-telemetry.md`:
- Around line 100-103: Update the two hard-coded GitHub links in
docs/src/user-docs/reference-telemetry.md so they use repo-relative file paths
instead of full GitHub URLs: replace the link target pointing to
components/api-server/src/telemetry.rs and the link target pointing to
components/package-template/src/sbin/start-clp.sh with relative links (e.g.,
./components/api-server/src/telemetry.rs and
./components/package-template/src/sbin/start-clp.sh or simply components/...) so
the rendered docs reference the checked-out revision; leave any directory-level
links that intentionally require version-agnostic GitHub URLs (using
DOCS_VAR_CLP_GIT_REF) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: e2ef240b-d9ae-4f63-8eec-c80f965a2502
📒 Files selected for processing (3)
components/clp-package-utils/clp_package_utils/controller.pycomponents/package-template/src/sbin/start-clp.shdocs/src/user-docs/reference-telemetry.md
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
components/clp-package-utils/clp_package_utils/controller.py (1)
663-669:⚠️ Potential issue | 🟠 MajorForward
DO_NOT_TRACKinto the API server container.This allowlist still drops
DO_NOT_TRACK, so that opt-out path will not be honoured once the API server starts.Proposed fix
for var in ( "CLP_HOST_OS", "CLP_HOST_OS_VERSION", "CLP_HOST_ARCH", "CLP_DISABLE_TELEMETRY", + "DO_NOT_TRACK", "CLP_TELEMETRY_DEBUG", ):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@components/clp-package-utils/clp_package_utils/controller.py` around lines 663 - 669, The current allowlist tuple in the loop that forwards environment vars to the API server (the tuple containing "CLP_HOST_OS", "CLP_HOST_OS_VERSION", "CLP_HOST_ARCH", "CLP_DISABLE_TELEMETRY", "CLP_TELEMETRY_DEBUG") omits DO_NOT_TRACK so the opt-out flag isn't propagated; add "DO_NOT_TRACK" to that tuple so the loop in controller.py forwards DO_NOT_TRACK into the API server container, ensuring the opt-out path is honoured.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@components/clp-package-utils/clp_package_utils/controller.py`:
- Around line 1184-1186: The instance ID generation currently truncates to 4 hex
chars (instance_id = uuid.uuid4().hex[-4:]), yielding only 65k possibilities and
high collision risk; change the logic around instance_id assignment (the
variable instance_id used before writing to resolved_instance_id_file_path) to
use a full UUID (e.g., uuid.uuid4().hex) or a longer slice (e.g., 16+ hex chars)
to provide much higher entropy before writing it to
resolved_instance_id_file_path so per-instance telemetry remains unique.
---
Duplicate comments:
In `@components/clp-package-utils/clp_package_utils/controller.py`:
- Around line 663-669: The current allowlist tuple in the loop that forwards
environment vars to the API server (the tuple containing "CLP_HOST_OS",
"CLP_HOST_OS_VERSION", "CLP_HOST_ARCH", "CLP_DISABLE_TELEMETRY",
"CLP_TELEMETRY_DEBUG") omits DO_NOT_TRACK so the opt-out flag isn't propagated;
add "DO_NOT_TRACK" to that tuple so the loop in controller.py forwards
DO_NOT_TRACK into the API server container, ensuring the opt-out path is
honoured.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2b0f604c-adcd-449b-b7d7-db9658a61404
📒 Files selected for processing (4)
components/api-server/Cargo.tomlcomponents/clp-package-utils/clp_package_utils/controller.pycomponents/clp-rust-utils/Cargo.tomlcomponents/log-ingestor/Cargo.toml
Description
Introduces an anonymous, privacy-preserving usage telemetry framework to help understand community deployment trends and prioritize development. This PR establishes the baseline collection framework, consent mechanisms, and the telemetry endpoint server.
start-clp.sh.CLP_DISABLE_TELEMETRYandDO_NOT_TRACKenvironment variables.api-server(Rust/Axum) that transmits a minimal, versioned JSON payload asynchronously.clp-telemetry-server(Rust + PostgreSQL) to receive and validate telemetry events.values.yamland a deployment notice inNOTES.txt.Checklist
breaking change.
Validation performed
Summary by CodeRabbit
New Features
Documentation
Chores