CLI: use library collectors#339
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as resolved.
This comment was marked as resolved.
use metrics_utility.library.collectors, via cli_* wrappers cli collectors get since, until, output params, passed to library, and used to specify collector output exposed all exisiting collectors in the cli limit_slicer becomes until_slicer, using until over now() full_sync removed, unused register description removed, unused Issue: AAP-66077
library collectors now get an output= param,
defaulting to either `DataframeOutput`, where `output.sql(db, query)` returns a pandas dataframe,
or to `DictOutput`, where `output.dict({...})` returns a dict.
for the CLI calls, `CollectorOutput` gets passed instead, initialized with a temporary path for files,
and outputs a filelist from output.sql, and a dict from output.dict.
this allows preserving the csv streaming for the cli, while keeping the
collector output a dataframe for anonymized
|
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (5)
metrics_utility/test/library/test_collectors_execution_environments.py (2)
21-22: Docstring refers to old function name.The docstring still mentions "copy_table" but the test now patches
_copy_table_pandas. Consider updating for consistency.📝 Suggested fix
`@patch`('metrics_utility.library.collectors.util._copy_table_pandas') def test_execution_environments_calls_copy_table(mock_copy_pandas): - """Test that execution_environments calls copy_table with correct parameters.""" + """Test that execution_environments calls _copy_table_pandas with correct parameters."""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@metrics_utility/test/library/test_collectors_execution_environments.py` around lines 21 - 22, Update the test docstring to reflect the current patched function name: replace the outdated reference to "copy_table" with "_copy_table_pandas" in the docstring of test_execution_environments_calls_copy_table so the description matches the mocked symbol used in the test.
46-46: Inconsistent mock return value type.This mock returns an empty list
[]while other tests returnpd.DataFrame(). For consistency and to catch type-related issues, consider returning an empty DataFrame.📝 Suggested fix
- mock_copy_pandas.return_value = [] + mock_copy_pandas.return_value = pd.DataFrame()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@metrics_utility/test/library/test_collectors_execution_environments.py` at line 46, The test sets mock_copy_pandas.return_value to an empty list which is inconsistent with other tests that return pandas DataFrame objects; change the mock to return an empty DataFrame (pd.DataFrame()) instead, ensuring the test file imports pandas as pd if not already and update any assertions that assume a DataFrame type to remain valid; target the mock named mock_copy_pandas in the test_collectors_execution_environments.py test.metrics_utility/test/library/test_collectors_main_indirectmanagednodeaudit.py (1)
26-27: Docstring refers to old function name.Similar to other test files, the docstring mentions "copy_table" but the test patches
_copy_table_pandas.📝 Suggested fix
`@patch`('metrics_utility.library.collectors.util._copy_table_pandas') def test_main_indirectmanagednodeaudit_calls_copy_table(mock_copy_pandas): - """Test that main_indirectmanagednodeaudit calls copy_table.""" + """Test that main_indirectmanagednodeaudit calls _copy_table_pandas."""🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@metrics_utility/test/library/test_collectors_main_indirectmanagednodeaudit.py` around lines 26 - 27, The test's docstring is out-of-date: update the docstring in test_main_indirectmanagednodeaudit_calls_copy_table to reflect the actual patched function name (_copy_table_pandas) instead of "copy_table"; locate the test by its name and the mock parameter mock_copy_pandas and change the docstring text to accurately describe that the test verifies main_indirectmanagednodeaudit calls _copy_table_pandas.metrics_utility/base/collector.py (1)
193-196: Redundant conditional that is always false.
last_gatheris set tohorizonon line 193, making the conditionlast_gather < horizonon line 194 always false. This appears to be leftover code from whenlast_gatherwas loaded from persistent storage.♻️ Suggested simplification
- last_gather = horizon - if last_gather < horizon: - last_gather = horizon - logger.warning(f'Last analytics run was more than {get_max_gather_period_days()} days prior to {until}, using {horizon} instead.') + last_gather = horizon🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@metrics_utility/base/collector.py` around lines 193 - 196, The code assigns last_gather = horizon and then checks if last_gather < horizon which can never be true; remove the redundant conditional and its logger.warning call (or, if intended, restore the original persistent-load logic instead of the direct assignment). Update the block around last_gather and horizon (variables last_gather, horizon and the logger.warning/get_max_gather_period_days() call) so it either only assigns last_gather = horizon or implements the persisted-load path correctly.metrics_utility/library/collectors/others/total_workers_vcpu.py (1)
114-123: Consider using a custom exception class instead of genericException.The
_getmethod raises genericExceptionfor both HTTP errors and API errors. This makes it harder for callers to distinguish error types and implement specific handling.Proposed improvement
+class PrometheusError(Exception): + """Base exception for Prometheus client errors.""" + pass + +class PrometheusHTTPError(PrometheusError): + """HTTP-level error from Prometheus.""" + pass + +class PrometheusAPIError(PrometheusError): + """API-level error from Prometheus.""" + pass + def _get(self, url, params): response = self.session.get(url, params=params, timeout=self.timeout) if response.status_code != 200: - raise Exception(f'HTTP error {response.status_code}: {response.text}') + raise PrometheusHTTPError(f'HTTP error {response.status_code}: {response.text}') data = response.json() if data.get('status') != 'success': - raise Exception(f'Prometheus API error: {data.get("error", "Unknown error")}') + raise PrometheusAPIError(f'Prometheus API error: {data.get("error", "Unknown error")}')🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@metrics_utility/library/collectors/others/total_workers_vcpu.py` around lines 114 - 123, The _get method currently raises generic Exception for HTTP and Prometheus API errors; add two specific exception classes (e.g., HttpRequestError and PrometheusApiError) in this module or a shared metrics exceptions module and raise them instead of Exception inside _get (replace the HTTP branch that checks response.status_code and the API-status branch that checks data.get('status') with raises of HttpRequestError and PrometheusApiError respectively, including status code and response.text or the API error message in the exception message) so callers can catch and handle these distinct error types.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@metrics_utility/base/utils.py`:
- Around line 29-40: The bool_from_env helper currently coerces any non-accepted
string to False; change bool_from_env to only accept '1' or 'true'
(case-insensitive) as True and '0' or 'false' as False, return default only when
the variable is unset, and when the variable is set to any other value raise a
ValueError that includes the env var name and the offending value so
typos/trailing whitespace are surfaced (locate and update the bool_from_env
function).
In `@metrics_utility/test/gather/test_gather_ranges.py`:
- Line 67: The warning message currently hardcodes "days" and produces "1 days"
when get_max_gather_period_days() returns 1; update the message construction in
the collector (where the warning at metrics_utility/base/collector.py is
built—refer to get_max_gather_period_days() and the function/method that logs
the warning) to pluralize correctly (use "day" when the value === 1, otherwise
"days"), and then update the test assertion in
metrics_utility/test/gather/test_gather_ranges.py (the assertion expecting the
string with "1 days") to expect "1 day" when the max period is 1.
In `@metrics_utility/test/library/test_collectors_total_workers_vcpu_helpers.py`:
- Around line 211-230: The test currently encodes and asserts double-counting
for multi-series Prometheus responses; update
test_get_cpu_timeline_multiple_series to reflect correct behavior of
get_cpu_timeline by expecting merged/normalized results instead of flat
concatenation — specifically, ensure the test asserts unique timestamps (e.g.,
len(result) == 2) and that values are combined/normalized per timestamp (or
otherwise assert timestamp uniqueness), so the helper is validated for merging
multi-series responses rather than allowing duplicate timeline entries.
---
Nitpick comments:
In `@metrics_utility/base/collector.py`:
- Around line 193-196: The code assigns last_gather = horizon and then checks if
last_gather < horizon which can never be true; remove the redundant conditional
and its logger.warning call (or, if intended, restore the original
persistent-load logic instead of the direct assignment). Update the block around
last_gather and horizon (variables last_gather, horizon and the
logger.warning/get_max_gather_period_days() call) so it either only assigns
last_gather = horizon or implements the persisted-load path correctly.
In `@metrics_utility/library/collectors/others/total_workers_vcpu.py`:
- Around line 114-123: The _get method currently raises generic Exception for
HTTP and Prometheus API errors; add two specific exception classes (e.g.,
HttpRequestError and PrometheusApiError) in this module or a shared metrics
exceptions module and raise them instead of Exception inside _get (replace the
HTTP branch that checks response.status_code and the API-status branch that
checks data.get('status') with raises of HttpRequestError and PrometheusApiError
respectively, including status code and response.text or the API error message
in the exception message) so callers can catch and handle these distinct error
types.
In `@metrics_utility/test/library/test_collectors_execution_environments.py`:
- Around line 21-22: Update the test docstring to reflect the current patched
function name: replace the outdated reference to "copy_table" with
"_copy_table_pandas" in the docstring of
test_execution_environments_calls_copy_table so the description matches the
mocked symbol used in the test.
- Line 46: The test sets mock_copy_pandas.return_value to an empty list which is
inconsistent with other tests that return pandas DataFrame objects; change the
mock to return an empty DataFrame (pd.DataFrame()) instead, ensuring the test
file imports pandas as pd if not already and update any assertions that assume a
DataFrame type to remain valid; target the mock named mock_copy_pandas in the
test_collectors_execution_environments.py test.
In
`@metrics_utility/test/library/test_collectors_main_indirectmanagednodeaudit.py`:
- Around line 26-27: The test's docstring is out-of-date: update the docstring
in test_main_indirectmanagednodeaudit_calls_copy_table to reflect the actual
patched function name (_copy_table_pandas) instead of "copy_table"; locate the
test by its name and the mock parameter mock_copy_pandas and change the
docstring text to accurately describe that the test verifies
main_indirectmanagednodeaudit calls _copy_table_pandas.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 84603860-0928-4299-b9a3-ed028c3219f6
📒 Files selected for processing (59)
metrics_utility/automation_controller_billing/collector.pymetrics_utility/automation_controller_billing/collectors.pymetrics_utility/automation_controller_billing/helpers.pymetrics_utility/automation_controller_billing/kubernetes_client.pymetrics_utility/automation_controller_billing/prometheus_client.pymetrics_utility/base/collection.pymetrics_utility/base/collection_data_status.pymetrics_utility/base/collection_manifest.pymetrics_utility/base/collector.pymetrics_utility/base/decorators.pymetrics_utility/base/package.pymetrics_utility/base/utils.pymetrics_utility/library/collectors/controller/config.pymetrics_utility/library/collectors/controller/controller_version_service.pymetrics_utility/library/collectors/controller/credentials_service.pymetrics_utility/library/collectors/controller/execution_environments.pymetrics_utility/library/collectors/controller/job_host_summary.pymetrics_utility/library/collectors/controller/job_host_summary_service.pymetrics_utility/library/collectors/controller/main_host.pymetrics_utility/library/collectors/controller/main_indirectmanagednodeaudit.pymetrics_utility/library/collectors/controller/main_jobevent.pymetrics_utility/library/collectors/controller/main_jobevent_service.pymetrics_utility/library/collectors/controller/table_metadata.pymetrics_utility/library/collectors/controller/unified_jobs.pymetrics_utility/library/collectors/others/prometheus_client.pymetrics_utility/library/collectors/others/total_workers_vcpu.pymetrics_utility/library/collectors/util.pymetrics_utility/library/csv_file_splitter.pymetrics_utility/logger.pymetrics_utility/management/validation.pymetrics_utility/test/base/functional/collector_module.pymetrics_utility/test/base/functional/collector_module2.pymetrics_utility/test/base/functional/collector_module3.pymetrics_utility/test/base/functional/collector_module4_slicing.pymetrics_utility/test/base/functional/helpers.pymetrics_utility/test/base/functional/test_slicing.pymetrics_utility/test/base/test_utils.pymetrics_utility/test/gather/test_gather_ranges.pymetrics_utility/test/gather/test_jobhostsummary_gather.pymetrics_utility/test/gather/test_total_workers_vcpu.pymetrics_utility/test/gather/test_total_workers_vcpu_token_handling.pymetrics_utility/test/library/test_collectors_execution_environments.pymetrics_utility/test/library/test_collectors_job_host_summary.pymetrics_utility/test/library/test_collectors_job_host_summary_service.pymetrics_utility/test/library/test_collectors_main_host.pymetrics_utility/test/library/test_collectors_main_indirectmanagednodeaudit.pymetrics_utility/test/library/test_collectors_main_jobevent.pymetrics_utility/test/library/test_collectors_main_jobevent_service.pymetrics_utility/test/library/test_collectors_prometheus_client.pymetrics_utility/test/library/test_collectors_total_workers_vcpu.pymetrics_utility/test/library/test_collectors_total_workers_vcpu_helpers.pymetrics_utility/test/library/test_collectors_unified_jobs.pymetrics_utility/test/library/test_collectors_util_helpers.pymetrics_utility/test/library/test_collectors_util_output.pymetrics_utility/test/test_automation_controller_billing_helpers.pymetrics_utility/test/test_collectors.pymetrics_utility/test/test_kubernetes_client.pymetrics_utility/test/test_prometheus_client.pyworkers/0-gather-controller-crc.py
💤 Files with no reviewable changes (7)
- metrics_utility/automation_controller_billing/prometheus_client.py
- metrics_utility/automation_controller_billing/kubernetes_client.py
- metrics_utility/library/collectors/others/prometheus_client.py
- metrics_utility/test/test_kubernetes_client.py
- metrics_utility/base/decorators.py
- metrics_utility/test/test_prometheus_client.py
- workers/0-gather-controller-crc.py
🚧 Files skipped from review as they are similar to previous changes (16)
- metrics_utility/base/collection_manifest.py
- metrics_utility/test/library/test_collectors_unified_jobs.py
- metrics_utility/test/base/test_utils.py
- metrics_utility/test/library/test_collectors_util_helpers.py
- metrics_utility/test/library/test_collectors_prometheus_client.py
- metrics_utility/library/collectors/controller/main_host.py
- metrics_utility/test/gather/test_total_workers_vcpu.py
- metrics_utility/test/gather/test_total_workers_vcpu_token_handling.py
- metrics_utility/test/library/test_collectors_job_host_summary.py
- metrics_utility/library/collectors/controller/unified_jobs.py
- metrics_utility/test/library/test_collectors_job_host_summary_service.py
- metrics_utility/automation_controller_billing/collector.py
- metrics_utility/test/library/test_collectors_util_output.py
- metrics_utility/library/collectors/controller/main_indirectmanagednodeaudit.py
- metrics_utility/library/collectors/controller/config.py
- metrics_utility/test/library/test_collectors_total_workers_vcpu.py
|
LGTM So much deleted code ! |



Issue: AAP-66077
Replaces #260
use metrics_utility.library.collectors from the cli - via cli_* wrappers in the original collector location
cli collectors get since, until, output params, passed to library, and used to specify collector output
exposed all exisiting collectors in the cli
limit_slicer becomes until_slicer, using until over now()
full_sync removed, unused
register description removed, unused
collector outputs - library collectors now get an output= param,
defaulting to either
DataframeOutput, whereoutput.sql(db, query)returns a pandas dataframe,or to
DictOutput, whereoutput.dict({...})returns a dict.for the CLI calls,
CollectorOutputgets passed instead, initialized with a temporary path for files,and outputs a filelist from output.sql, and a dict from output.dict.
this allows preserving the csv streaming for the cli, while keeping the
collector output a dataframe for anonymized
Summary by CodeRabbit
New Features
Refactor
Bug Fixes
Behavior change
Tests
Note
Medium Risk
Medium risk due to broad refactors across the collector execution path (CLI wrappers, output plumbing, slicing, and last-gather persistence) and changes to the vCPU/Prometheus auth flow, which could affect what data gets collected and how tarballs are partitioned.
Overview
Billing collection is refactored to make the CLI use
metrics_utility.library.collectorsvia newcli_*wrappers that accept(since, until, output), with a newCollectionOutputadapter to keep CSV streaming for tarball generation while allowing library collectors to default to pandas (DataframeOutput) or dict (DictOutput) outputs.The legacy SQL/copy logic in
automation_controller_billing/collectors.pyis removed and replaced by thin wrappers around library collectors, slicing is adjusted (limit_slicing→until_slicingstoring atuntil-1s), and “full sync” support plus unused@registermetadata (descriptions/shipping groups) are dropped.Environment parsing is centralized with
bool_from_env,Collector/Collectiongathering APIs are simplified (nomax_data_sizeargument), andtotal_workers_vcpuis reworked to use the library implementation with explicit Kubernetes token/CA handling and updated timestamp formatting; associated validation and tests are updated/added accordingly.Written by Cursor Bugbot for commit 2315891. This will update automatically on new commits. Configure here.