Skip to content

AAP-54476 : library.collectors - move collectors to library#248

Merged
himdel merged 4 commits intoansible:develfrom
himdel:collections
Nov 7, 2025
Merged

AAP-54476 : library.collectors - move collectors to library#248
himdel merged 4 commits intoansible:develfrom
himdel:collections

Conversation

@himdel
Copy link
Copy Markdown
Contributor

@himdel himdel commented Oct 29, 2025

Issue: AAP-54476

Move all collectors into metrics_utility.library.collectors. (Technically, copy, removing the originals is a second PR.)

A collector is a function decorated with @collector, it either returns a dict, or a list of csv filenames (from copy_table),
it always takes a db= parameter if it's supposed to talk to a db, or credentials for other services,
it never reads environment variables directly, always using params,
it takes since= and until= timestamp-with-timezone params if the collector supports collecting within time contraints, since is included (>=), until excluded (<),
it only takes args it uses, no extra **kwargs.

The @collector decorator wraps these into a class instance such that collector_fn(params).gather() actually calls the original collector_fn with params. This is to allow splitting collector config (which should happen outside of global tempdir handling) from actual gather (which should happen while holding a db lock).

Notes:

  • output_dir= is needed for compatibility with the cli, but omitting it will work too (and create a new tempdir per collect)
  • output_file= (unused but supported in copy_table) allows streaming the data without creating tempfiles - just writes to a filehandle object
  • no cleanup happens for the collected files (except for the cleanup done by cli) - this will be the role of the library tempfile helper (wip, separate issue)

Controller collectors (metrics_utility.library.collectors.controller.*)

  • config(db, billing_provider_params).gather() -> Dict
  • execution_environments(db, [output_dir]).gather() -> [filenames]
  • job_host_summary(db, since, until, [output_dir]).gather() -> [filenames]
  • job_host_summary_service(db, since, until, [output_dir]).gather() -> [filenames]
  • main_host(db, [output_dir]).gather() -> [filenames]
  • main_indirectmanagednodeaudit(db, since, until, [output_dir]).gather() -> [filenames]
  • main_jobevent(db, since, until, [output_dir]).gather() -> [filenames]
  • main_jobevent_service(db, since, until, [output_dir]).gather() -> [filenames]
  • unified_jobs(db, since, until, [output_dir]).gather() -> [filenames]

Other collectors (metrics_utility.library.collectors.others.*)

  • total_workers_vcpu(cluster_name, metering_enabled, prometheus_url, ca_cert_path, token) -> Dict

Separate PR (#260):

The original collectors file retains all the collector functions, original decorators, any get_optional_collectors logic,
but otherwise the functions end up calling the library functions.

Later:

  • move from f-strings to named query params - copy_table can handle it now; drop psycopg2, use safe sql
  • why the extra backslash replace on event_data?

@himdel himdel changed the title [WIP] library.collections - move collectors to library, keep wired in cli [WIP] library.collectors - move collectors to library, keep wired in cli Oct 31, 2025
@himdel himdel force-pushed the collections branch 3 times, most recently from 0ea9ad0 to 5d3bdbf Compare November 6, 2025 16:57
@himdel himdel changed the title [WIP] library.collectors - move collectors to library, keep wired in cli library.collectors - move collectors to library Nov 6, 2025
@himdel himdel force-pushed the collections branch 4 times, most recently from 41cae5c to 9e56102 Compare November 6, 2025 17:56
@himdel himdel marked this pull request as ready for review November 6, 2025 17:57
* make sure none of them read environment variables anymore
* make sure there are no assumptions about files being present somewhere in the filesystem
* db= is always a psycopg connection handle (v2 or v3)

copy_table now supports output_dir or output_file, and optional params= for query params

Issue: AAP-54476
use pytest.approx for float comparisons in tests
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Nov 7, 2025

Copy link
Copy Markdown
Contributor

@cshiels-ie cshiels-ie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@himdel himdel merged commit cb36f4e into ansible:devel Nov 7, 2025
2 checks passed
@himdel himdel deleted the collections branch November 7, 2025 16:47
MilanPospisil pushed a commit to MilanPospisil/metrics-utility that referenced this pull request Jan 6, 2026
* library.collectors: move collectors to library

* make sure none of them read environment variables anymore
* make sure there are no assumptions about files being present somewhere in the filesystem
* db= is always a psycopg connection handle (v2 or v3)

copy_table now supports output_dir or output_file, and optional params= for query params

Issue: AAP-54476

* move CsvFileSplitter, to stop sonarqube complaining about duplicate code

* claude tests

use pytest.approx for float comparisons in tests

* Split library.collectors into library.collectors.controller & library.collectors.others
@SherinV SherinV changed the title library.collectors - move collectors to library AAP-54476 : library.collectors - move collectors to library Feb 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants