Skip to content

Adding collector providing the total of vCPU of all workers#165

Merged
MilanPospisil merged 74 commits intoansible:develfrom
itdove:vcpu_collector
Aug 15, 2025
Merged

Adding collector providing the total of vCPU of all workers#165
MilanPospisil merged 74 commits intoansible:develfrom
itdove:vcpu_collector

Conversation

@itdove
Copy link
Copy Markdown
Contributor

@itdove itdove commented Jul 10, 2025

AAP-48831

Description

  • What is being changed? Add a collector for to send the total number of CPU of all workers
  • Why is this change needed? To bill per CPU usage rather than flat rate
  • How does this change address the issue? Adding a collector

Testing

Prerequisites

Steps to Test

  1. Deploy an mgt with my own image and the sre branch
  2. verify the METRICS_UTILITY_USAGE_BASED_BILLING_ENABLED is false in the mgt config map
  3. Deploy a customer instance with my own sre and model branch
  4. verify the METRICS_UTILITY_DISABLE_JOB_HOST_SUMMARY_COLLECTOR is true in the new cronjob env and false metric config map read by the existing cronjob.
  5. verify the METRICS_UTILITY_USAGE_BASED_BILLING_ENABLED is false in the metric config map
  6. Load my own metrics_utility image in the new cronjob.
  7. Run the cmd python /metrics_utility/manage.py gather_automation_controller_billing_data --dry-run
  8. Check the generated file and it contains vcpu = 1 which is correct.
  9. Update the config.env in customer github to set METRICS_UTILITY_USAGE_BASED_BILLING_ENABLED to true
  10. (migrate customer from old billing to usage billing scenario)
  11. Rerun python /metrics_utility/manage.py gather_automation_controller_billing_data --dry-run
  12. Check the generated file and it contains vcpu = 12 which is correct.
  13. The calculation details is displayed
  14. Delete the customer instance
  15. Change the config.env of the mgt by setting METRICS_UTILITY_USAGE_BASED_BILLING_ENABLED to true, configmap get updated and mgt restarted.
  16. Deploy a new customer instance with my own sre and model branch
  17. Check the metric configmap and METRICS_UTILITY_USAGE_BASED_BILLING_ENABLED is true and it is correct as the same env var is true in the mgt.
  18. Load my own metrics_utility image in the new cronjob.
  19. Run the cmd python /metrics_utility/manage.py gather_automation_controller_billing_data --dry-run
  20. Check the generated file and it contains vcpu = 12 which is correct.
  21. The calculation details is displayed
  22. Delete the customer instance

Expected Results

Required Actions

  • Requires documentation updates
  • Requires downstream repository changes
  • Requires infrastructure/deployment changes
  • Requires coordination with other teams
  • Blocked by PR/MR: #XXX

Self-Review Checklist

Code Quality

  • Code is properly linted and formatted.
  • All tests (existing and new) pass successfully.
  • Tests are added or updated as needed.
  • Code includes relevant comments for complex sections.
  • Changes are reviewed and approved by at least two team members.
  • Documentation is updated where applicable.
    - If updates are required in the handbook, create a separate PR for the handbook repo.

Notes for Reviewers

Add any additional context or instructions for reviewers here - for example screenshots if needed.

Comment thread pyproject.toml
Comment thread metrics_utility/management/validation.py Outdated
Comment thread metrics_utility/automation_controller_billing/collectors.py Outdated
Comment thread metrics_utility/automation_controller_billing/collectors.py Outdated
Comment thread metrics_utility/automation_controller_billing/collectors.py Outdated
Comment thread metrics_utility/automation_controller_billing/collectors.py
Comment thread metrics_utility/automation_controller_billing/collectors.py Outdated
Comment thread metrics_utility/management/validation.py Outdated
@itdove itdove marked this pull request as ready for review August 5, 2025 15:24
Comment thread metrics_utility/automation_controller_billing/collector.py
Comment thread metrics_utility/automation_controller_billing/collectors.py
@itdove itdove requested a review from MilanPospisil August 6, 2025 12:35
Comment thread metrics_utility/automation_controller_billing/collectors.py Outdated
Comment thread metrics_utility/base/README.md Outdated
Comment thread metrics_utility/test/gather/test_gather_ranges.py
Comment thread metrics_utility/automation_controller_billing/collectors.py Outdated
@itdove itdove requested a review from robinbobbitt August 6, 2025 15:52
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Aug 7, 2025

himdel added a commit to himdel/metrics-utility that referenced this pull request Aug 7, 2025
Copy link
Copy Markdown
Contributor

@MilanPospisil MilanPospisil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Do not merge it yet, wait for code freeze. This is not needed to be merged ASAP into the new release.

@MilanPospisil MilanPospisil merged commit 429816e into ansible:devel Aug 15, 2025
2 checks passed
MilanPospisil pushed a commit to MilanPospisil/metrics-utility that referenced this pull request Jan 6, 2026
)

* Add cls to json type

* Remove the classes to simplify code

* Add environment variable METRIC_UTILITY_ANSIBLE_SAAS_CLUSTER_NAME and METRIC_UTILITY_VCPU_COUNT_OVERWRITE

* ruff fix

* fix init timestamp

* Add get_mandatory_collectors

* Reformat collectors.py

* Use METRIC_UTILITY_VCPU_COUNT_ENABLED instead of METRIC_UTILITY_VCPU_COUNT_OVERWRITE

* Test

* Add clean METRICS_UTILITY_MANDATORY_COLLECTORS env var in test

* Add pytest

* fix ruff

* Reformat

* Log error instead of raising exception

* Fix tests ruff

* Raise exception on error

* Reformat

* ruff

* Add MANDATORY COLLECTORS validity

* Fix pytest

* Fix mock config

* Use metrics_utility.exceptions

* Rename METRICS_UTILITY_VCPU_COUNT_ENABLED to METRICS_UTILITY_USAGE_BASED_BILLING_ENABLED

* Update pytest

* Run ruff

* Rename METRICS_UTILITY_ANSIBLE_SAAS_CLUSTER_NAME to METRICS_UTILITY_CLUSTER_NAME

* Use logging

* comments

* Create a new logger

* Format

* Fix precheck

* with quote in precheck

* Fix test

* Use METRICS_UTILITY_DISABLE_JOB_HOST_SUMMARY_COLLECTOR

* reformat

* Remove METRICS_UTILITY_MANDATORY_COLLECTORS for validation

* Add doc in test_Containerfile

* add test for METRICS_UTILITY_DISABLE_JOB_HOST_SUMMARY_COLLECTOR

* Remove unused local varaible

* Add test if failed to get a kube client

* Fix imports

* Remove var VALID_MANDATORY_COLLECTORS

* Rename VALID_OPTIONAL_COLLECTORS to its original name

* ruff format

* safe_tarfile_member_check

* format

* Fix sonar

* fix sonarQube

* Add METRICS_UTILITY_COLLECTOR_LOCK_SUFFIX and METRICS_UTILITY_DISABLE_SAVE_LAST_GATHERED_ENTRIES

* REmove comments

* Deleting test_total_workers_vcpu_standalone.py... not needed

* Add limit_slicing

* Parenthesis

* Add macos instruction

* sort import

* Delete file at the end of each test

* Try again

* Comment assert

* revert

* comment tests

* Print

* Catch extra exception

* Fix exception

* Remove exception

* format

* fix assert ruff

---------

Signed-off-by: itdove <dvernier@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants