chore: Ensure we use structured logging when returning disruption commands #1998

@jonathan-innis jonathan-innis commented Feb 16, 2025

Fixes #N/A

Description

This change moves the structured data that was previously embedded in the disruption log message into dedicated structured fields. This should make the data easier to consume with logging solutions that automatically parse out JSON keys.
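As a rough illustration of the idea, the sketch below emits a disruption entry with its data in fields rather than in the message string. It uses Go's standard `log/slog` JSON handler purely for self-containment; that is an assumption, not the logger this project uses, and `slog` writes the message under `msg` rather than `message`. The `candidate` type and the `logCommand` and `renderFields` helpers are hypothetical names; only the field keys are taken from the sample output in this PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"log/slog"
	"os"
)

// candidate is a hypothetical stand-in for a disruption candidate. It is not
// the project's real type; the fields only mirror keys seen in the sample logs.
type candidate struct {
	Node, NodeClaim, CapacityType, InstanceType string
}

// logCommand emits one entry whose data lives in fields rather than in the
// message string. The key names are copied from the sample output in this PR.
func logCommand(logger *slog.Logger, decision, reason string, podCount int, candidates []candidate) {
	nodes := make([]map[string]any, 0, len(candidates))
	for _, c := range candidates {
		nodes = append(nodes, map[string]any{
			"Node":          map[string]string{"name": c.Node},
			"NodeClaim":     map[string]string{"name": c.NodeClaim},
			"capacity-type": c.CapacityType,
			"instance-type": c.InstanceType,
		})
	}
	logger.Info("disrupting node(s)",
		"reason", reason,
		"decision", decision,
		"candidate-count", len(candidates),
		"pod-count", podCount,
		"candidate-nodes", nodes,
	)
}

// renderFields logs into a buffer and decodes the line back into a map,
// which is roughly what a JSON-aware log pipeline does on ingestion.
func renderFields(decision, reason string, podCount int, candidates []candidate) (map[string]any, error) {
	var buf bytes.Buffer
	logCommand(slog.New(slog.NewJSONHandler(&buf, nil)), decision, reason, podCount, candidates)
	var fields map[string]any
	err := json.Unmarshal(buf.Bytes(), &fields)
	return fields, err
}

func main() {
	// The node and NodeClaim names here are made up for the example.
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	logCommand(logger, "delete", "empty", 0, []candidate{
		{Node: "node-a", NodeClaim: "default-aaaaa", CapacityType: "spot", InstanceType: "t3.small"},
	})
}
```

Because every value is a separate key, a consumer can filter on `decision` or `candidate-count` directly instead of pattern-matching the message text.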

Before Change

{"level":"INFO","time":"2025-02-17T02:12:16.143Z","logger":"controller","message":"disrupting nodeclaim(s) via delete, terminating 4 nodes (0 pods) ip-192-168-162-8.us-west-2.compute.internal/t3.small/spot, ip-192-168-81-178.us-west-2.compute.internal/t3.small/spot, ip-192-168-190-173.us-west-2.compute.internal/t3.small/spot, ip-192-168-87-10.us-west-2.compute.internal/t3.small/spot","commit":"058c665","controller":"disruption","namespace":"","name":"","reconcileID":"7a30809c-4b08-4c26-b62d-d2528fe61e46","command-id":"15f90d03-1b0a-4eaf-b194-756fa6f03d84","reason":"empty"}

After Change

# Disrupting 3 nodes at once
{"level":"INFO","time":"2025-02-17T17:34:17.498Z","logger":"controller","caller":"disruption/controller.go:193","message":"disrupting node(s)","commit":"a79abac","controller":"disruption","namespace":"","name":"","reconcileID":"bd34251d-94c6-46d4-b816-6e807f275d2b","command-id":"5d70ee21-9a00-4e73-b2bd-1829ff87f3c9","reason":"empty","decision":"delete","candidate-count":3,"replacement-count":0,"pod-count":0,"candidate-nodes":[{"Node":{"name":"ip-192-168-118-164.us-west-2.compute.internal"},"NodeClaim":{"name":"default-v9skf"},"capacity-type":"spot","instance-type":"t3.small"},{"Node":{"name":"ip-192-168-11-190.us-west-2.compute.internal"},"NodeClaim":{"name":"default-n8wkq"},"capacity-type":"spot","instance-type":"t3.small"},{"Node":{"name":"ip-192-168-31-254.us-west-2.compute.internal"},"NodeClaim":{"name":"default-hjwnr"},"capacity-type":"spot","instance-type":"t2.medium"}],"replacement-nodes":[]}
{"level":"DEBUG","time":"2025-02-17T17:34:18.009Z","logger":"controller","caller":"singleton/controller.go:26","message":"command succeeded","commit":"a79abac","controller":"disruption.queue","namespace":"","name":"","reconcileID":"9f732d66-21be-41a6-98a4-73ae1b14567a","command-id":"5d70ee21-9a00-4e73-b2bd-1829ff87f3c9","reason":"Empty","decision":"delete","candidate-count":3,"replacement-count":0,"candidate-nodes":[{"Node":{"name":"ip-192-168-118-164.us-west-2.compute.internal"},"NodeClaim":{"name":"default-v9skf"}},{"Node":{"name":"ip-192-168-11-190.us-west-2.compute.internal"},"NodeClaim":{"name":"default-n8wkq"}},{"Node":{"name":"ip-192-168-31-254.us-west-2.compute.internal"},"NodeClaim":{"name":"default-hjwnr"}}],"replacement-nodes":[]}


# Disrupting 2 nodes at once
{"level":"INFO","time":"2025-02-17T17:34:32.608Z","logger":"controller","caller":"disruption/controller.go:193","message":"disrupting node(s)","commit":"a79abac","controller":"disruption","namespace":"","name":"","reconcileID":"13f1d3f7-a7c3-4a1b-af29-4d857e29f706","command-id":"1135755a-5d5e-467d-8322-cb5f2375a148","reason":"empty","decision":"delete","candidate-count":2,"replacement-count":0,"pod-count":0,"candidate-nodes":[{"Node":{"name":"ip-192-168-125-67.us-west-2.compute.internal"},"NodeClaim":{"name":"default-22hqv"},"capacity-type":"spot","instance-type":"t3.small"},{"Node":{"name":"ip-192-168-106-221.us-west-2.compute.internal"},"NodeClaim":{"name":"default-pwzrz"},"capacity-type":"spot","instance-type":"t2.medium"}],"replacement-nodes":[]}
{"level":"DEBUG","time":"2025-02-17T17:34:33.048Z","logger":"controller","caller":"singleton/controller.go:26","message":"command succeeded","commit":"a79abac","controller":"disruption.queue","namespace":"","name":"","reconcileID":"781922b1-44eb-4e05-a790-a1cff2dd5481","command-id":"1135755a-5d5e-467d-8322-cb5f2375a148","reason":"Empty","decision":"delete","candidate-count":2,"replacement-count":0,"candidate-nodes":[{"Node":{"name":"ip-192-168-125-67.us-west-2.compute.internal"},"NodeClaim":{"name":"default-22hqv"}},{"Node":{"name":"ip-192-168-106-221.us-west-2.compute.internal"},"NodeClaim":{"name":"default-pwzrz"}}],"replacement-nodes":[]}
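To show what the new shape makes possible on the consuming side, here is a minimal sketch that decodes these entries with Go's `encoding/json` and pairs each "disrupting node(s)" entry with its "command succeeded" entry via `command-id`. The struct and function names (`commandLog`, `parseCommandLog`, `succeededIDs`) are hypothetical, and the two sample constants are trimmed copies of the two-node entries above that keep only the keys the sketch reads.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// named matches the {"name": "..."} objects under "Node" and "NodeClaim".
type named struct {
	Name string `json:"name"`
}

// candidateNode matches one element of the "candidate-nodes" array.
type candidateNode struct {
	Node         named  `json:"Node"`
	NodeClaim    named  `json:"NodeClaim"`
	CapacityType string `json:"capacity-type"`
	InstanceType string `json:"instance-type"`
}

// commandLog holds the subset of keys this sketch reads; other keys are ignored.
type commandLog struct {
	Level          string          `json:"level"`
	Message        string          `json:"message"`
	CommandID      string          `json:"command-id"`
	Reason         string          `json:"reason"`
	Decision       string          `json:"decision"`
	CandidateCount int             `json:"candidate-count"`
	CandidateNodes []candidateNode `json:"candidate-nodes"`
}

// Trimmed copies of the two-node sample entries from this PR.
const sampleInfo = `{"level":"INFO","message":"disrupting node(s)","command-id":"1135755a-5d5e-467d-8322-cb5f2375a148","reason":"empty","decision":"delete","candidate-count":2,"candidate-nodes":[{"Node":{"name":"ip-192-168-125-67.us-west-2.compute.internal"},"NodeClaim":{"name":"default-22hqv"},"capacity-type":"spot","instance-type":"t3.small"},{"Node":{"name":"ip-192-168-106-221.us-west-2.compute.internal"},"NodeClaim":{"name":"default-pwzrz"},"capacity-type":"spot","instance-type":"t2.medium"}]}`

const sampleDebug = `{"level":"DEBUG","message":"command succeeded","command-id":"1135755a-5d5e-467d-8322-cb5f2375a148","reason":"Empty","decision":"delete","candidate-count":2}`

// parseCommandLog decodes a single JSON log line.
func parseCommandLog(line string) (commandLog, error) {
	var entry commandLog
	err := json.Unmarshal([]byte(line), &entry)
	return entry, err
}

// succeededIDs returns the command IDs that have a "command succeeded" entry.
func succeededIDs(entries []commandLog) map[string]bool {
	ids := map[string]bool{}
	for _, e := range entries {
		if e.Message == "command succeeded" {
			ids[e.CommandID] = true
		}
	}
	return ids
}

func main() {
	var entries []commandLog
	for _, line := range []string{sampleInfo, sampleDebug} {
		entry, err := parseCommandLog(line)
		if err != nil {
			fmt.Println("skipping unparseable line:", err)
			continue
		}
		entries = append(entries, entry)
	}
	done := succeededIDs(entries)
	for _, e := range entries {
		if e.Message != "disrupting node(s)" {
			continue
		}
		fmt.Printf("command %s (%s, reason=%s) succeeded=%t\n", e.CommandID, e.Decision, e.Reason, done[e.CommandID])
		for _, n := range e.CandidateNodes {
			fmt.Printf("  %s %s/%s\n", n.Node.Name, n.InstanceType, n.CapacityType)
		}
	}
}
```

The sketch joins on `command-id` rather than `reason` because the samples spell the reason differently in the two entries (`empty` in the INFO line, `Empty` in the DEBUG line).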

How was this change tested?

make presubmit

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@k8s-ci-robot

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress) and cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) labels Feb 16, 2025
@k8s-ci-robot k8s-ci-robot added the approved (indicates a PR has been approved by an approver from all required OWNERS files) and size/L (denotes a PR that changes 100-499 lines, ignoring generated files) labels Feb 16, 2025

coveralls commented Feb 16, 2025

Pull Request Test Coverage Report for Build 13442769211

Details

  • 60 of 68 (88.24%) changed or added relevant lines in 13 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.09%) to 81.467%

Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
pkg/controllers/disruption/multinodeconsolidation.go | 1 | 2 | 50.0%
pkg/controllers/nodeclaim/hydration/controller.go | 1 | 3 | 33.33%
pkg/controllers/provisioning/provisioner.go | 5 | 7 | 71.43%
pkg/scheduling/volumeusage.go | 0 | 3 | 0.0%

Totals
Change from base Build 13442558096: 0.09%
Covered Lines: 9262
Relevant Lines: 11369

💛 - Coveralls

@jonathan-innis jonathan-innis force-pushed the structured-disruption-logging branch 4 times, most recently from 2542084 to 84d0971 on February 17, 2025 02:40
@jonathan-innis jonathan-innis marked this pull request as ready for review February 17, 2025 17:37
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress) label Feb 17, 2025
@jonathan-innis jonathan-innis force-pushed the structured-disruption-logging branch from 84d0971 to a579ac3 on February 17, 2025 17:40
@jonathan-innis jonathan-innis force-pushed the structured-disruption-logging branch from a579ac3 to 7a90cc7 on February 20, 2025 18:53
@k8s-ci-robot k8s-ci-robot added the needs-rebase (indicates a PR cannot be merged because it has merge conflicts with HEAD) label Feb 20, 2025
@jonathan-innis jonathan-innis force-pushed the structured-disruption-logging branch from 7a90cc7 to b6860c0 on February 20, 2025 18:59
@k8s-ci-robot k8s-ci-robot removed the needs-rebase (indicates a PR cannot be merged because it has merge conflicts with HEAD) label Feb 20, 2025
@jonathan-innis jonathan-innis force-pushed the structured-disruption-logging branch from b6860c0 to 2be6773 on February 20, 2025 19:03
@jonathan-innis jonathan-innis force-pushed the structured-disruption-logging branch from 2be6773 to a478230 on February 20, 2025 19:05

@rschalo rschalo left a comment

/lgtm

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jonathan-innis, rschalo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the lgtm ("Looks good to me"; indicates that a PR is ready to be merged) label Feb 20, 2025
@k8s-ci-robot k8s-ci-robot merged commit c0e7299 into kubernetes-sigs:main Feb 20, 2025
14 checks passed
4 participants