Implement polling-based BMC metrics collection#837

Open
stefanhipfel wants to merge 5 commits into
mainfrom
worktree-bmc-metrics

Conversation

@stefanhipfel

@stefanhipfel stefanhipfel commented Apr 24, 2026

Contributor

Closes #813

Summary by CodeRabbit

  • New Features

    • Added periodic polling to retrieve metrics and event logs from BMCs; configurable polling interval flag added.
  • Bug Fixes

    • Fixed JSON structure in mock server data.
  • Refactor

    • Replaced subscription-based event delivery with direct polling.
    • Converted metrics collector to a shared singleton for efficiency.
  • Tests

    • Added tests covering metric and event retrieval behavior.
  • Chores

    • Removed obsolete subscription-link status fields and deprecated HTTP event endpoints.
  • Documentation

    • Updated API docs to reflect removed status fields.

@stefanhipfel stefanhipfel requested a review from a team as a code owner April 24, 2026 23:21
@coderabbitai

coderabbitai Bot commented Apr 24, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fa462579-32c9-4dba-bcd9-538ec284c2b3

📥 Commits

Reviewing files that changed from the base of the PR and between 4e30b43 and 9a81d7e.

⛔ Files ignored due to path filters (1)
  • dist/chart/templates/crd/metal.ironcore.dev_bmcs.yaml is excluded by !**/dist/**
📒 Files selected for processing (16)
  • api/v1alpha1/applyconfiguration/api/v1alpha1/bmcstatus.go
  • api/v1alpha1/bmc_types.go
  • bmc/bmc.go
  • bmc/mock/server/data/index.json
  • bmc/redfish.go
  • bmc/redfish_metrics_test.go
  • cmd/main.go
  • config/crd/bases/metal.ironcore.dev_bmcs.yaml
  • docs/api-reference/api.md
  • internal/controller/bmc_controller.go
  • internal/controller/bmc_controller_test.go
  • internal/controller/suite_test.go
  • internal/serverevents/metrics.go
  • internal/serverevents/server.go
  • internal/serverevents/subscription.go
  • test/serverevents/main.go
💤 Files with no reviewable changes (9)
  • api/v1alpha1/bmc_types.go
  • api/v1alpha1/applyconfiguration/api/v1alpha1/bmcstatus.go
  • test/serverevents/main.go
  • internal/controller/bmc_controller.go
  • internal/serverevents/subscription.go
  • internal/controller/suite_test.go
  • docs/api-reference/api.md
  • config/crd/bases/metal.ironcore.dev_bmcs.yaml
  • internal/controller/bmc_controller_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • internal/serverevents/metrics.go
  • bmc/mock/server/data/index.json
  • bmc/redfish.go
  • bmc/bmc.go
  • bmc/redfish_metrics_test.go
  • internal/serverevents/server.go
  • cmd/main.go

📝 Walkthrough

Walkthrough

This PR implements polling-based BMC monitoring in place of subscription-based delivery: it removes subscription fields and helpers, adds BMC GetMetricReport/GetEventLog methods and types with Redfish implementations, refactors serverevents into a ticker-driven polling server with a shared Prometheus collector, and updates CLI, CRD, controller, tests, and docs.

Changes

Polling-Based Monitoring Transition

| Layer | File(s) | Summary |
|---|---|---|
| Data Shape / ApplyConfig | api/v1alpha1/bmc_types.go, api/v1alpha1/applyconfiguration/api/v1alpha1/bmcstatus.go | Remove metricsReportSubscriptionLink and eventsSubscriptionLink from BMCStatus and generated apply-configuration; remove related builder methods. |
| BMC Interface & Types | bmc/bmc.go | Add GetMetricReport() and GetEventLog() to BMC interface; introduce MetricsReport, MetricValue, and Event types. |
| Redfish Implementation | bmc/redfish.go | Implement GetMetricReport() to prefer TelemetryService and map metric entries; implement GetEventLog() to enumerate bootable Systems → LogServices → Entries and aggregate into Event records (10-minute cutoff when parse succeeds). |
| Mock & Tests | bmc/mock/server/data/index.json, bmc/redfish_metrics_test.go | Fix mock JSON trailing brace; add Ginkgo tests for GetMetricReport (TelemetryService present/absent) and GetEventLog (log services present/no systems). |
| CLI Wiring | cmd/main.go | Add --metrics-polling-interval flag (duration, default 120s; 0 disables); remove --event-url, --event-port, --event-protocol; conditionally start polling server runnable when interval > 0 with protocol/TLS/basic-auth options. |
| CRD & API Docs | config/crd/bases/metal.ironcore.dev_bmcs.yaml, docs/api-reference/api.md | Remove status.eventsSubscriptionLink and status.metricsReportSubscriptionLink from CRD status; add type to required fields in status.conditions[]; remove subscription fields from API docs and adjust status field ordering/enum. |
| Controller Cleanup | internal/controller/bmc_controller.go | Remove EventURL from BMCReconciler; delete reconcile/create and delete/cleanup subscription steps and related helper functions/imports. |
| Controller Tests & Suite | internal/controller/bmc_controller_test.go, internal/controller/suite_test.go | Remove assertions for Status.MetricsReportSubscriptionLink and Status.EventsSubscriptionLink; stop setting EventURL in test setup. |
| Prometheus Collector | internal/serverevents/metrics.go | Change collector to a package-level singleton initialized once via sync.Once; add GetCollector() export; retain update/collection behavior. |
| Polling Server Core | internal/serverevents/server.go | Replace HTTP /serverevents/ endpoints with a polling Server: add ServerConfig, refactor NewServer() to accept polling config, implement Start() as ticker-driven polling loop, maintain collector and per-BMC polling options. |
| Polling Orchestration | internal/serverevents/server.go | Add pollAllBMCs with bounded concurrency and pollBMC to fetch metrics/events, convert to server JSON structs, and update the shared collector only on non-empty results; per-BMC errors logged but do not abort cycles. |
| Subscription Removal | internal/serverevents/subscription.go, test/serverevents/main.go | Delete subscription helpers (SubscribeMetricsReport, SubscribeEvents) and the standalone test main used for the HTTP event server. |
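
The "Prometheus Collector" row describes a package-level singleton guarded by sync.Once. A minimal sketch of that pattern follows. Only the GetCollector() name comes from the summary above; the Collector struct, its Set/Get methods, and the map-based storage are hypothetical stand-ins for the real Prometheus collector, used so the example stays self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// Collector is a hypothetical stand-in for the shared metrics collector.
type Collector struct {
	mu     sync.Mutex
	values map[string]float64
}

var (
	collector     *Collector
	collectorOnce sync.Once
)

// GetCollector returns the process-wide collector, creating it on first use.
// sync.Once guarantees the initializer runs exactly once, even when several
// goroutines call GetCollector concurrently.
func GetCollector() *Collector {
	collectorOnce.Do(func() {
		collector = &Collector{values: make(map[string]float64)}
	})
	return collector
}

// Set records the latest value for a metric key.
func (c *Collector) Set(key string, v float64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[key] = v
}

// Get returns the latest value for a metric key, if any.
func (c *Collector) Get(key string) (float64, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.values[key]
	return v, ok
}

func main() {
	GetCollector().Set("bmc-a/Temp1", 41.5)
	v, ok := GetCollector().Get("bmc-a/Temp1")
	fmt.Println(GetCollector() == GetCollector(), v, ok)
}
```

Every caller sees the same instance, so per-BMC pollers can update one collector without passing it through each constructor.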

Sequence Diagram(s)

sequenceDiagram
  participant Manager
  participant PollingServer
  participant KubeClient
  participant BMC
  participant Collector
  Manager->>PollingServer: Start(ctx)
  PollingServer->>PollingServer: initialize ticker
  loop Every interval
    PollingServer->>KubeClient: List BMC CRs
    KubeClient-->>PollingServer: BMC list
    par Bounded concurrent polls
      PollingServer->>BMC: GetMetricReport()
      BMC-->>PollingServer: MetricsReport
      PollingServer->>Collector: UpdateFromMetricsReport()
      PollingServer->>BMC: GetEventLog()
      BMC-->>PollingServer: []Event
      PollingServer->>Collector: UpdateFromEvent()
    end
  end
  Manager->>PollingServer: ctx.Done()
  PollingServer-->>Manager: Stop

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

enhancement, highlight

Suggested reviewers

  • afritzler
  • asergeant01
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The pull request description is minimal ('Closes #813'), missing proposed changes, explanation of the implementation, and key architectural details required by the template. | Expand the description to follow the template: add 'Proposed Changes' section listing key changes (polling server, metrics/event retrieval, status field removal), explain the motivation, and reference issue #813 context. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Implement polling-based BMC metrics collection' accurately describes the main change: replacing event subscription-based monitoring with a polling mechanism for BMC metrics. |
| Linked Issues check | ✅ Passed | The PR fully implements the primary objective from #813: polling-based BMC metrics collection with embedded deployment, metrics/event retrieval, session pooling support via gofish client, vendor-aware Redfish handling, memory caching via Prometheus collector, and comprehensive tests. |
| Out of Scope Changes check | ✅ Passed | All code changes are within scope: removal of event subscription infrastructure, addition of polling mechanisms, metrics/event collection interfaces, Prometheus collector refactoring, and CRD/API updates directly align with #813 objectives. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%. |

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

@maxmoehl maxmoehl left a comment

Member

Overall the change looks good to me. Please also address the failing checks.

Comment thread bmc/redfish.go Outdated
return nil, fmt.Errorf("no systems found")
}

system := systems[0]

Member

In general we do support multiple systems, but since #825 we filter out non-bootable systems. This should be extended to include all bootable systems. A shared helper might make sense, so that the condition for a valid system is not duplicated if we extend it in the future.

/cc @afritzler
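
One way the shared helper could look is sketched below. The system struct is a hypothetical, trimmed-down stand-in for the real Redfish system type, and the bootability rule (a non-empty BootSourceOverrideTarget) is an assumption taken from the later commit message in this PR, not from the #825 code itself.

```go
package main

import "fmt"

// system is a hypothetical, trimmed-down view of a Redfish system.
type system struct {
	ID                       string
	BootSourceOverrideTarget string
}

// isBootable is the single place that defines a "valid" system.
// Assumption: a system counts as bootable when it reports a
// BootSourceOverrideTarget.
func isBootable(s system) bool {
	return s.BootSourceOverrideTarget != ""
}

// bootableSystems returns only the systems accepted by isBootable, so
// callers such as GetSystems and GetEventLog can share one condition.
func bootableSystems(all []system) []system {
	var out []system
	for _, s := range all {
		if isBootable(s) {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	systems := []system{
		{ID: "System0", BootSourceOverrideTarget: "Pxe"},
		{ID: "System1"},
		{ID: "System2", BootSourceOverrideTarget: "Hdd"},
	}
	for _, s := range bootableSystems(systems) {
		fmt.Println(s.ID)
	}
}
```

If the definition of a valid system changes later, only isBootable needs to be touched.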

Comment thread internal/serverevents/server.go Outdated
Comment thread internal/serverevents/server.go Outdated
for {
select {
case <-ticker.C:
s.pollAllBMCs(ctx)

Member

We should consider limiting how long a poll may run. If a poll is slow and the interval is short, we might overwhelm the BMCs. So something like this:

Suggested change
- s.pollAllBMCs(ctx)
+ s.pollAllBMCs(context.WithTimeout(ctx, s.interval))

But you will have to handle the cancel function properly, since context.WithTimeout returns it alongside the new context.

Contributor Author

done

@stefanhipfel stefanhipfel force-pushed the worktree-bmc-metrics branch from f237cc1 to 4e30b43 Compare May 11, 2026 07:34

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
bmc/redfish_metrics_test.go (1)

1-322: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix formatting issue flagged by pipeline.

The pipeline failure indicates uncommitted formatting changes detected after build. The spacing/alignment of @odata.id fields in the test's JSON mock responses differs from the expected format.

Run make lint-fix to apply the correct formatting before committing.

As per coding guidelines: "Run make lint-fix and make test after editing Go source files"

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bmc/redfish_metrics_test.go` around lines 1 - 322, The pipeline is failing
due to formatting differences in the test JSON responses (the "@odata.id"
keys/spacing) in the Redfish tests; open the test file and normalize formatting
for the JSON map literals (e.g., the map entries in the handlers that include
"@odata.id" such as in the handlers for "/redfish/v1/",
"/redfish/v1/TelemetryService", "/redfish/v1/TelemetryService/MetricReports",
"/redfish/v1/TelemetryService/MetricReports/Report1", and the various
"/redfish/v1/Systems..." handlers), then run the repository formatter/linter fix
command (make lint-fix) and re-run tests (make test) before committing to ensure
the formatting changes are applied and the pipeline passes.
🧹 Nitpick comments (2)
bmc/redfish.go (1)

1236-1237: 💤 Low value

Consider making the event log time window configurable.

The 10-minute cutoff is hardcoded. Depending on the polling interval (configurable via --metrics-polling-interval, default 120s), a 10-minute window may collect the same events repeatedly when the interval is short, or miss events when the interval is longer than the window. Consider making this configurable or deriving it from the polling interval (e.g., 2 * pollingInterval).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bmc/redfish.go` around lines 1236 - 1237, The hardcoded 10-minute cutoff
(cutoff := time.Now().Add(-10 * time.Minute)) should be made configurable or
derived from the polling interval; update the logic in bmc/redfish.go that
computes cutoff to use a configurable duration (e.g., add an EventLogWindow
setting) or compute window := 2 * metricsPollingInterval and then cutoff :=
time.Now().Add(-window). Modify the code paths that call this cutoff computation
(look for the cutoff variable and the function/method that filters recent
entries) to read the new config or polling interval value and validate a sane
minimum and maximum.
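
A sketch of the "derive it from the polling interval" variant, under stated assumptions: the function name eventLogWindow and the one-minute and one-hour bounds are hypothetical and only illustrate the "sane minimum and maximum" mentioned above.

```go
package main

import (
	"fmt"
	"time"
)

const (
	minEventWindow = time.Minute // hypothetical lower bound
	maxEventWindow = time.Hour   // hypothetical upper bound
)

// eventLogWindow derives the event log look-back window from the polling
// interval (twice the interval), clamped to [minEventWindow, maxEventWindow].
func eventLogWindow(pollingInterval time.Duration) time.Duration {
	window := 2 * pollingInterval
	if window < minEventWindow {
		return minEventWindow
	}
	if window > maxEventWindow {
		return maxEventWindow
	}
	return window
}

func main() {
	intervals := []time.Duration{10 * time.Second, 120 * time.Second, 2 * time.Hour}
	for _, interval := range intervals {
		window := eventLogWindow(interval)
		cutoff := time.Now().Add(-window)
		fmt.Printf("interval=%s window=%s cutoff=%s\n",
			interval, window, cutoff.Format(time.RFC3339))
	}
}
```

With the default 120s interval this gives a 4-minute window, so each event is seen in roughly two consecutive polls; deduplication would still be needed downstream.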
internal/serverevents/server.go (1)

127-149: ⚡ Quick win

Add context cancellation checks in polling loop.

When the context is cancelled (e.g., during shutdown), the spawned goroutines will continue running until pollBMC completes. This could delay shutdown if BMC operations are slow. Consider checking ctx.Done() before spawning each goroutine or passing the context check into the semaphore acquisition.

🛑 Proposed fix to respect context cancellation
 	for i := range bmcList.Items {
 		bmcObj := &bmcList.Items[i]
 
 		if !bmcObj.DeletionTimestamp.IsZero() {
 			continue
 		}
+
+		// Stop spawning goroutines once the context is cancelled. A plain
+		// break inside a select would only leave the select, not the loop.
+		if ctx.Err() != nil {
+			break
+		}
 
 		wg.Add(1)
 		go func(bmc *metalv1alpha1.BMC) {
 			defer wg.Done()
 
 			semaphore <- struct{}{}
 			defer func() { <-semaphore }()
 
 			s.pollBMC(ctx, bmc)
 		}(bmcObj)
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/serverevents/server.go` around lines 127 - 149, The polling loop
should respect context cancellation: before scheduling a goroutine for bmcObj
check ctx.Done()/ctx.Err() and skip starting the worker if cancelled, and inside
the goroutine use a cancellable semaphore acquire (select on sending to
semaphore vs <-ctx.Done()) so you can return early instead of blocking; keep
using wg to track started goroutines and ensure you call wg.Done() on early
exit. Specifically update the loop around wg, semaphore and the anonymous
goroutine that calls s.pollBMC(ctx, bmc) to perform the pre-spawn context check
and make semaphore acquisition context-aware so shutdown can proceed promptly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@bmc/redfish.go`:
- Around line 1241-1243: The code silently ignores errors from time.Parse when
converting entry.Created to entryTime (entryTime, _ = time.Parse(time.RFC3339,
entry.Created)), which lets malformed timestamps bypass the 10-minute filter;
update the parsing logic in the same function to check and handle the parse
error: call time.Parse and capture the error, and on error either log the
failure (using the existing logger) and skip the entry or treat the entry as
invalid by continuing to the next record; ensure you still use
entryTime.IsZero() only for genuinely empty timestamps and not for parse
failures so the 10-minute filter behaves correctly.
- Around line 1176-1187: Update the loop that builds your local MetricValue
slice from report.MetricValues to extract structured fields instead of
stringifying: for each entry in report.MetricValues, attempt to read the metric
identifier, property path, and value (e.g., use report.MetricId / MetricID,
report.MetricProperty, and report.MetricValue or the corresponding keys on the
reported element) and populate MetricID, MetricProperty and MetricValue with
those extracted values (fall back to the current fmt.Sprintf behavior only if
the typed fields are absent); modify the code inside the existing for i := 0; i
< len(report.MetricValues); i++ loop (and the MetricValue struct population) to
prefer the real fields from the gofish MetricReport element instead of hardcoded
"Metric%d", report.ODataID, and fmt.Sprintf("%v", ...).

---


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9e4c9647-59cc-4742-92a4-80dd82bcd3c0

📥 Commits

Reviewing files that changed from the base of the PR and between bd30487 and 4e30b43.

📒 Files selected for processing (14)
  • api/v1alpha1/bmc_types.go
  • bmc/bmc.go
  • bmc/mock/server/data/index.json
  • bmc/redfish.go
  • bmc/redfish_metrics_test.go
  • cmd/main.go
  • config/crd/bases/metal.ironcore.dev_bmcs.yaml
  • internal/controller/bmc_controller.go
  • internal/controller/bmc_controller_test.go
  • internal/controller/suite_test.go
  • internal/serverevents/metrics.go
  • internal/serverevents/server.go
  • internal/serverevents/subscription.go
  • test/serverevents/main.go
💤 Files with no reviewable changes (7)
  • internal/controller/suite_test.go
  • api/v1alpha1/bmc_types.go
  • test/serverevents/main.go
  • internal/serverevents/subscription.go
  • config/crd/bases/metal.ironcore.dev_bmcs.yaml
  • internal/controller/bmc_controller_test.go
  • internal/controller/bmc_controller.go

Comment thread bmc/redfish.go
Comment thread bmc/redfish.go Outdated
stefanhipfel added a commit that referenced this pull request May 11, 2026
- Implement multi-system support in GetEventLog: now queries all bootable
  systems (those with BootSourceOverrideTarget) instead of just the first
  system, consistent with GetSystems filtering from PR #825
- Simplify goroutine in pollAllBMCs: move WaitGroup and semaphore
  management into pollBMC function signature
- Add poll timeout protection: wrap pollAllBMCs calls with
  context.WithTimeout to prevent slow polls from overwhelming BMCs
- Fix test mock to include BootSourceOverrideTarget field
- Fix linter errors: explicitly ignore json.Encode errors in test mocks

Addresses review comments from @maxmoehl on PR #837.

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
@stefanhipfel stefanhipfel force-pushed the worktree-bmc-metrics branch from 4e30b43 to e138e0e Compare May 11, 2026 08:04
@github-actions github-actions Bot added the documentation Improvements or additions to documentation label May 11, 2026
@stefanhipfel

Contributor Author

Fixed metric value extraction to use structured fields from Redfish MetricReport instead of hardcoded values.

Changes:

  • Extract real MetricID (e.g., "Temp1", "Fan1") instead of generic "Metric0", "Metric1"
  • Extract MetricProperty paths instead of using report-level ODataID
  • Extract MetricValue as string instead of stringifying entire object
  • Extract actual Timestamp instead of overwriting with current time
  • Defensive fallbacks preserve backward compatibility

All tests pass (26/26) and no lint issues.
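
The extraction with defensive fallbacks listed above could look roughly like the sketch below. It is not the PR's code: reportedValue is a hypothetical stand-in for one entry of the gofish MetricReport, toMetricValues is an invented name, and the sample paths and values in main are made up for illustration. The fallback choices mirror the previous behavior described in the review (generic "Metric%d" IDs and the report-level ODataID).

```go
package main

import "fmt"

// reportedValue is a hypothetical stand-in for one MetricReport entry.
type reportedValue struct {
	MetricID       string
	MetricProperty string
	MetricValue    string
	Timestamp      string
}

// MetricValue mirrors the fields named in the list above.
type MetricValue struct {
	MetricID       string
	MetricProperty string
	MetricValue    string
	Timestamp      string
}

// toMetricValues prefers the structured fields of each reported entry and
// falls back to generic values only when a field is missing.
func toMetricValues(reportODataID, fallbackTimestamp string, reported []reportedValue) []MetricValue {
	out := make([]MetricValue, 0, len(reported))
	for i, r := range reported {
		mv := MetricValue{
			MetricID:       r.MetricID,
			MetricProperty: r.MetricProperty,
			MetricValue:    r.MetricValue,
			Timestamp:      r.Timestamp,
		}
		if mv.MetricID == "" {
			mv.MetricID = fmt.Sprintf("Metric%d", i)
		}
		if mv.MetricProperty == "" {
			mv.MetricProperty = reportODataID
		}
		if mv.Timestamp == "" {
			mv.Timestamp = fallbackTimestamp
		}
		out = append(out, mv)
	}
	return out
}

func main() {
	// Sample data only; the property path and values are invented.
	values := toMetricValues(
		"/redfish/v1/TelemetryService/MetricReports/Report1",
		"2026-05-11T08:00:00Z",
		[]reportedValue{
			{MetricID: "Temp1", MetricProperty: "/sample/property/path", MetricValue: "41", Timestamp: "2026-05-11T07:59:30Z"},
			{MetricValue: "3000"},
		},
	)
	for _, v := range values {
		fmt.Printf("%+v\n", v)
	}
}
```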

Closes #813

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
Remove subscription link fields from generated manifests and docs.
Fix whitespace formatting in test file.

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
- Implement multi-system support in GetEventLog: now queries all bootable
  systems (those with BootSourceOverrideTarget) instead of just the first
  system, consistent with GetSystems filtering from PR #825
- Simplify goroutine in pollAllBMCs: move WaitGroup and semaphore
  management into pollBMC function signature
- Add poll timeout protection: wrap pollAllBMCs calls with
  context.WithTimeout to prevent slow polls from overwhelming BMCs
- Fix test mock to include BootSourceOverrideTarget field
- Fix linter errors: explicitly ignore json.Encode errors in test mocks

Addresses review comments from @maxmoehl on PR #837.

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
Replace hardcoded metric identifiers with actual field extraction from
gofish MetricReport. Now extracts MetricID, MetricProperty, MetricValue,
and Timestamp from Redfish response instead of using generic values.
Preserves backward compatibility with defensive fallbacks for missing fields.

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
Regenerate BMCStatus applyconfiguration to remove subscription link fields
that were removed in earlier commits of this PR.

Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
@stefanhipfel stefanhipfel force-pushed the worktree-bmc-metrics branch from 7ddbf85 to 9a81d7e Compare May 11, 2026 14:17
@stefanhipfel

Contributor Author

Rebased on latest main to fix codegen check.

The branch was rebased on main (which includes PR #860 for applyconfiguration generation) and the generated applyconfiguration files have been updated to reflect the removal of subscription link fields from BMCStatus.

All checks should now pass:

  • ✅ Code generation (applyconfiguration updated)
  • ✅ All tests passing (26/26)
  • ✅ No lint issues

@asergeant01

Contributor

I'm not sure I agree with the decision to remove the existing Event Subscription - is there a reason? As far as I am concerned the implementation of it is correct and it is a more desirable way to get metrics than polling. I do acknowledge the problem that vendors are not supporting it very well, but that is a vendor problem and not ours. If a user has a server that implements it correctly we are effectively stopping them from using a better way and therefore I personally would keep it as a configurable option.

Control over telemetry streaming is also going to be improved in Redfish, judging by the work-in-progress document "Redfish Telemetry Streaming and Reporting Bundle" and the Open Compute Project conference talk.

@asergeant01 asergeant01 added this to the v0.6.0 milestone May 12, 2026
@stefanhipfel

Contributor Author

> I'm not sure I agree with the decision to remove the existing Event Subscription - is there a reason? As far as I am concerned the implementation of it is correct and it is a more desirable way to get metrics than polling. I do acknowledge the problem that vendors are not supporting it very well, but that is a vendor problem and not ours. If a user has a server that implements it correctly we are effectively stopping them from using a better way and therefore I personally would keep it as a configurable option.
>
> Control over telemetry streaming is also going to be improved in Redfish, judging by the work-in-progress document "Redfish Telemetry Streaming and Reporting Bundle" and the Open Compute Project conference talk.

Maybe we can discuss this briefly in our next meeting, so I know how to move forward.
FYI @afritzler

Labels

api-change area/metal-automation documentation Improvements or additions to documentation size/XXL

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Support BMC monitoring without relying on event subscriptions

4 participants