[FEA]: Allow Tenants to learn if Machines associated with Instances have health issue/needs evacuation #178

@thossain-nv

Description

Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

High

Please provide a clear description of problem this feature solves

Machine health alerts are currently not exposed to Tenants, since they may contain alerts that are not meant for Tenant consumption or may include internal Provider notes.

However, Tenants should be able to programmatically learn whether the underlying Machine for an Instance has hardware issues and will need to be taken offline.

Feature Description

Allow Tenants to receive more information about the health state of the Machines associated with their Instances through the Machine or Instance API endpoint.

Describe your ideal solution

There are a few possible solutions:

  • Selectively expose Machine health alerts to Tenants over the Machine endpoint (right now they don't see any alerts because the alerts might contain internal info)
  • Expose health alerts on the Instance endpoint
  • Expose a flag on the Instance that indicates that the Tenant should evacuate the Machine
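
To make the options above concrete, here is a minimal sketch of how the Tenant-facing view could be derived from Machine health alerts. Everything in it is an assumption for illustration: the `HealthAlert` type, its field names (`tenant_visible`, `requires_evacuation`, `internal_notes`), and the shape of the returned payload are hypothetical and do not reflect Carbide's actual data model or API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class HealthAlert:
    """Hypothetical Machine health alert; all field names are assumptions."""
    id: str
    message: str
    tenant_visible: bool = False       # Provider opts in per alert
    requires_evacuation: bool = False  # Machine will need to be taken offline
    internal_notes: str = ""           # Provider-only, never sent to Tenants


def tenant_alerts(alerts: List[HealthAlert]) -> List[Dict[str, str]]:
    """Options 1 and 2: keep only opted-in alerts and drop Provider notes."""
    return [
        {"id": a.id, "message": a.message}
        for a in alerts
        if a.tenant_visible
    ]


def needs_evacuation(alerts: List[HealthAlert]) -> bool:
    """Option 3: one boolean derived from all alerts, visible or not."""
    return any(a.requires_evacuation for a in alerts)


def instance_health_view(alerts: List[HealthAlert]) -> dict:
    """Combine the filtered alerts and the flag into a Tenant-facing payload."""
    return {
        "alerts": tenant_alerts(alerts),
        "needs_evacuation": needs_evacuation(alerts),
    }


if __name__ == "__main__":
    example = [
        HealthAlert("a1", "Hardware fault detected",
                    tenant_visible=True, requires_evacuation=True),
        HealthAlert("a2", "Internal follow-up", internal_notes="Provider only"),
    ]
    print(instance_health_view(example))
```

In this sketch the evacuation flag is computed from every alert, including those hidden from the Tenant, so the third option can signal a problem without leaking any internal detail. Whether that flag belongs on the Instance or the Machine resource is left open, as in the list above.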

Describe any alternatives you have considered

No response

Additional context

No response

Code of Conduct

  • I agree to follow Carbide's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request

Metadata

Assignees

Labels

feature: Feature (deprecated - use issue type, but it's needed for reporting now)

Projects

Status

Needs Design

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
