OCM-24713 | feat(machine-pool): add node_drain_grace_period attribute by amandahla · Pull Request #145 · terraform-redhat/terraform-rhcs-rosa-hcp

amandahla · 2026-05-15T19:32:16Z

PR Summary

Adds an optional node_drain_grace_period field to aws_node_pool in the machine-pool module and raises the minimum rhcs provider constraint to >= 1.7.7, the first version that supports the attribute.

Detailed Description of the Issue

rhcs_hcp_machine_pool exposes node_drain_grace_period (minutes) since provider v1.7.7, but the machine-pool module's aws_node_pool variable did not surface it, leaving consumers unable to control how long the provider waits before forcibly terminating draining nodes. This PR wires the field through, validates the accepted range (0–10080), and bumps both the module and root provider minimum constraints from >= 1.7.3/1.7.6 to >= 1.7.7.

Also included: tooling fixes to Makefile and scripts/verify-gen.sh that make the make verify-gen and make lint targets more robust in isolated CI environments (Prow). They replace git-status-based doc drift detection with SHA-based hashing and isolate terraform init to a temp directory.

Related Issues and PRs

Jira: OCM-24713
Fixes: #
Related PR(s):
Related design/docs:

Type of Change

Previous Behavior

aws_node_pool had no node_drain_grace_period field. The rhcs provider used its own default drain timeout; module consumers had no way to override it.

Behavior After This Change

aws_node_pool.node_drain_grace_period (optional number) accepts 0–10080 minutes and is passed directly to rhcs_hcp_machine_pool. Omitting it preserves the provider default. Values outside [0, 10080] fail terraform plan with a validation error.
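
A minimal sketch of what the described behavior could look like in modules/machine-pool/variables.tf. It is not the PR's actual code: the object type is trimmed to the fields that appear in the test snippets later in this thread, and the error message wording is an assumption.

```hcl
# Sketch only: the real aws_node_pool object has more fields than shown here.
variable "aws_node_pool" {
  type = object({
    instance_type           = string
    tags                    = optional(map(string))
    node_drain_grace_period = optional(number) # minutes; null keeps the provider default
  })

  validation {
    # Skip the range check when the field is omitted (null).
    condition = var.aws_node_pool.node_drain_grace_period == null ? true : (
      var.aws_node_pool.node_drain_grace_period >= 0 &&
      var.aws_node_pool.node_drain_grace_period <= 10080
    )
    error_message = "node_drain_grace_period must be between 0 and 10080 minutes (7 days)."
  }
}
```

With a block like this, a value such as 10081 would be rejected during terraform plan, before any provider call is made.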

How to Test (Step-by-Step)

Preconditions

Terraform >= 1.5.7
rhcs provider >= 1.7.7
RHCS_TOKEN and AWS credentials set

Test Steps

make pre-push-checks
cd modules/machine-pool && terraform test (runs aws_node_pool.tftest.hcl including the three new node_drain_grace_period run blocks)
In a live environment: deploy a machine pool with node_drain_grace_period = 60 and confirm the value appears in terraform show output.
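
For the live-environment step, a module call along these lines could be used. This is a hedged sketch: the input names are taken from the test snippets in this thread, the pool name and the two wrapper variables are hypothetical, and the real module may require further inputs (for example replica or autoscaling settings) that are not shown.

```hcl
# Hypothetical wrapper variables; supply real values for a live cluster.
variable "cluster_id" {
  type = string
}

variable "subnet_id" {
  type = string
}

module "machine_pool" {
  source = "./modules/machine-pool" # path relative to the repository root

  cluster_id        = var.cluster_id
  name              = "drain-test" # hypothetical pool name
  subnet_id         = var.subnet_id
  openshift_version = "4.15.0"

  aws_node_pool = {
    instance_type           = "m5.xlarge"
    tags                    = {}
    node_drain_grace_period = 60 # minutes
  }
}
```

After terraform apply, the terraform show output should list node_drain_grace_period with the configured value.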

Expected Results

All unit tests pass, including invalid_node_drain_grace_period_fails, valid_node_drain_grace_period_plan, and node_drain_grace_period_null_plan.
Setting node_drain_grace_period = 10081 fails plan validation.
Live deployment reflects the configured drain timeout.

Proof of the Fix

Screenshots:
Videos:
Logs/CLI output:
Other artifacts:

Breaking Changes

No breaking changes
Yes, this PR introduces a breaking change (describe impact and migration plan below)

Breaking Change Details / Migration Plan

The minimum rhcs provider constraint is raised from >= 1.7.3 (module) / >= 1.7.6 (root) to >= 1.7.7. Consumers pinned below 1.7.7 must upgrade their provider before applying. The aws_node_pool interface change is additive (new optional field with null default).
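
For consumers, the migration amounts to a provider constraint like the following. This is a sketch of a typical required_providers block, not a copy of the repository's versions.tf.

```hcl
terraform {
  required_providers {
    rhcs = {
      source  = "terraform-redhat/rhcs"
      version = ">= 1.7.7" # first release that supports node_drain_grace_period
    }
  }
}
```

Running terraform init -upgrade afterwards selects a provider release that satisfies the new constraint.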

Developer Verification Checklist

Summary by CodeRabbit

Release Notes

New Features
- Added support for configuring node drain grace period in machine pools.
- Added capacity reservation preference option for AWS node pools.
Tests
- Added test coverage for node drain grace period behavior.
Chores
- Updated minimum rhcs provider version to 1.7.7.
- Reordered pre-push validation checks.

coderabbitai · 2026-05-15T19:32:27Z

Walkthrough

This PR extends the aws_node_pool schema with an optional node_drain_grace_period field, adds validation rules and test coverage, updates documentation and provider version requirements, and reorders the Makefile pre-push-checks target for faster feedback iteration.

Changes

aws_node_pool schema extensions

Layer / File(s)	Summary
Variable schema and validation `variables.tf`, `modules/machine-pool/variables.tf`	Root and module `aws_node_pool` objects gain optional `node_drain_grace_period` field. Module-level validation enforces 0–10080 minute range, integer values, and non-negative constraint with distinct error messages.
node_drain_grace_period test coverage `modules/machine-pool/tests/aws_node_pool.tftest.hcl`	Mock provider defaults set `node_drain_grace_period` to null. Three plan-run tests verify validation failure at 10081, success and wiring at 60, and explicit null wiring.
Documentation and provider version `modules/machine-pool/README.md`, `README.md`, `modules/machine-pool/versions.tf`	Module README documents new field and updates rhcs provider requirement from ≥1.7.3 to ≥1.7.7. Root README extends `aws_node_pool` schema documentation with `capacity_reservation_preference` field.

CI/Build tooling

Layer / File(s)	Summary
pre-push-checks target reordering `Makefile`	Reordered recipe execution: `license-check` and `docs-lint` run first for rapid feedback, followed by `verify-gen`, `unit-tests`, `lint`, with `verify` last.

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

lgtm

Suggested reviewers

gdbranco

🚥 Pre-merge checks | ✅ 6

✅ Passed checks (6 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly identifies the main feature addition: the new node_drain_grace_period attribute for the machine-pool module, directly corresponding to the primary changes in the PR.
Description check	✅ Passed	The description follows the template structure, covers the problem (missing field exposure), why it's needed (consumer control), what changed (field addition and provider bump), and testing approach. Most required sections are completed.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Pr Checklist Claims Vs Evidence (Generic)	✅ Passed	All 6 checked items satisfied: commit format correct, detailed description present, Jira issue linked, tests added, documentation updated, migration plan documented.


coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

variables.tf (2)
415-415: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document node_drain_grace_period range and requirements in the variable description.

The machine_pools description doesn't mention the valid range (0–10080 minutes) or provider requirements for the new node_drain_grace_period field. Users will encounter validation errors without clear guidance from the variable documentation. As per coding guidelines, document minimum OpenShift version requirements in the variable description when a feature needs a specific minimum version.

Consider updating the description to include the valid range and any version/provider requirements mentioned in the PR summary (RHCS provider >= 1.7.6).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@variables.tf` at line 415, Update the machine_pools variable description to
document the new node_drain_grace_period field: state that
node_drain_grace_period is specified in minutes and must be within the range
0–10080, indicate whether 0 disables graceful drain, and add provider/version
requirements (RHCS provider >= 1.7.6) plus the minimum OpenShift version needed
for this feature; reference the machine_pools variable and the
node_drain_grace_period field in the description so users get validation
guidance and required versions.
383-416: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add root-level validation for node_drain_grace_period to catch errors early.

The validation for node_drain_grace_period (0–10080 range) exists only at the module level. Users won't see validation errors until module instantiation, which delays feedback. As per coding guidelines, add root validation blocks for cross-field map validation rules that users hit early; child modules may keep lifecycle precondition as a second line of defense.
🛡️ Proposed validation block to add
   default     = {}
   description = "Provides a typed map to add multiple machine pools after cluster creation. Each key is an arbitrary label; each value aligns with the [machine-pool](./modules/machine-pool) submodule (required: name, subnet_id, openshift_version, aws_node_pool). Optional fields match that module's optional inputs; omit autoscaling to use a fixed replica count with autoscaling disabled."
+
+  validation {
+    condition = alltrue([
+      for _, mp in var.machine_pools :
+      mp.aws_node_pool.node_drain_grace_period == null ? true : (
+        mp.aws_node_pool.node_drain_grace_period >= 0 &&
+        mp.aws_node_pool.node_drain_grace_period <= 10080
+      )
+    ])
+    error_message = "node_drain_grace_period must be between 0 and 10080 minutes (7 days)."
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@variables.tf` around lines 383 - 416, Add a root-level validation block to
variable "machine_pools" that checks each map entry's
aws_node_pool.node_drain_grace_period (when present) is between 0 and 10080;
implement the condition using a for-expression over var.machine_pools and
try(...) to allow missing values (e.g., require either
try(entry.value.aws_node_pool.node_drain_grace_period, null) == null or the
numeric range check), and provide a clear error_message mentioning machine_pools
and aws_node_pool.node_drain_grace_period so users get immediate validation
before module instantiation.

🧹 Nitpick comments (1)

modules/machine-pool/tests/aws_node_pool.tftest.hcl (1)

157-235: ⚡ Quick win

Consider adding boundary value tests for comprehensive coverage.

The current tests cover invalid (>10080), valid (60), and null cases. Consider adding tests for the boundary values (0, 10080) and a negative value (-1) to ensure the validation correctly handles edge cases.

🧪 Suggested additional test cases

# Test minimum boundary value (0 minutes)
run "node_drain_grace_period_zero_plan" {
  command = plan

  providers = {
    rhcs = rhcs.no_override
  }

  variables {
    cluster_id        = "fake-cluster-123"
    name              = "test-pool"
    subnet_id         = "subnet-fake123"
    openshift_version = "4.15.0"

    aws_node_pool = {
      instance_type           = "m5.xlarge"
      tags                    = {}
      node_drain_grace_period = 0
    }
  }

  assert {
    condition     = var.aws_node_pool.node_drain_grace_period == 0
    error_message = "Expected node_drain_grace_period to be 0."
  }
}

# Test maximum boundary value (10080 minutes)
run "node_drain_grace_period_max_plan" {
  command = plan

  providers = {
    rhcs = rhcs.no_override
  }

  variables {
    cluster_id        = "fake-cluster-123"
    name              = "test-pool"
    subnet_id         = "subnet-fake123"
    openshift_version = "4.15.0"

    aws_node_pool = {
      instance_type           = "m5.xlarge"
      tags                    = {}
      node_drain_grace_period = 10080
    }
  }

  assert {
    condition     = var.aws_node_pool.node_drain_grace_period == 10080
    error_message = "Expected node_drain_grace_period to be 10080."
  }
}

# Test negative value fails validation
run "negative_node_drain_grace_period_fails" {
  command = plan

  providers = {
    rhcs = rhcs.no_override
  }

  variables {
    cluster_id        = "fake-cluster-123"
    name              = "test-pool"
    subnet_id         = "subnet-fake123"
    openshift_version = "4.15.0"

    aws_node_pool = {
      instance_type           = "m5.xlarge"
      tags                    = {}
      node_drain_grace_period = -1
    }
  }

  expect_failures = [
    var.aws_node_pool,
  ]
}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@modules/machine-pool/tests/aws_node_pool.tftest.hcl` around lines 157 - 235,
Add boundary and negative-value tests for aws_node_pool.node_drain_grace_period:
create new tftest runs similar to existing ones using run names like
"node_drain_grace_period_zero_plan", "node_drain_grace_period_max_plan", and
"negative_node_drain_grace_period_fails"; for the zero and max cases set
aws_node_pool.node_drain_grace_period to 0 and 10080 respectively and add
asserts that the variable equals those values, and for the negative case set it
to -1 and include expect_failures referencing var.aws_node_pool to ensure
validation rejects negative values.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@variables.tf`:
- Line 415: Update the machine_pools variable description to document the new
node_drain_grace_period field: state that node_drain_grace_period is specified
in minutes and must be within the range 0–10080, indicate whether 0 disables
graceful drain, and add provider/version requirements (RHCS provider >= 1.7.6)
plus the minimum OpenShift version needed for this feature; reference the
machine_pools variable and the node_drain_grace_period field in the description
so users get validation guidance and required versions.
- Around line 383-416: Add a root-level validation block to variable
"machine_pools" that checks each map entry's
aws_node_pool.node_drain_grace_period (when present) is between 0 and 10080;
implement the condition using a for-expression over var.machine_pools and
try(...) to allow missing values (e.g., require either
try(entry.value.aws_node_pool.node_drain_grace_period, null) == null or the
numeric range check), and provide a clear error_message mentioning machine_pools
and aws_node_pool.node_drain_grace_period so users get immediate validation
before module instantiation.

---

Nitpick comments:
In `@modules/machine-pool/tests/aws_node_pool.tftest.hcl`:
- Around line 157-235: Add boundary and negative-value tests for
aws_node_pool.node_drain_grace_period: create new tftest runs similar to
existing ones using run names like "node_drain_grace_period_zero_plan",
"node_drain_grace_period_max_plan", and
"negative_node_drain_grace_period_fails"; for the zero and max cases set
aws_node_pool.node_drain_grace_period to 0 and 10080 respectively and add
asserts that the variable equals those values, and for the negative case set it
to -1 and include expect_failures referencing var.aws_node_pool to ensure
validation rejects negative values.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 0e7a0434-1cf8-4b44-aa8b-195ec0e02860

📥 Commits

Reviewing files that changed from the base of the PR and between 6a843a1 and 603866d.

📒 Files selected for processing (5)

README.md
modules/machine-pool/README.md
modules/machine-pool/tests/aws_node_pool.tftest.hcl
modules/machine-pool/variables.tf
variables.tf

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@variables.tf`:
- Line 394: Remove the unsupported variable node_drain_grace_period from
variables.tf and any references to it in the module configuration and calls
(e.g., where it would be passed into rhcs_hcp_machine_pool); the
rhcs_hcp_machine_pool resource in terraform-redhat/rhcs v1.7.6 does not support
node_drain_grace_period, so delete the variable declaration and any usages to
match the upstream provider schema.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: d1c1b21f-cf6e-46b9-8537-29fc7eeba296

📥 Commits

Reviewing files that changed from the base of the PR and between 603866d and b5bdcd0.

📒 Files selected for processing (5)

README.md
modules/machine-pool/README.md
modules/machine-pool/tests/aws_node_pool.tftest.hcl
modules/machine-pool/variables.tf
variables.tf

✅ Files skipped from review due to trivial changes (2)

README.md
modules/machine-pool/tests/aws_node_pool.tftest.hcl

🚧 Files skipped from review as they are similar to previous changes (2)

modules/machine-pool/variables.tf
modules/machine-pool/README.md

amandahla · 2026-06-09T14:38:28Z

@coderabbitai review the changes

coderabbitai · 2026-06-09T14:39:52Z


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "=== Root variables.tf ===" 
cat variables.tf

echo ""
echo "=== modules/machine-pool/variables.tf ==="
cat modules/machine-pool/variables.tf

echo ""
echo "=== modules/machine-pool/main.tf ==="
cat modules/machine-pool/main.tf

echo ""
echo "=== modules/machine-pool/tests/aws_node_pool.tftest.hcl ==="
cat modules/machine-pool/tests/aws_node_pool.tftest.hcl

echo ""
echo "=== modules/machine-pool/README.md (node_drain section) ==="
grep -n "node_drain" modules/machine-pool/README.md

echo ""
echo "=== README.md (node_drain section) ==="
grep -n "node_drain" README.md

Length of output: 36546

@amandahla: Here is my review of the PR changes.

Review: OCM-24713 — `node_drain_grace_period` attribute

Overall

The implementation is clean and well-structured. The new attribute is correctly declared, validated, and threaded through (via the whole-object pass-through aws_node_pool = var.aws_node_pool in main.tf). The three new test cases cover the key scenarios. A few minor observations below.

`modules/machine-pool/variables.tf` & `variables.tf`

1. optional(number) accepts fractional values

Terraform's number type accepts floats (e.g., 60.5). The attribute semantics are "integer minutes," but there is no validation guard against a non-integer value like 1440.5. If the RHCS provider's API enforces integer-only values, Terraform would accept the input and then fail at apply time with a provider-level error instead of a clear validation message.

Consider adding a floor/ceiling check:

validation {
  condition = var.aws_node_pool.node_drain_grace_period == null ? true : (
    var.aws_node_pool.node_drain_grace_period >= 0 &&
    var.aws_node_pool.node_drain_grace_period <= 10080 &&
    floor(var.aws_node_pool.node_drain_grace_period) == var.aws_node_pool.node_drain_grace_period
  )
  error_message = "node_drain_grace_period must be a whole number between 0 and 10080 minutes (7 days)."
}

The same applies to the root variables.tf validation block.

`modules/machine-pool/tests/aws_node_pool.tftest.hcl`

2. No boundary-value tests (0 and 10080)

The valid_node_drain_grace_period_plan run tests 60, which is well within range. It would be good practice to also test the inclusive boundaries 0 (minimum) and 10080 (maximum) to confirm the >=/<= operators are correctly anchored. A single extra run with 0 is the most critical edge case (a "disable drain" semantic).

3. No test for a negative value

A test with e.g. node_drain_grace_period = -1 with expect_failures = [var.aws_node_pool] would explicitly document that negative inputs are rejected. This is especially useful given that 0 carries valid meaning (immediate eviction).

4. Inconsistent provider alias between valid_node_drain_grace_period_plan and node_drain_grace_period_null_plan

valid_node_drain_grace_period_plan uses rhcs.no_override while node_drain_grace_period_null_plan uses rhcs.with_override. The reason is sensible (a null computed field needs the mock default to be assertable), but a short inline comment would make the intent clear for future contributors:

# Uses with_override so the computed aws_node_pool object is fully known at plan time,
# allowing the null assertion to be evaluated.
run "node_drain_grace_period_null_plan" {

`modules/machine-pool/main.tf`

No changes needed — node_drain_grace_period flows automatically because aws_node_pool is passed as a whole object.

Documentation

Both README.md and modules/machine-pool/README.md are correctly regenerated and show the new node_drain_grace_period = optional(number) field.

Summary

Severity	Item
⚠️ Minor	`number` type allows floats; consider adding an integer guard in both validation blocks
💡 Suggestion	Add boundary tests for `0` and `10080`
💡 Suggestion	Add a negative-value rejection test
💡 Suggestion	Add a comment explaining why `with_override` is used for the null test

amandahla · 2026-06-09T16:10:40Z

/retest-required

amandahla · 2026-06-09T16:12:01Z

Summary

Severity	Item	Response
⚠️ Minor	`number` type allows floats; consider adding an integer guard in both validation blocks	To be addressed by the provider.
💡 Suggestion	Add boundary tests for `0` and `10080`	Nitpicking
💡 Suggestion	Add a negative-value rejection test	Nitpicking
💡 Suggestion	Add a comment explaining why `with_override` is used for the null test	Nitpicking

olucasfreitas · 2026-06-09T17:05:41Z

The provider already models this as Int64, but the module still accepts 60.5 and only rejects it downstream in provider validation. Since Jira and the provider docs describe whole minutes, I’d still consider a simple integer guard worthwhile if we want the module’s fail-fast behavior to match the public contract.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: Amanda Hager Lopes de Andrade Katz <amanda.katz@redhat.com>

amandahla · 2026-06-09T19:03:44Z

The provider already models this as Int64, but the module still accepts 60.5 and only rejects it downstream in provider validation. Since Jira and the provider docs describe whole minutes, I’d still consider a simple integer guard worthwhile if we want the module’s fail-fast behavior to match the public contract.

Good catch. I added guards for integer and non-negative values.
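
A sketch of how separate integer and non-negative guards with distinct error messages might look. The object type is trimmed for illustration and the message wording is an assumption; the PR's actual validation blocks are not shown in this thread.

```hcl
# Sketch only: type trimmed to the field under discussion plus instance_type.
variable "aws_node_pool" {
  type = object({
    instance_type           = string
    node_drain_grace_period = optional(number)
  })

  # Integer guard: Terraform's number type would otherwise accept 60.5.
  validation {
    condition = var.aws_node_pool.node_drain_grace_period == null ? true : (
      floor(var.aws_node_pool.node_drain_grace_period) == var.aws_node_pool.node_drain_grace_period
    )
    error_message = "node_drain_grace_period must be a whole number of minutes."
  }

  # Non-negative guard: 0 is allowed, negative values are not.
  validation {
    condition = var.aws_node_pool.node_drain_grace_period == null ? true : (
      var.aws_node_pool.node_drain_grace_period >= 0
    )
    error_message = "node_drain_grace_period must not be negative."
  }
}
```

Keeping the guards in separate validation blocks lets each failure report its own message, so a user who passes 60.5 sees the integer message rather than a generic range error.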

coderabbitai

♻️ Duplicate comments (1)

variables.tf (1)

389-422: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add root fail-fast validation for machine_pools[*].aws_node_pool.node_drain_grace_period.

Line 400 widens the public API, but this root variable block still has no validation for nullable-safe bounds (and whole-minute values). That pushes invalid inputs to downstream/module failures instead of failing at the root boundary.

Suggested patch

 variable "machine_pools" {
   type = map(object({
@@
   default     = {}
   description = "Provides a typed map to add multiple machine pools after cluster creation. Each key is an arbitrary label; each value aligns with the [machine-pool](./modules/machine-pool) submodule (required: name, subnet_id, openshift_version, aws_node_pool). Optional fields match that module's optional inputs; omit autoscaling to use a fixed replica count with autoscaling disabled."
+
+  validation {
+    condition = alltrue([
+      for _, mp in var.machine_pools :
+      mp.aws_node_pool.node_drain_grace_period == null ? true : (
+        mp.aws_node_pool.node_drain_grace_period >= 0 &&
+        mp.aws_node_pool.node_drain_grace_period <= 10080 &&
+        mp.aws_node_pool.node_drain_grace_period == floor(mp.aws_node_pool.node_drain_grace_period)
+      )
+    ])
+    error_message = "Each machine_pools.aws_node_pool.node_drain_grace_period must be null or a whole number between 0 and 10080 (minutes)."
+  }
 }

As per coding guidelines, “variables.tf: Add root validation blocks for cross-field map validation rules that users hit early; child modules may keep lifecycle precondition as a second line of defense” and “Variable validation with nullable values: ... use short-circuiting ... when the value is non-null.”

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@variables.tf` around lines 389 - 422, Add a validation block to the root
variable "machine_pools" that iterates its map values and enforces nullable-safe
bounds and whole-minute semantics for aws_node_pool.node_drain_grace_period: for
each entry (refer to variable "machine_pools" and field
aws_node_pool.node_drain_grace_period) short-circuit when the value is null,
otherwise require it to be a whole number between 0 and 10080 (e.g., using
alltrue([for _, mp in var.machine_pools :
mp.aws_node_pool.node_drain_grace_period == null ? true :
(mp.aws_node_pool.node_drain_grace_period >= 0 &&
mp.aws_node_pool.node_drain_grace_period <= 10080 &&
mp.aws_node_pool.node_drain_grace_period ==
floor(mp.aws_node_pool.node_drain_grace_period))])); include a clear validation
error_message describing the allowed values.

Source: Coding guidelines

🧹 Nitpick comments (1)

Makefile (1)
71-71: 💤 Low value

Consider updating the comment to reflect the new execution order.

The comment lists checks in the old order ("verify, verify-gen, lint, unit-tests, license-check, docs-lint") while the code now executes them as: license-check → docs-lint → verify-gen → unit-tests → lint → verify. Updating the comment to match execution order would improve clarity for future maintainers.
📝 Suggested comment update
-# Merge gate: verify, verify-gen, lint, unit-tests, license-check, docs-lint (fail-fast).
+# Merge gate: license-check, docs-lint, verify-gen, unit-tests, lint, verify (fail-fast, ordered by speed).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Makefile` at line 71, Update the merge gate comment text on the Makefile line
that currently reads "verify, verify-gen, lint, unit-tests, license-check,
docs-lint (fail-fast)" so it matches the actual execution order: "license-check
→ docs-lint → verify-gen → unit-tests → lint → verify (fail-fast)"; edit the
comment string in the same location to replace the old sequence with the new
sequence exactly as shown so future maintainers see the correct order.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@variables.tf`:
- Around line 389-422: Add a validation block to the root variable
"machine_pools" that iterates its map values and enforces nullable-safe bounds
and whole-minute semantics for aws_node_pool.node_drain_grace_period: for each
entry (refer to variable "machine_pools" and field
aws_node_pool.node_drain_grace_period) short-circuit when the value is null,
otherwise require it to be a whole number between 0 and 10080 (e.g., using
alltrue([for _, mp in var.machine_pools :
mp.aws_node_pool.node_drain_grace_period == null ? true :
(mp.aws_node_pool.node_drain_grace_period >= 0 &&
mp.aws_node_pool.node_drain_grace_period <= 10080 &&
mp.aws_node_pool.node_drain_grace_period ==
floor(mp.aws_node_pool.node_drain_grace_period))])); include a clear validation
error_message describing the allowed values.

---

Nitpick comments:
In `@Makefile`:
- Line 71: Update the merge gate comment text on the Makefile line that
currently reads "verify, verify-gen, lint, unit-tests, license-check, docs-lint
(fail-fast)" so it matches the actual execution order: "license-check →
docs-lint → verify-gen → unit-tests → lint → verify (fail-fast)"; edit the
comment string in the same location to replace the old sequence with the new
sequence exactly as shown so future maintainers see the correct order.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 10d68e85-af56-48b4-b6c6-44e2f105f47f

📥 Commits

Reviewing files that changed from the base of the PR and between 22af8ce and a7a0d4a.

📒 Files selected for processing (7)

Makefile
README.md
modules/machine-pool/README.md
modules/machine-pool/tests/aws_node_pool.tftest.hcl
modules/machine-pool/variables.tf
modules/machine-pool/versions.tf
variables.tf

✅ Files skipped from review due to trivial changes (1)

README.md

🚧 Files skipped from review as they are similar to previous changes (4)

modules/machine-pool/versions.tf
modules/machine-pool/variables.tf
modules/machine-pool/tests/aws_node_pool.tftest.hcl
modules/machine-pool/README.md

olucasfreitas · 2026-06-09T21:40:52Z

/lgtm
/approve

openshift-ci · 2026-06-09T21:40:59Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: amandahla, olucasfreitas

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [amandahla,olucasfreitas]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

amandahla · 2026-06-10T14:38:40Z

/retest-required

amandahla · 2026-06-10T16:32:25Z

/retest-required

amandahla · 2026-06-10T19:41:30Z

/override ci/prow/rosa-hcp-private
/override ci/prow/rosa-hcp-public

Failed due to the error "The vpc 'vpc-02763f495de8418b5' has dependencies and cannot be deleted.", which is not related to this change.

openshift-ci · 2026-06-10T19:41:36Z

@amandahla: Overrode contexts on behalf of amandahla: ci/prow/rosa-hcp-private, ci/prow/rosa-hcp-public

Details

In response to this:

/override ci/prow/rosa-hcp-private
/override ci/prow/rosa-hcp-public

Failed due to the error "The vpc 'vpc-02763f495de8418b5' has dependencies and cannot be deleted.", which is not related to this change.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

openshift-ci Bot requested review from davidleerh and gdbranco May 15, 2026 19:32

openshift-ci Bot added the dco-signoff: yes label May 15, 2026

openshift-ci Bot added the approved label May 15, 2026

coderabbitai Bot reviewed May 15, 2026

amandahla marked this pull request as draft May 15, 2026 19:39

openshift-ci Bot added the do-not-merge/work-in-progress label May 15, 2026

amandahla force-pushed the OCM-24713-node-drain-grace-period branch from 603866d to b5bdcd0 Compare May 18, 2026 16:54

coderabbitai Bot reviewed May 18, 2026

Comment thread variables.tf

amandahla force-pushed the OCM-24713-node-drain-grace-period branch from b5bdcd0 to 34eadb4 Compare June 8, 2026 18:00

amandahla marked this pull request as ready for review June 8, 2026 18:02

openshift-ci Bot removed the do-not-merge/work-in-progress label Jun 8, 2026

olucasfreitas reviewed Jun 8, 2026

Comment thread variables.tf

olucasfreitas reviewed Jun 8, 2026

Comment thread versions.tf

olucasfreitas reviewed Jun 8, 2026

Comment thread modules/machine-pool/tests/aws_node_pool.tftest.hcl Outdated

amandahla force-pushed the OCM-24713-node-drain-grace-period branch from 34eadb4 to 22af8ce Compare June 9, 2026 14:36

olucasfreitas reviewed Jun 9, 2026

Comment thread tests/machine_pools.tftest.hcl Outdated

OCM-24713 | feat(machine-pool): add node_drain_grace_period attribute

a7a0d4a

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Amanda Hager Lopes de Andrade Katz <amanda.katz@redhat.com>

amandahla force-pushed the OCM-24713-node-drain-grace-period branch from 22af8ce to a7a0d4a Compare June 9, 2026 19:02

coderabbitai Bot reviewed Jun 9, 2026

amandahla requested a review from olucasfreitas June 9, 2026 20:49

openshift-ci Bot assigned olucasfreitas Jun 9, 2026

openshift-ci Bot added the lgtm label Jun 9, 2026

openshift-merge-bot Bot merged commit 8f49326 into terraform-redhat:main Jun 10, 2026
12 checks passed
