azurerm_firewall_policy_rule_collection_group - use *WithoutTimeout CRUD to prevent timeout during mutex wait #32094
Open
sanderaernouts wants to merge 1 commit into hashicorp:main
… CRUD to prevent timeout during mutex wait
Azure enforces serial processing on firewall policy rule collection groups
within the same policy. Concurrent requests receive HTTP 409
AnotherOperationInProgress. The provider already serializes these
operations with locks.ByName, but the SDK-managed timeout context started
before lock acquisition. Time spent waiting on the mutex consumed the
30-minute timeout budget, causing later-queued operations to fail with
context.DeadlineExceeded.
Switch from Create/Read/Update/Delete to CreateWithoutTimeout/
ReadWithoutTimeout/UpdateWithoutTimeout/DeleteWithoutTimeout so the
resource controls its own timeout context. The lock is now acquired first,
and the timeout starts only after the lock is held.
This is the pattern the SDK recommends for cross-resource synchronization:
"there are cases where operation synchronization across concurrent
resources is necessary in the resource logic, such as a mutex, to
prevent remote system errors. Since these operations would have an
indeterminate timeout that scales with the number of resources, this
allows resources to control timeout behavior."
-- terraform-plugin-sdk CreateWithoutTimeout documentation
https://pkg.go.dev/github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema#Resource.CreateWithoutTimeout
https://github.com/hashicorp/terraform-plugin-sdk/blob/main/helper/schema/resource.go#L381-L413
Summary
Azure enforces serial processing on firewall policy rule collection groups within the same policy. Concurrent requests receive HTTP 409 AnotherOperationInProgress. The provider already serializes these operations with locks.ByName, but the SDK-managed timeout context started before lock acquisition. Time spent waiting on the mutex consumed the 30-minute budget, causing later-queued operations to fail with context.DeadlineExceeded.
This switches from Create/Read/Update/Delete to CreateWithoutTimeout/ReadWithoutTimeout/UpdateWithoutTimeout/DeleteWithoutTimeout so the resource controls its own timeout context. The lock is acquired first; the timeout starts only after the lock is held.
This is the pattern the SDK recommends for cross-resource synchronization, per the terraform-plugin-sdk CreateWithoutTimeout documentation quoted in the commit message.
Changes
- Use *WithoutTimeout variants instead of the legacy Create/Read/Update/Delete fields
- CreateUpdate and Delete: acquire mutex before creating the timeout context
- Handler signatures: func(ctx context.Context, d *pluginsdk.ResourceData, meta interface{}) diag.Diagnostics
- Errors are returned via diag.FromErr()
Supersedes
This replaces the approach in #32081, which moved the lock before the timeout but kept the legacy CRUD registration. Maintainers correctly pointed out that the timeout must cover the full operation. The *WithoutTimeout pattern achieves both: the timeout covers the full API operation, and lock wait time does not eat into the timeout budget.