@rshade rshade commented Oct 8, 2025

Summary

Fixed TeamStackPermission and TeamEnvironmentPermission resources to gracefully handle scenarios where teams are deleted externally (via SCIM/SSO) by treating 404 responses as resource deletions rather than fatal errors.

Problem

Organizations using SCIM/SSO for team management encountered unrecoverable failures when external identity providers deleted or renamed teams:

  • pulumi refresh and pulumi up failed with "404 API error: Not Found: Team not found"
  • Resources became stuck in state, requiring a manual pulumi state delete for each of potentially hundreds of permission resources
  • Replace operations could create "shadow" permissions due to create-before-delete ordering

Changes

API Client Layer (provider/pkg/pulumiapi/teams.go)

  • GetTeamStackPermission: Added 404 status check to return (nil, nil) when team doesn't exist (lines 377-381)
  • GetTeamEnvironmentSettings: Added 404 status check to return (nil, nil, nil) when team doesn't exist (lines 466-470)
  • Pattern follows existing GetTeam() method which already handled 404s gracefully
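
The 404 handling described above can be sketched roughly as follows. This is a minimal illustration, not the code in teams.go: `getStackPermission`, `stackPermission`, `newFakeAPI`, the URL paths, and the permission value 101 are all hypothetical names and values chosen for the example, and a local `httptest` server stands in for the Pulumi Cloud API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// stackPermission is a hypothetical stand-in for the provider's permission type.
type stackPermission struct {
	Permission int `json:"permission"`
}

// getStackPermission is an illustrative helper, not the provider's real method.
// It returns (nil, nil) when the API answers 404, so callers can treat a
// missing team as "resource gone" rather than as a failure.
func getStackPermission(client *http.Client, url string) (*stackPermission, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, fmt.Errorf("requesting permission: %w", err)
	}
	defer resp.Body.Close()

	if resp.StatusCode == http.StatusNotFound {
		// Team (and therefore the permission) no longer exists.
		return nil, nil
	}
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}

	var perm stackPermission
	if err := json.NewDecoder(resp.Body).Decode(&perm); err != nil {
		return nil, fmt.Errorf("decoding permission: %w", err)
	}
	return &perm, nil
}

// newFakeAPI starts a test server that answers 404 for /missing-team and
// a small JSON document for every other path (assumed shape, for illustration).
func newFakeAPI() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/missing-team" {
			http.Error(w, "Not Found: Team not found", http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"permission": 101}`)
	}))
}

func main() {
	srv := newFakeAPI()
	defer srv.Close()

	perm, err := getStackPermission(srv.Client(), srv.URL+"/missing-team")
	fmt.Println(perm == nil, err == nil) // prints "true true"
}
```

Returning `(nil, nil)` keeps "not found" distinct from transport or server errors, which still surface as non-nil errors to the caller.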

Resource Layer (provider/pkg/resources/team_stack_perm.go)

  • TeamStackPermission.Diff(): Added DeleteBeforeReplace: true flag (line 152) to prevent race conditions during replace operations
  • Matches existing pattern in TeamEnvironmentPermission resource
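
To illustrate what the flag changes, here is a toy model of replacement ordering. It is a sketch under stated assumptions: `diffResult` and `replaceSteps` are hypothetical and only mimic the idea that a replacement defaults to create-then-delete unless the diff asks for delete-before-replace; it is not the engine's or the provider's actual code.

```go
package main

import "fmt"

// diffResult is a hypothetical, trimmed-down stand-in for a provider's
// diff response. Only the two fields relevant to ordering are modeled.
type diffResult struct {
	Replaces            []string // properties whose change forces a replacement
	DeleteBeforeReplace bool     // request delete-then-create ordering
}

// replaceSteps returns the order of operations for a replacement.
// With no replacing properties there is nothing to do.
func replaceSteps(d diffResult) []string {
	if len(d.Replaces) == 0 {
		return nil
	}
	if d.DeleteBeforeReplace {
		return []string{"delete old", "create new"}
	}
	return []string{"create new", "delete old"}
}

func main() {
	d := diffResult{Replaces: []string{"team"}}
	fmt.Println(replaceSteps(d)) // prints "[create new delete old]"

	d.DeleteBeforeReplace = true
	fmt.Println(replaceSteps(d)) // prints "[delete old create new]"
}
```

With delete-first ordering the old permission is gone before the new one is written, so the two never coexist on the service.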

Testing (provider/pkg/pulumiapi/teams_test.go)

  • Added comprehensive test coverage for 404 scenarios:
    • TestGetTeamStackPermission/404_-_Team_not_found (lines 422-436)
    • TestGetTeamEnvironmentSettings/404_-_Team_not_found (lines 523-543)
  • Tests verify graceful handling: nil returns with no error
  • All existing tests continue to pass

Testing Done

✅ All 100+ provider tests pass
✅ New unit tests verify 404 handling behavior
✅ Linting passes across provider, sdk, and examples directories
✅ Resource Read() methods already handle nil responses correctly (verified in existing code)

Impact

  • pulumi refresh now succeeds when teams are deleted externally, removing permissions from state
  • No manual intervention required for team deletions
  • Replace operations complete cleanly without shadow resources
  • Self-healing behavior for SCIM/SSO-managed teams
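
A rough sketch of how a nil result from the API client can turn into removal from state during a refresh. The types and the `read` function here are hypothetical simplifications, assuming the usual provider convention that a read result with an empty ID means the resource no longer exists; the real Read() methods are not reproduced.

```go
package main

import "fmt"

// permission and readResult are hypothetical, simplified types.
type permission struct{ Level int }

type readResult struct {
	ID    string         // empty ID signals "resource no longer exists"
	Props map[string]int // current remote properties, if any
}

// fetchFunc models an API client call that returns (nil, nil) on 404.
type fetchFunc func(id string) (*permission, error)

// read maps a nil permission to an empty result so that a refresh can
// drop the resource from state instead of failing.
func read(id string, fetch fetchFunc) (readResult, error) {
	perm, err := fetch(id)
	if err != nil {
		return readResult{}, err
	}
	if perm == nil {
		return readResult{}, nil
	}
	return readResult{ID: id, Props: map[string]int{"permission": perm.Level}}, nil
}

func main() {
	// Simulate a team that was deleted externally.
	gone := func(string) (*permission, error) { return nil, nil }

	res, err := read("org/team/stack", gone)
	fmt.Println(res.ID == "", err == nil) // prints "true true"
}
```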

Fixes #444

@rshade rshade requested a review from Copilot October 8, 2025 21:25
Copilot AI left a comment

Pull Request Overview

This PR fixes a critical issue where TeamStackPermission and TeamEnvironmentPermission resources would fail when teams are deleted externally through SCIM/SSO systems. The fix enables graceful handling of 404 responses by treating them as resource deletions rather than fatal errors.

Key changes:

  • Modified API client methods to handle 404 responses gracefully by returning nil values instead of errors
  • Added DeleteBeforeReplace: true flag to prevent race conditions during resource replacement operations
  • Added comprehensive test coverage for the new 404 handling behavior

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| provider/pkg/pulumiapi/teams.go | Added 404 status code handling in GetTeamStackPermission and GetTeamEnvironmentSettings methods |
| provider/pkg/resources/team_stack_perm.go | Added DeleteBeforeReplace flag to prevent race conditions during replacements |
| provider/pkg/pulumiapi/teams_test.go | Added comprehensive test coverage for 404 scenarios in both permission methods |
| CHANGELOG.md | Added changelog entry documenting the bug fix |

github-actions bot commented Oct 8, 2025

Does the PR have any schema changes?

Looking good! No breaking changes found.
No new resources/functions.

Maintainer note: consult the runbook for dealing with any breaking changes.

@rshade rshade force-pushed the teamenvpermdelete branch from 04b66d5 to 76262c6 Compare October 13, 2025 14:46
@rshade rshade force-pushed the teamenvpermdelete branch from 76262c6 to cdcc389 Compare October 17, 2025 14:07
Development

Successfully merging this pull request may close these issues.

Deletion of team results in resources that cannot be removed
