fix(#2017): unenroll repos during uninstall via repo-maintenance workflow #2020
…flow

Previously, EnrollmentLayer.Uninstall() was a no-op: enrolled repos kept their shim workflows and .fullsend references after running fullsend admin uninstall. This left stale agent configuration in repos when users later reinstalled with a different set of agents.

The fix makes two changes:

1. runUninstall() now extracts enabled repos from config.yaml and passes them as disabledRepos to the EnrollmentLayer.
2. EnrollmentLayer.Uninstall() now updates config.yaml to mark all repos as disabled, then dispatches the repo-maintenance workflow to create unenrollment PRs that remove shim workflows from each enrolled repo. Errors are non-fatal: the uninstall continues and the user is informed of any repos needing manual cleanup.

The unenrollment runs before ConfigRepoLayer deletes the .fullsend repo (layers uninstall in reverse order), so the workflow is still available to dispatch.

Note: pre-commit could not run in the sandbox (shellcheck hook failed to install due to network restrictions). The post-script runs pre-commit authoritatively on the runner.

Closes #2017
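The config side of the two changes above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `OrgConfig`, `RepoEntry`, and `DisableAll` are hypothetical stand-ins, and only the idea of an `EnabledRepos()` accessor comes from the review discussion.

```go
package main

import "sort"

// RepoEntry and OrgConfig are hypothetical stand-ins for the real
// config.yaml types; the actual schema is not shown in this PR page.
type RepoEntry struct {
	Enabled bool
}

type OrgConfig struct {
	Repos map[string]RepoEntry
}

// EnabledRepos returns the names of repos with Enabled set, sorted so the
// result is deterministic.
func (c *OrgConfig) EnabledRepos() []string {
	var names []string
	for name, entry := range c.Repos {
		if entry.Enabled {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// DisableAll marks every repo as disabled and returns the repos that were
// enabled beforehand, i.e. the ones that would need an unenrollment PR.
func (c *OrgConfig) DisableAll() []string {
	toUnenroll := c.EnabledRepos()
	for name, entry := range c.Repos {
		entry.Enabled = false
		c.Repos[name] = entry // updating existing keys while ranging is safe
	}
	return toUnenroll
}
```

Capturing the enabled set before flipping the flags matters here: once every repo is disabled, the config alone no longer says which repos still carry shim workflows.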
Site preview: https://74789b30-site.fullsend-ai.workers.dev
ralphbean left a comment
I think this needs a small change before we merge. See inline comments.
Replace the manual loop over disabledRepos with a call to the existing reportReconciliationPRs method, which already iterates both enabledRepos and disabledRepos with the correct PR titles. This avoids duplicating the title string that must match UNENROLL_PR_TITLE in reconcile-repos.sh.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
🤖 Review · Started 9:09 PM UTC
The comment claimed only ConfigRepoLayer matters for uninstall since other layers are no-ops. This is no longer true now that EnrollmentLayer.Uninstall() does real work.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ralph Bean <rbean@redhat.com>
🤖 Finished Review · ✅ Success · Started 9:18 PM UTC · Completed 9:30 PM UTC
Review
Findings
High
Medium
Low
Info
    if parsedCfg, parseErr := config.ParseOrgConfig(cfgData); parseErr == nil {
        for _, agent := range parsedCfg.Agents {
            agentSlugs = append(agentSlugs, agent.Slug)
        }
[high] architectural-violation
The change passes enrolledRepos as the disabledRepos parameter to NewEnrollmentLayer during uninstall, but the approved design specification (docs/superpowers/specs/2026-04-18-enrollment-reconciliation-design.md, line 69) explicitly states that runUninstall should pass nil for disabledRepos. The design spec should be updated before or alongside the implementation change.
Suggested fix: Update the design spec to document the rationale for passing enrolled repos during uninstall, then implement the code change. Alternatively, follow the existing spec and pass nil.
    // ConfigRepoLayer deletes the .fullsend repo (layers uninstall in
    // reverse order), so the workflow is still available to dispatch.
    //
    // Errors during unenrollment are non-fatal — the user is informed but
[high] scope-creep
The approved design spec (line 62) states Uninstall stays a no-op and the alternatives-considered section (line 100) explicitly rejected Go-side unenrollment in EnrollmentLayer. This PR implements the rejected approach. The approach still dispatches the workflow rather than doing Go-side unenrollment directly, which partially addresses the rejection reason.
Suggested fix: Either revert EnrollmentLayer.Uninstall() to remain a no-op, or update the design spec first to document why the original decision should be reversed.
    l.ui.StepDone("Disabled all repos in config")

    // Dispatch repo-maintenance to create unenrollment PRs.
    dispatchTime := time.Now().UTC().Add(-30 * time.Second)
[medium] race condition / ordering
If awaitWorkflowRun times out (~3 minutes), the method logs a warning and returns nil. The workflow may still be running when ConfigRepoLayer.Uninstall subsequently deletes the .fullsend repo, killing the in-progress workflow run and preventing unenrollment PRs from being created.
@@ -1638,13 +1638,15 @@ func runUninstall(ctx context.Context, client forge.Client, printer *ui.Printer,
    // apps that block reinstallation (PEM keys are one-shot).
    var agentSlugs []string
    var configMode string
[medium] intent-misalignment
The variable enrolledRepos is extracted via EnabledRepos() (repos with enabled: true) but passed as the disabledRepos parameter. The semantic inversion is not documented.
    // ConfigRepoLayer deletes the .fullsend repo (layers uninstall in
    // reverse order), so the workflow is still available to dispatch.
    //
    // Errors during unenrollment are non-fatal — the user is informed but
[medium] coherence-drift
The Uninstall method always returns nil, swallowing all errors as warnings. This bypasses the error-collection mechanism in Stack.UninstallAll(). The user is informed via StepWarn, but the error is not propagated to the caller.
 }

-func TestEnrollmentLayer_Uninstall_Noop(t *testing.T) {
+func TestEnrollmentLayer_Uninstall_NoRepos(t *testing.T) {
[low] test adequacy
No test covers the case where enabledRepos is non-empty but disabledRepos is empty (layer constructed with enabled repos but no disabled repos during uninstall).
🤖 Finished Retro · ✅ Success · Started 4:26 PM UTC · Completed 4:34 PM UTC
Retro: PR #2020 —
Previously, EnrollmentLayer.Uninstall() was a no-op — enrolled repos kept their shim workflows and .fullsend references after running fullsend admin uninstall. This left stale agent configuration in repos when users later reinstalled with a different set of agents.
The fix makes two changes:
1. runUninstall() now extracts enabled repos from config.yaml and passes them as disabledRepos to the EnrollmentLayer.
2. EnrollmentLayer.Uninstall() now updates config.yaml to mark all repos as disabled, then dispatches the repo-maintenance workflow to create unenrollment PRs that remove shim workflows from each enrolled repo. Errors are non-fatal: the uninstall continues and the user is informed of any repos needing manual cleanup.
The unenrollment runs before ConfigRepoLayer deletes the .fullsend repo (layers uninstall in reverse order), so the workflow is still available to dispatch.
Note: pre-commit could not run in the sandbox (shellcheck hook failed to install due to network restrictions). The post-script runs pre-commit authoritatively on the runner.
Closes #2017
Post-script verification
agent/2017-uninstall-cleanup-enrollment)
5faa79f06ad8dbac4b7f08aa7a6ff79772c74552..HEAD)