Skip to content

Commit 07970a0

Browse files
Antriksh JainCopilot
andcommitted
feat(azure.ai.agents): add doctor check remote.rbac (P5.1 C16)
Adds the third remote doctor check, `remote.rbac`, implementing Check Azure#10 of issue Azure#7975 / Check 10 of the design doc (`azd-ai-agent-doctor-remote-checks.md` lines 159-180): "Developer has required role on Foundry project". The check queries the developer's role assignments on the Foundry project's ARM scope via the Microsoft Graph + ARM stack, then classifies the result: * Pass: "<display> has the required role on project '<acct>/<proj>'" (display name replaced with "the current principal" in the default redacted mode for UPN safety). * Fail: templated `az role assignment create --role "Azure AI User" --assignee <oid> --scope <scope>` + a learn.microsoft.com link to the RBAC concepts page. Principal ID and scope ARN are substituted with shell-safe ALL_CAPS placeholders (OBJECT_ID / PROJECT_SCOPE) in redacted mode — NOT `<redacted>`, because bash/zsh interpret `<word>` as input redirection. * Skip: precondition unmet (no AzdClient, no env, auth Failed, `AZURE_AI_PROJECT_ID` missing/malformed/unset, service-principal token detected, transient Graph/ARM error, cancellation). Architecture ------------ Two new files: internal/project/developer_rbac_query.go (~175 LoC) - QueryDeveloperRBAC: side-effect-free wrapper returning a structured DeveloperRBACResult. Reuses package-private parseAgentIdentityInfo / hasAnyRoleAssignment / sufficientAIUserRoles from developer_rbac_check.go. - ValidateProjectResourceID: shape check for AZURE_AI_PROJECT_ID, returns wrapped ErrInvalidProjectResourceID sentinel. - ErrInvalidProjectResourceID, ErrSPNDelegatedAuthRequired: sentinel errors for diagnostic classification. - Graph /me failure detection: routes app-only/SPN token rejections onto ErrSPNDelegatedAuthRequired (case-insensitive message match) so doctor surfaces a SPN-aware Skip. internal/cmd/doctor/checks_rbac.go (~430 LoC) - newCheckRBAC: skip-cascade + projectID lookup + upfront ValidateProjectResourceID gate + probe dispatch. - classifyRBACProbeError: sentinel-keyed error classification (Canceled / SPN / InvalidProjectID / generic transient). - classifyRBACResult: pure Pass/Fail mapping for diagnostic consumers. - sanitizeScopeARNs: regex-based scope+GUID scrubber for probe error text. - readProjectResourceID: AZURE_AI_PROJECT_ID env lookup via gRPC EnvironmentService. - redactID / redactScope / redactDisplay: centralized placeholder substitution helpers. Two new seams on Dependencies: probeDeveloperRBAC - replaces project.QueryDeveloperRBAC readProjectResourceIDFn - replaces readProjectResourceID Skip-cascade rationale ---------------------- Per design dependency matrix (line 115), `remote.rbac` cascades against `local.environment-selected` + `remote.auth` but NOT `remote.foundry-endpoint`. RBAC reads ARM, not the data plane; a transient DNS / proxy / outage on the data-plane check must not prevent the user from learning their role assignment is missing. `TestCheckRBAC_DoesNotSkipOnFoundryEndpointFail` pins this. Probe errors → Skip (not Fail) to avoid false alarms on transient Graph/ARM hiccups. Cancellation similarly Skips with a clean message rather than rendering an error trace. Review fixes applied -------------------- Following the 3-reviewer pass (Opus xhigh + Sonnet 4.6 + GPT-5.5) of an earlier draft (commit 0c4d5ee), the following findings were addressed: * MEDIUM (Opus + GPT-5.5): probe-error path leaked raw subscription/scope IDs via azcore.ResponseError.Error()'s first line (`GET https://management.azure.com/subscriptions/...`) into Message + Details. Fix: sanitizeScopeARNs regex pass strips ARM scopes + bare GUIDs from Message in redacted mode; Details["probeError"] is OMITTED entirely unless --unredacted. * MEDIUM (Sonnet 4.6): malformed AZURE_AI_PROJECT_ID got "check your network" Suggestion despite no network call. Fix: upfront ValidateProjectResourceID gate runs before the probe; surfaces "is not a valid Foundry project ARM resource ID" with an `azd env set` Suggestion. * MEDIUM (GPT-5.5): PrincipalDisplay rendered verbatim in Message even in default redacted mode; display names can contain UPN fragments (e.g., "Alice (alice@contoso.com)"). Fix: redactedDisplayLabel ("the current principal") substitutes for raw display in default mode; unredacted mode still echoes the real display. * MEDIUM (GPT-5.5): service-principal `azd auth login` cannot use Graph /me — confusing generic transient Skip. Fix: ErrSPNDelegatedAuthRequired sentinel + case-insensitive detection of the canonical "delegated authentication flow" Graph response; doctor surfaces a SPN-aware Skip with user-delegated guidance. * LOW/Nit (Opus): `<redacted>` is a bash input-redirection token; `--assignee <redacted>` would fail with `redacted: No such file or directory` on copy-paste. Fix: shellSafePlaceholderID/Scope constants render `OBJECT_ID` / `PROJECT_SCOPE` in the templated az command. Verified clean (no action): skip-cascade structure, firstLine helper, AZURE_AI_PROJECT_ID provenance (set by init_foundry_resources_helpers.go:327), seam design, account-scope check missing (acceptable per `assignedTo()` inheriting parent- scope assignments). Testing ------- 22 unit tests in checks_rbac_test.go and developer_rbac_query covering: skip cascades, probe-error branches (transient, parse, SPN, defensive sentinel, cancellation), end-to-end probe Pass/ Fail, classifyRBACResult mapping with both redaction modes, display-name fallback, sensitive-identifier leak prevention, shell-safe placeholder substitution, ValidateProjectResourceID shape coverage, sanitizeScopeARNs regex coverage, all three redaction helpers (redactID/Scope/Display) with empty-input + flag permutations. Preflight (from cli/azd/extensions/azure.ai.agents) --------------------------------------------------- gofmt -s -w . - clean go vet ./... - clean go build ./... - clean go test ./... -count=1 - all packages green golangci-lint run ./internal/cmd/... - 0 issues ./internal/project/... npx cspell lint - 0 issues "internal/cmd/doctor/**/*.go" "internal/project/developer_rbac_query.go" Refs: Azure#7975 (PR Azure#8057, Phase 5 / C16) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent b35ebef commit 07970a0

6 files changed

Lines changed: 1441 additions & 18 deletions

File tree

cli/azd/extensions/azure.ai.agents/internal/cmd/doctor/checks_local.go

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,7 @@ import (
1010
"strings"
1111

1212
"azureaiagent/internal/cmd/nextstep"
13+
"azureaiagent/internal/project"
1314

1415
"github.com/azure/azure-dev/cli/azd/pkg/azdext"
1516
"google.golang.org/grpc/codes"
@@ -77,6 +78,31 @@ type Dependencies struct {
7778
// `local.project-endpoint-set` check; production wiring leaves
7879
// this field nil.
7980
probeFoundryEndpoint func(ctx context.Context, endpoint string) foundryProbeResult
81+
82+
// probeDeveloperRBAC is a test seam: when non-nil it replaces
83+
// the production `project.QueryDeveloperRBAC` call inside the
84+
// `remote.rbac` check, letting unit tests cover the Pass / Fail /
85+
// transient-error branches without spinning up Graph + ARM
86+
// fakes. The signature mirrors `project.QueryDeveloperRBAC`
87+
// exactly so the wiring inside `newCheckRBAC` is a single
88+
// `if probe == nil { probe = project.QueryDeveloperRBAC }`
89+
// substitution.
90+
probeDeveloperRBAC func(
91+
ctx context.Context,
92+
azdClient *azdext.AzdClient,
93+
projectResourceID string,
94+
) (*project.DeveloperRBACResult, error)
95+
96+
// readProjectResourceIDFn is a test seam: when non-nil it
97+
// replaces the production `readProjectResourceID` call inside
98+
// the `remote.rbac` check, letting unit tests exercise the
99+
// downstream probe-injection paths (Pass / Fail / cascade)
100+
// without instantiating a real gRPC AzdClient just to read one
101+
// env var. Production wiring leaves this nil.
102+
readProjectResourceIDFn func(
103+
ctx context.Context,
104+
azdClient *azdext.AzdClient,
105+
) (string, error)
80106
}
81107

82108
// NewLocalChecks returns the canonical sequence of local doctor checks

0 commit comments

Comments
 (0)