Commit dec67e2
feat(azure.ai.agents): add doctor check remote.rbac (P5.1 C16)
Adds the third remote doctor check, `remote.rbac`, implementing
Check Azure#10 of issue Azure#7975 / Check 10 of the design doc
(`azd-ai-agent-doctor-remote-checks.md` lines 159-180): "Developer
has required role on Foundry project".
The check queries the developer's role assignments on the Foundry
project's ARM scope via the Microsoft Graph + ARM stack, then
classifies the result:
* Pass: "<display> has the required role on project '<acct>/<proj>'"
(display name replaced with "the current principal" in the
default redacted mode for UPN safety).
* Fail: templated `az role assignment create --role "Azure AI User"
--assignee <oid> --scope <scope>` + a learn.microsoft.com link
to the RBAC concepts page. Principal ID and scope ARN are
substituted with shell-safe ALL_CAPS placeholders
(OBJECT_ID / PROJECT_SCOPE) in redacted mode — NOT `<redacted>`,
because bash/zsh interpret `<word>` as input redirection.
* Skip: precondition unmet (no AzdClient, no env, auth Failed,
`AZURE_AI_PROJECT_ID` missing/malformed/unset, service-principal
token detected, transient Graph/ARM error, cancellation).
Architecture
------------
Two new files:
internal/project/developer_rbac_query.go (~175 LoC)
- QueryDeveloperRBAC: side-effect-free wrapper returning
a structured DeveloperRBACResult. Reuses package-private
parseAgentIdentityInfo / hasAnyRoleAssignment /
sufficientAIUserRoles from developer_rbac_check.go.
- ValidateProjectResourceID: shape check for AZURE_AI_PROJECT_ID,
returns wrapped ErrInvalidProjectResourceID sentinel.
- ErrInvalidProjectResourceID, ErrSPNDelegatedAuthRequired:
sentinel errors for diagnostic classification.
- Graph /me failure detection: routes app-only/SPN token
rejections onto ErrSPNDelegatedAuthRequired (case-insensitive
message match) so doctor surfaces a SPN-aware Skip.
internal/cmd/doctor/checks_rbac.go (~430 LoC)
- newCheckRBAC: skip-cascade + projectID lookup + upfront
ValidateProjectResourceID gate + probe dispatch.
- classifyRBACProbeError: sentinel-keyed error classification
(Canceled / SPN / InvalidProjectID / generic transient).
- classifyRBACResult: pure Pass/Fail mapping for diagnostic
consumers.
- sanitizeScopeARNs: regex-based scope+GUID scrubber for
probe error text.
- readProjectResourceID: AZURE_AI_PROJECT_ID env lookup via
gRPC EnvironmentService.
- redactID / redactScope / redactDisplay: centralized
placeholder substitution helpers.
Two new seams on Dependencies:
probeDeveloperRBAC - replaces project.QueryDeveloperRBAC
readProjectResourceIDFn - replaces readProjectResourceID
Skip-cascade rationale
----------------------
Per design dependency matrix (line 115), `remote.rbac` cascades
against `local.environment-selected` + `remote.auth` but NOT
`remote.foundry-endpoint`. RBAC reads ARM, not the data plane;
a transient DNS / proxy / outage on the data-plane check must not
prevent the user from learning their role assignment is missing.
`TestCheckRBAC_DoesNotSkipOnFoundryEndpointFail` pins this.
Probe errors → Skip (not Fail) to avoid false alarms on transient
Graph/ARM hiccups. Cancellation similarly Skips with a clean
message rather than rendering an error trace.
Review fixes applied
--------------------
Following the 3-reviewer pass (Opus xhigh + Sonnet 4.6 + GPT-5.5)
of an earlier draft (commit 0c4d5ee), the following findings
were addressed:
* MEDIUM (Opus + GPT-5.5): probe-error path leaked raw
subscription/scope IDs via azcore.ResponseError.Error()'s first
line (`GET https://management.azure.com/subscriptions/...`)
into Message + Details. Fix: sanitizeScopeARNs regex pass
strips ARM scopes + bare GUIDs from Message in redacted mode;
Details["probeError"] is OMITTED entirely unless --unredacted.
* MEDIUM (Sonnet 4.6): malformed AZURE_AI_PROJECT_ID got
"check your network" Suggestion despite no network call.
Fix: upfront ValidateProjectResourceID gate runs before the
probe; surfaces "is not a valid Foundry project ARM resource
ID" with an `azd env set` Suggestion.
* MEDIUM (GPT-5.5): PrincipalDisplay rendered verbatim in
Message even in default redacted mode; display names can
contain UPN fragments (e.g., "Alice (alice@contoso.com)").
Fix: redactedDisplayLabel ("the current principal")
substitutes for raw display in default mode; unredacted mode
still echoes the real display.
* MEDIUM (GPT-5.5): service-principal `azd auth login` cannot
use Graph /me — confusing generic transient Skip. Fix:
ErrSPNDelegatedAuthRequired sentinel + case-insensitive
detection of the canonical "delegated authentication flow"
Graph response; doctor surfaces a SPN-aware Skip with
user-delegated guidance.
* LOW/Nit (Opus): `<redacted>` is a bash input-redirection
token; `--assignee <redacted>` would fail with
`redacted: No such file or directory` on copy-paste. Fix:
shellSafePlaceholderID/Scope constants render `OBJECT_ID` /
`PROJECT_SCOPE` in the templated az command.
Verified clean (no action): skip-cascade structure, firstLine
helper, AZURE_AI_PROJECT_ID provenance (set by
init_foundry_resources_helpers.go:327), seam design, account-scope
check missing (acceptable per `assignedTo()` inheriting parent-
scope assignments).
Testing
-------
22 unit tests in checks_rbac_test.go and developer_rbac_query
covering: skip cascades, probe-error branches (transient, parse,
SPN, defensive sentinel, cancellation), end-to-end probe Pass/
Fail, classifyRBACResult mapping with both redaction modes,
display-name fallback, sensitive-identifier leak prevention,
shell-safe placeholder substitution, ValidateProjectResourceID
shape coverage, sanitizeScopeARNs regex coverage, all three
redaction helpers (redactID/Scope/Display) with empty-input + flag
permutations.
Preflight (from cli/azd/extensions/azure.ai.agents)
---------------------------------------------------
gofmt -s -w . - clean
go vet ./... - clean
go build ./... - clean
go test ./... -count=1 - all packages green
golangci-lint run ./internal/cmd/... - 0 issues
./internal/project/...
npx cspell lint - 0 issues
"internal/cmd/doctor/**/*.go"
"internal/project/developer_rbac_query.go"
Refs: Azure#7975 (PR Azure#8057, Phase 5 / C16)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>1 parent ec6aa9e commit dec67e2
6 files changed
Lines changed: 1441 additions & 18 deletions
File tree
- cli/azd/extensions/azure.ai.agents/internal
- cmd/doctor
- project
Lines changed: 26 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
| 13 | + | |
13 | 14 | | |
14 | 15 | | |
15 | 16 | | |
| |||
77 | 78 | | |
78 | 79 | | |
79 | 80 | | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
| 93 | + | |
| 94 | + | |
| 95 | + | |
| 96 | + | |
| 97 | + | |
| 98 | + | |
| 99 | + | |
| 100 | + | |
| 101 | + | |
| 102 | + | |
| 103 | + | |
| 104 | + | |
| 105 | + | |
80 | 106 | | |
81 | 107 | | |
82 | 108 | | |
| |||
0 commit comments