Commit d422395

Antriksh Jain and Copilot committed
feat(azure.ai.agents): wire azd ai agent doctor command
Lands Phase 4.4 of PR Azure#8057: the user-facing `azd ai agent doctor`
command. Wires the existing doctor package (runner + 6 local checks from
4.3 / 4.3.1) into Cobra, adds the text/JSON formatters, and emits a
TTY-gated trailing Next: block when all checks pass.

What the user sees

    $ azd ai agent doctor
    azd ai agent doctor
    ✓ PASS azd extension reachable
      azd extension reachable (version 0.1.29-preview).
    ✓ PASS azure.yaml present and parseable
      azure.yaml parsed (project: <name>).
    [...]
    ✓ PASS agent.yaml valid (per service)
      agent.yaml valid for 1 service(s)

    Summary: 6 passed, 0 failed, 0 skipped, 0 warned

    Next: azd ai agent invoke <agent> '{"message": "Hello!"}'
      ↳ invoke the deployed agent

    $ azd ai agent doctor --output json
    {"schemaVersion":"1.0","remote":false,"redacted":true,"checks":[...]}

    $ azd ai agent doctor --output yaml
    ERROR: invalid --output value "yaml" (must be 'text' or 'json')

Architecture

- Formatters live in the cmd package, not the doctor subpackage. The
  doctor package owns checks + runner only; its types.go package doc
  (lines 17-19) explicitly places Cobra wiring and IO in the parent
  package.
- doctor.go is the Cobra factory + flag-handling layer; runDoctor is the
  testable core (no Cobra ref).
- Exit codes are set by calling os.Exit directly (azdext.Run only emits
  0 / 1; the runner's 3-state contract requires explicit os.Exit(2) for
  the all-skip case).
- Trailing Next: block: only on exit code 0 (passes + no fails), only in
  text output, only on TTY. JSON envelope deliberately excludes it per
  the design spec.
- Branch on services: any IsDeployed → filtered ResolveAfterDeploy;
  else → ResolveAfterInit. Filtering is required because
  ResolveAfterDeploy emits show+invoke for every state.Service
  unconditionally; mixed deployed/undeployed projects would otherwise
  emit broken commands.
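
The three-state exit-code contract described under Architecture can be sketched as below. This is only an illustration: the `Status` type and `exitCode` function are hypothetical stand-ins, not the doctor package's actual `ExitCode`, and the handling of an empty check list (treated as exit 2) and the omission of a warn state are assumptions.

```go
package main

import "fmt"

// Status is a hypothetical check outcome; the real doctor package's
// type names are not shown in this commit.
type Status int

const (
	Pass Status = iota
	Fail
	Skip
)

// exitCode maps check outcomes to the documented contract:
// 1 if any check failed, 0 if at least one passed and none failed,
// 2 otherwise (every check skipped; an empty list is assumed to be 2).
func exitCode(statuses []Status) int {
	passed := false
	for _, s := range statuses {
		switch s {
		case Fail:
			return 1
		case Pass:
			passed = true
		}
	}
	if passed {
		return 0
	}
	return 2
}

func main() {
	fmt.Println(exitCode([]Status{Pass, Skip})) // 0
	fmt.Println(exitCode([]Status{Pass, Fail})) // 1
	fmt.Println(exitCode([]Status{Skip, Skip})) // 2
}
```

Because a Cobra-style runner typically collapses results into "nil error" or "non-nil error", a caller that wants the value 2 to reach the shell has to pass it to `os.Exit` itself, which is the design choice the commit describes.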
What's new

- internal/cmd/doctor.go (+250): Cobra factory, doctorFlags,
  validateDoctorFlags, runDoctor, resolveDoctorTrailing, helpers
  (anyServiceDeployed, filterDeployedServices, doctorCachedPayload,
  doctorReadmeExists).
- internal/cmd/doctor_format.go (+170): renderDoctorReport (output
  dispatcher), printDoctorReportJSON (envelope), printDoctorReportText
  (per-check + summary + trailing Next:), statusGlyphAndLabel
  (✓/✗/!/-/?).
- internal/cmd/doctor_format_test.go (+290): 32 subtests covering JSON
  envelope shape, text rendering for pass/fail/skip mixes, trailing
  block gating, output-flag routing, glyph mapping, flag validation,
  deployed-service filtering.
- internal/cmd/root.go (+1): rootCmd.AddCommand(newDoctorCommand()).
- cli/azd/.vscode/cspell.yaml (+9): file-scoped overrides for doctor.go
  (nextsteps, undeployed) and doctor_format.go (nextsteps, UNKN),
  following the existing per-file override convention.

Pre-flight

✓ gofmt -s clean
✓ go vet clean
✓ go build clean
✓ Full extension cmd tests pass (13.5s)
✓ doctor + nextstep tests pass
✓ golangci-lint 0 issues
✓ cspell 0 issues on new files

Live smoke against the deployed hello-world-python-invocations sample

✓ `azd ai agent doctor` → 6 PASS, exit 0, raw bytes show correct
  \r\n\r\n separator between checks and Summary (PowerShell mojibake on
  ✓ glyph is console-encoding only; the bytes are valid UTF-8)
✓ `azd ai agent doctor --output json` → well-formed envelope with
  schemaVersion 1.0, all 6 checks, no nextStep field
✓ `azd ai agent doctor --output yaml` → exit 1 + clear validation error
✓ `azd ai agent doctor --help` → full help text with three flag
  explanations and exit-code table

Not in scope (deferred to Phase 5)

- --local-only is a no-op; every shipped check is local today. The flag
  is exposed early so the Cobra surface locks without churn when remote
  checks land.
- --unredacted is reserved for the remote-checks pass.
- Trailing Next: cachedPayload / readmeExists closures pull from the
  agent.yaml service path and the .azure/<env> directory; further
  refinements (e.g., README detection priority, cross-platform path
  handling) can land in a follow-up if smoke testing surfaces issues.

References

- PR Azure#8057 design spec section "Phase 4 — doctor command, local
  checks 1–6" + "Doctor output shape" + "Exit codes & JSON output".
- doctor package contract: internal/cmd/doctor/types.go, runner.go
  (ExitCode at lines 171-179).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
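
For scripted consumers of `--output json`, a minimal decoding sketch might look like the following. Only the four top-level keys shown in the sample output above are taken from this commit; the `envelope` type, the `parseEnvelope` helper, and the decision to reject any schemaVersion other than "1.0" are assumptions made for illustration. The per-check fields are elided in the commit message, so they are kept as raw JSON here rather than guessed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope mirrors the top-level keys shown in the commit message.
// Checks stay as raw JSON because the per-check shape is not shown.
type envelope struct {
	SchemaVersion string            `json:"schemaVersion"`
	Remote        bool              `json:"remote"`
	Redacted      bool              `json:"redacted"`
	Checks        []json.RawMessage `json:"checks"`
}

// parseEnvelope decodes the report and rejects unknown schema versions
// (an assumed policy, not something the commit specifies).
func parseEnvelope(data []byte) (envelope, error) {
	var env envelope
	if err := json.Unmarshal(data, &env); err != nil {
		return envelope{}, err
	}
	if env.SchemaVersion != "1.0" {
		return envelope{}, fmt.Errorf("unsupported schemaVersion %q", env.SchemaVersion)
	}
	return env, nil
}

func main() {
	sample := []byte(`{"schemaVersion":"1.0","remote":false,"redacted":true,"checks":[]}`)
	env, err := parseEnvelope(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("schema=%s remote=%t redacted=%t checks=%d\n",
		env.SchemaVersion, env.Remote, env.Redacted, len(env.Checks))
}
```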
1 parent 7567546 commit d422395

5 files changed

Lines changed: 828 additions & 0 deletions

cli/azd/.vscode/cspell.yaml

Lines changed: 8 additions & 0 deletions
@@ -412,6 +412,14 @@ overrides:
   - filename: "**/extensions/azure.ai.agents/internal/cmd/doctor/types.go"
     words:
       - nextsteps
+  - filename: "**/extensions/azure.ai.agents/internal/cmd/doctor.go"
+    words:
+      - nextsteps
+      - undeployed
+  - filename: "**/extensions/azure.ai.agents/internal/cmd/doctor_format.go"
+    words:
+      - nextsteps
+      - UNKN
   - filename: docs/code-coverage-guide.md
     words:
       - covdata
Lines changed: 297 additions & 0 deletions
@@ -0,0 +1,297 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License.
+
+package cmd
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+
+	"azureaiagent/internal/cmd/doctor"
+	"azureaiagent/internal/cmd/nextstep"
+	"azureaiagent/internal/version"
+
+	"github.com/azure/azure-dev/cli/azd/pkg/azdext"
+	"github.com/spf13/cobra"
+)
+
+// doctorFlags are the Cobra-bound flags for `azd ai agent doctor`.
+//
+// localOnly is exposed today as a no-op: every shipped check is local
+// (Phase 4 covers checks 1–6). The Cobra surface is locked early so the
+// Phase 5 follow-up that adds remote checks does not need to introduce
+// the flag in the same commit as the new check implementations.
+//
+// output selects the rendering path: "text" (default, human-readable
+// with a trailing Next: block on success) or "json" (structured envelope
+// for scripted consumers).
+//
+// unredacted is reserved for Phase 5: once remote checks surface
+// principal IDs, scope ARNs, and UPNs, this flag will toggle the
+// redaction layer. It is bound today and threaded into doctor.Options
+// so that callers (and tests) can already exercise the wire without
+// the future Phase 5 fix-up touching the Cobra surface.
+type doctorFlags struct {
+	localOnly  bool
+	output     string
+	unredacted bool
+}
+
+func newDoctorCommand() *cobra.Command {
+	flags := &doctorFlags{}
+
+	cmd := &cobra.Command{
+		Use:   "doctor",
+		Short: "Diagnose problems with an azd ai agent project.",
+		Long: `Diagnose problems with an azd ai agent project.
+
+Runs a sequence of local checks against the current azd project,
+reporting on each one and (when all checks pass) suggesting the next
+command to run. Use this when you have lost terminal context or hit a
+confusing error and want a complete picture of the project's state.
+
+Exit codes:
+  0 — at least one check passed and no checks failed
+  1 — any check failed
+  2 — all checks were skipped (e.g. preconditions unmet)`,
+		Example: `  # Run the full check suite with human-readable output
+  azd ai agent doctor
+
+  # Emit a structured JSON envelope (for scripts / CI)
+  azd ai agent doctor --output json`,
+		Args: cobra.NoArgs,
+		RunE: func(cmd *cobra.Command, args []string) error {
+			if err := validateDoctorFlags(flags); err != nil {
+				return err
+			}
+
+			ctx := azdext.WithAccessToken(cmd.Context())
+			logCleanup := setupDebugLogging(cmd.Flags())
+			defer logCleanup()
+
+			// NewAzdClient errors are not fatal: the gRPC check
+			// (`local.grpc-extension`) surfaces the failure verbatim
+			// to the user, and downstream checks Skip cleanly when
+			// the client is nil. We deliberately do NOT short-circuit
+			// the command here.
+			azdClient, clientErr := azdext.NewAzdClient()
+			if azdClient != nil {
+				defer azdClient.Close()
+			}
+
+			deps := doctor.Dependencies{
+				AzdClient:        azdClient,
+				AzdClientErr:     clientErr,
+				ExtensionVersion: version.Version,
+			}
+
+			opts := doctor.Options{
+				LocalOnly:  flags.localOnly,
+				Unredacted: flags.unredacted,
+			}
+
+			report, trailing := runDoctor(ctx, deps, opts, azdClient)
+			if err := renderDoctorReport(os.Stdout, flags.output, report, trailing); err != nil {
+				return err
+			}
+
+			// Exit codes are part of the doctor contract (see design
+			// `docs/design/azd-ai-agent-nextsteps.md`, "Exit codes &
+			// JSON output"). Cobra/azdext maps a nil return to exit 0
+			// and any non-nil return to exit 1, which collapses our
+			// three-state contract into a two-state one. We call
+			// os.Exit directly to preserve the 0/1/2 distinction.
+			// Note: os.Exit skips deferred calls, so logCleanup and
+			// azdClient.Close do not run on the non-zero exit paths.
+			code := doctor.ExitCode(report)
+			if code == 0 {
+				return nil
+			}
+			os.Exit(code)
+			return nil // unreachable
+		},
+	}
+
+	cmd.Flags().BoolVar(
+		&flags.localOnly, "local-only", false,
+		"Run only local checks (no network calls). "+
+			"All checks are local today; this flag is reserved for an upcoming remote-checks pass.",
+	)
+	cmd.Flags().StringVarP(
+		&flags.output, "output", "o", "text",
+		"Output format (text or json).",
+	)
+	cmd.Flags().BoolVar(
+		&flags.unredacted, "unredacted", false,
+		"Show raw principal IDs, scope ARNs, and UPNs in the report. "+
+			"Reserved for the upcoming remote-checks pass (no-op today).",
+	)
+
+	return cmd
+}
+
+// validateDoctorFlags enforces the closed set of values for --output. We
+// validate before any work so an obvious typo (`--output yaml`) does not
+// run the entire check suite only to print nothing useful.
+func validateDoctorFlags(flags *doctorFlags) error {
+	switch flags.output {
+	case "text", "json":
+		return nil
+	default:
+		return fmt.Errorf("invalid --output value %q (must be 'text' or 'json')", flags.output)
+	}
+}
+
+// runDoctor is the testable core of the doctor command. It constructs a
+// Runner from the configured checks, executes it, and (when the report
+// is clean) resolves a trailing Next: block via the nextstep resolver.
+//
+// The trailing block is computed unconditionally but only rendered by
+// the text formatter; the JSON envelope deliberately excludes it (see
+// design spec, "Exit codes & JSON output"). Computing it here keeps the
+// expensive bit (gRPC round-trip in AssembleStateFromSource) out of the
+// formatter and lets tests assert the resolver branch by inspection.
+//
+// azdClient may be nil when NewAzdClient failed at startup; in that
+// case the trailing block is skipped (resolver has no state to work
+// with). The function never returns an error: every failure mode is
+// captured in the Report or in a skipped trailing block.
+func runDoctor(
+	ctx context.Context,
+	deps doctor.Dependencies,
+	opts doctor.Options,
+	azdClient *azdext.AzdClient,
+) (doctor.Report, []nextstep.Suggestion) {
+	runner := doctor.Runner{Checks: doctor.NewLocalChecks(deps)}
+	report := runner.Run(ctx, opts)
+
+	// Trailing Next: block is only meaningful when checks all pass
+	// (exit code 0). On Fail or all-skip, the user's next move is to
+	// fix the surfaced problem; burying that under "Next: azd deploy"
+	// would be noise. Locked by the design spec at
+	// `docs/design/azd-ai-agent-nextsteps.md`, "Doctor output shape":
+	// "When all checks pass, the trailing Next: block is ...".
+	if doctor.ExitCode(report) != 0 {
+		return report, nil
+	}
+
+	trailing := resolveDoctorTrailing(ctx, azdClient)
+	return report, trailing
+}
+
+// resolveDoctorTrailing assembles state from the azd gRPC channel and
+// asks the nextstep resolver for the doctor's trailing block.
+// Returns nil on any error: the trailing block is a courtesy, not a
+// load-bearing surface, and the body of the doctor report already
+// tells the user what to do.
+//
+// Branch selection:
+//   - Any service in azure.yaml has IsDeployed == true →
+//     ResolveAfterDeploy (filtered to deployed services). The resolver
+//     emits show + invoke for each deployed agent.
+//   - No service deployed → ResolveAfterInit. Same block the user saw
+//     at the end of `azd ai agent init`, which guides them toward
+//     `azd provision` / `azd ai agent run` / `azd deploy`.
+func resolveDoctorTrailing(ctx context.Context, azdClient *azdext.AzdClient) []nextstep.Suggestion {
+	if azdClient == nil {
+		return nil
+	}
+
+	state, _ := nextstep.AssembleStateFromSource(ctx, nextstep.NewSource(azdClient))
+	if state == nil || len(state.Services) == 0 {
+		// Healthy project but no agent services in azure.yaml; the
+		// init resolver still produces a useful "run azd ai agent
+		// init" hint via its empty-services branch, but for doctor
+		// the body of the report already covered that via the
+		// `local.agent-service-detected` check. Emitting the same
+		// hint twice is noise.
+		return nil
+	}
+
+	if anyServiceDeployed(state.Services) {
+		filtered := filterDeployedServices(state)
+		return nextstep.ResolveAfterDeploy(
+			filtered,
+			doctorCachedPayload(ctx, azdClient),
+			doctorReadmeExists(ctx, azdClient),
+		)
+	}
+
+	return nextstep.ResolveAfterInit(state)
+}
+
+func anyServiceDeployed(services []nextstep.ServiceState) bool {
+	for _, s := range services {
+		if s.IsDeployed {
+			return true
+		}
+	}
+	return false
+}
+
+// filterDeployedServices returns a shallow clone of state whose Services
+// list contains only the entries with IsDeployed == true. The clone is
+// necessary because ResolveAfterDeploy emits one show + one invoke
+// per Service it sees; passing an unfiltered state would produce
+// `azd ai agent invoke <undeployed-service>` lines, which 404.
+func filterDeployedServices(state *nextstep.State) *nextstep.State {
+	if state == nil {
+		return nil
+	}
+	clone := *state
+	clone.Services = make([]nextstep.ServiceState, 0, len(state.Services))
+	for _, s := range state.Services {
+		if s.IsDeployed {
+			clone.Services = append(clone.Services, s)
+		}
+	}
+	return &clone
+}
+
+// doctorCachedPayload returns a cachedPayload closure for
+// ResolveAfterDeploy. It looks up the cached remote OpenAPI spec (the
+// one populated by prior `azd ai agent invoke` runs) and extracts a
+// sample payload via ExtractInvokeExample. Returns "" on any failure
+// so the resolver falls back to its protocol-generic literal.
+//
+// Suffix is "remote" because doctor's trailing block emits commands
+// for the deployed agent (`azd ai agent invoke <agent>`); the local
+// cache (suffix "local") is from `azd ai agent invoke --local` and is
+// not appropriate here.
+func doctorCachedPayload(ctx context.Context, azdClient *azdext.AzdClient) func(string) string {
+	return func(serviceName string) string {
+		if azdClient == nil || serviceName == "" {
+			return ""
+		}
+		configPath, err := resolveConfigPath(ctx, azdClient)
+		if err != nil {
+			return ""
+		}
+		spec, err := nextstep.ReadCachedOpenAPISpec(filepath.Dir(configPath), serviceName, "remote")
+		if err != nil {
+			return ""
+		}
+		return nextstep.ExtractInvokeExample(spec)
+	}
+}
+
+// doctorReadmeExists returns a readmeExists closure for
+// ResolveAfterDeploy. The closure resolves the project root once
+// (cached across calls) and reports whether
+// <projectRoot>/<relativePath>/README.md exists.
+//
+// Only the canonical "README.md" casing is checked, matching the
+// rendered "see <relPath>/README.md" line; accepting other casings
+// would yield a broken pointer on case-sensitive filesystems.
+func doctorReadmeExists(ctx context.Context, azdClient *azdext.AzdClient) func(string) bool {
+	projectRoot := resolveProjectPath(ctx, azdClient)
+	return func(relativePath string) bool {
+		if projectRoot == "" || relativePath == "" {
+			return false
+		}
+		_, err := os.Stat(filepath.Join(projectRoot, relativePath, "README.md"))
+		return err == nil
+	}
+}
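
The shallow-clone filtering done by filterDeployedServices can be exercised in isolation with a sketch like the one below. The `ServiceState` and `State` types here are simplified stand-ins with only the fields this diff references (`IsDeployed`, `Services`) plus an assumed `Name` field; the real nextstep types are not shown in this commit and likely carry more.

```go
package main

import "fmt"

// ServiceState and State are hypothetical, trimmed-down stand-ins for
// the nextstep package types used in the diff above.
type ServiceState struct {
	Name       string // assumed field, for display only
	IsDeployed bool
}

type State struct {
	Services []ServiceState
}

// filterDeployed copies the State value, then gives the copy a fresh
// Services slice so the caller's slice is never modified.
func filterDeployed(state *State) *State {
	if state == nil {
		return nil
	}
	clone := *state
	clone.Services = make([]ServiceState, 0, len(state.Services))
	for _, s := range state.Services {
		if s.IsDeployed {
			clone.Services = append(clone.Services, s)
		}
	}
	return &clone
}

func main() {
	orig := &State{Services: []ServiceState{
		{Name: "alpha", IsDeployed: true},
		{Name: "beta", IsDeployed: false},
	}}
	filtered := filterDeployed(orig)
	fmt.Println(len(filtered.Services), filtered.Services[0].Name) // 1 alpha
	fmt.Println(len(orig.Services))                                // 2
}
```

Allocating a new slice instead of filtering in place matters here: reusing `state.Services[:0]` would overwrite the original entries, and any later reader of the unfiltered state would see the wrong services.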
