Commit ae4dd4b

Antriksh Jain and Copilot committed
feat(azure.ai.agents): add doctor local checks 4-6 (agent service, project endpoint, agent.yaml)
Phase 4.3 of PR Azure#8057. Extends the doctor package with the last three MVP local checks, completing the local-checks slate the spec calls for.

What the three new checks do (in plain English):

- local.agent-service-detected: re-fetches the project config and counts services whose `host` is `azure.ai.agent`. Passes with the count and sorted service names so the user can verify at a glance what the doctor saw. Fails with "Run `azd ai agent init`" when no agent service is configured. Skip-cascades off local.azure-yaml.
- local.project-endpoint-set: reads AZURE_AI_PROJECT_ENDPOINT via the EnvironmentService gRPC. Empty EnvName defaults to the current azd environment, so this check does not need to re-resolve the env name. Fails with "Run `azd provision` ... or `azd env set ...`" when the value is missing/whitespace-only. Skip-cascades off local.environment-selected.
- local.agent-yaml-valid: for each agent service in azure.yaml, reads <projectPath>/<svc.RelativePath>/agent.yaml and parses it as agent_yaml.ContainerAgent. Collects ALL failures rather than short-circuiting so multi-service projects get one actionable report listing every offending service. Skip-cascades off local.agent-service-detected.

Architectural notes:

- Local `agentHost = "azure.ai.agent"` constant mirrors cmd.AiAgentHost (init.go:113) and nextstep.agentHost (state.go:28). The doctor package cannot import cmd (cmd will import doctor for Cobra wiring in Phase 4.4, which would form a cycle).
- protobuf `Services` is a map, so iteration order is non-deterministic. Both checks 4 and 6 sort by service name before formatting messages and Details, so output is reproducible across runs (and across goroutines once the runner gains concurrency in Phase 5).
- Transport-error suggestion swap (Phase 4.2.1's isTransportFailure) applies to all three new checks, matching the pattern established in checks 2 and 3.
- `priorFailed(prior, id)` is a small new helper used by all three cascades. The Phase 4.2 checks (1-3) inline their own skip logic because they don't have any predecessors; extracting them is a separate refactor candidate, not in scope here.

Files changed:

- internal/cmd/doctor/checks_local.go: NewLocalChecks now returns 6 entries (3 → 6) in the canonical execution order.
- internal/cmd/doctor/checks_project.go: new file. Three Check factories, `validateAgentYAML` helper, `priorFailed` helper, and the `agentHost` / `projectEndpointVar` constants.
- internal/cmd/doctor/checks_local_test.go: `fakeEnvironmentServer` gains `valueResp` / `valueErr` fields and a `GetValue` method (Phase 4.3 check 5 needs it). `TestNewLocalChecks_OrderAndIDs` updated for the new 3 → 6 size and ordering.
- internal/cmd/doctor/checks_project_test.go: new file. ~25 test cases across all three checks: cascade-skip behavior, transport-error suggestion swap, nil-response handling, malformed-yaml, missing-file, mixed valid+invalid, multi-agent ordering. Real temp-dir agent.yaml files for check 6 (t.TempDir() + writeYAML helper).

Pre-flight: gofmt clean, go vet clean, go build clean, doctor tests 10.1s (38 tests, all green), full extension test suite green, golangci-lint 0 issues, cspell 0 issues, go fix no-op. No live smoke yet; the doctor command is not Cobra-wired until Phase 4.4. Logic is locked by unit tests at the check level.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
1 parent 493e5fa commit ae4dd4b

4 files changed

Lines changed: 813 additions & 3 deletions

cli/azd/extensions/azure.ai.agents/internal/cmd/doctor/checks_local.go

Lines changed: 6 additions & 2 deletions
@@ -42,13 +42,17 @@ type Dependencies struct {
 }

 // NewLocalChecks returns the canonical sequence of local doctor checks
-// in execution order. Phase 4.2 covers checks 1-3; phase 4.3 will append
-// checks 4-6 (agent service, project endpoint, agent.yaml).
+// in execution order. Phase 4.2 covered checks 1-3; Phase 4.3 adds
+// checks 4-6 (agent service detected, project endpoint set, agent.yaml
+// valid).
 func NewLocalChecks(deps Dependencies) []Check {
 	return []Check{
 		newCheckGRPCAndVersion(deps),
 		newCheckProjectConfig(deps),
 		newCheckEnvironmentSelected(deps),
+		newCheckAgentServiceDetected(deps),
+		newCheckProjectEndpointSet(deps),
+		newCheckAgentYAMLValid(deps),
 	}
 }

cli/azd/extensions/azure.ai.agents/internal/cmd/doctor/checks_local_test.go

Lines changed: 17 additions & 1 deletion
@@ -37,6 +37,10 @@ type fakeEnvironmentServer struct {
 	azdext.UnimplementedEnvironmentServiceServer
 	resp *azdext.EnvironmentResponse
 	err  error
+
+	// GetValue stub fields (Phase 4.3).
+	valueResp *azdext.KeyValueResponse
+	valueErr  error
 }

 func (s *fakeEnvironmentServer) GetCurrent(
@@ -48,6 +52,15 @@ func (s *fakeEnvironmentServer) GetCurrent(
 	return s.resp, nil
 }

+func (s *fakeEnvironmentServer) GetValue(
+	context.Context, *azdext.GetEnvRequest,
+) (*azdext.KeyValueResponse, error) {
+	if s.valueErr != nil {
+		return nil, s.valueErr
+	}
+	return s.valueResp, nil
+}
+
 // newTestAzdClient spins up an in-process gRPC server with the supplied
 // Project + Environment server stubs and returns a client wired to its
 // address. The server, listener, and client are all torn down via
@@ -430,7 +443,7 @@ func TestNewLocalChecks_OrderAndIDs(t *testing.T) {
 	t.Parallel()

 	checks := NewLocalChecks(Dependencies{})
-	require.Len(t, checks, 3)
+	require.Len(t, checks, 6)

 	want := []struct {
 		id string
@@ -440,6 +453,9 @@ func TestNewLocalChecks_OrderAndIDs(t *testing.T) {
 		{"local.grpc-extension", "azd extension reachable", false},
 		{"local.azure-yaml", "azure.yaml present and parseable", false},
 		{"local.environment-selected", "azd environment selected", false},
+		{"local.agent-service-detected", "agent service in azure.yaml", false},
+		{"local.project-endpoint-set", "AZURE_AI_PROJECT_ENDPOINT set", false},
+		{"local.agent-yaml-valid", "agent.yaml valid (per service)", false},
 	}
 	for i, w := range want {
 		require.Equal(t, w.id, checks[i].ID, "index %d", i)
cli/azd/extensions/azure.ai.agents/internal/cmd/doctor/checks_project.go

Lines changed: 279 additions & 0 deletions
@@ -0,0 +1,279 @@
+// Copyright (c) Microsoft Corporation. All rights reserved.
+// Licensed under the MIT License.
+
+package doctor
+
+import (
+	"context"
+	"fmt"
+	"os"
+	"path/filepath"
+	"sort"
+	"strings"
+
+	"azureaiagent/internal/pkg/agents/agent_yaml"
+
+	"github.com/azure/azure-dev/cli/azd/pkg/azdext"
+	"gopkg.in/yaml.v3"
+)
+
+// agentHost is the value used in azure.yaml for an azure.ai.agent service.
+// Must stay in sync with cmd.AiAgentHost ("azure.ai.agent") in
+// `internal/cmd/init.go`; duplicated here so the doctor package does not
+// have to import cmd (which would form an import cycle once the Cobra
+// wiring lands in Phase 4.4).
+const agentHost = "azure.ai.agent"
+
+// projectEndpointVar is the azd environment variable that points at the
+// Foundry project. Must stay in sync with the rest of the extension
+// (`agent_context.go`, `listen.go`, `service_target_agent.go`).
+const projectEndpointVar = "AZURE_AI_PROJECT_ENDPOINT"
+
+// newCheckAgentServiceDetected produces Check `local.agent-service-detected`.
+// It re-fetches the project config and counts services whose `host` is
+// `azure.ai.agent`. Pass surfaces the count and service names so users
+// can verify the doctor saw what they expected; Fail tells them to run
+// `azd ai agent init` to scaffold one. Skips when the gRPC client is
+// unavailable or when `local.azure-yaml` failed.
+func newCheckAgentServiceDetected(deps Dependencies) Check {
+	return Check{
+		ID:   "local.agent-service-detected",
+		Name: "agent service in azure.yaml",
+		Fn: func(ctx context.Context, _ Options, prior []Result) Result {
+			if deps.AzdClient == nil {
+				return Result{Status: StatusSkip, Message: "skipped: azd extension not reachable"}
+			}
+			if priorFailed(prior, "local.azure-yaml") {
+				return Result{Status: StatusSkip, Message: "skipped: azure.yaml check failed"}
+			}
+
+			resp, err := deps.AzdClient.Project().Get(ctx, &azdext.EmptyRequest{})
+			if err != nil {
+				suggestion := "Run `azd ai agent init` to add an azure.ai.agent service to azure.yaml."
+				if isTransportFailure(err) {
+					suggestion = "Re-run via `azd ai agent doctor`; the extension cannot reach azd's gRPC channel."
+				}
+				return Result{
+					Status:     StatusFail,
+					Message:    fmt.Sprintf("failed to get project config: %v", err),
+					Suggestion: suggestion,
+				}
+			}
+			if resp == nil || resp.Project == nil {
+				return Result{
+					Status:     StatusFail,
+					Message:    "failed to get project config (is there an azure.yaml?)",
+					Suggestion: "Run from a directory containing `azure.yaml`, or initialize one with `azd init`.",
+				}
+			}
+
+			var agentServices []string
+			for _, s := range resp.Project.Services {
+				if s != nil && s.Host == agentHost {
+					agentServices = append(agentServices, s.Name)
+				}
+			}
+			// Sort for deterministic display: protobuf Services is a map,
+			// so iteration order is unstable across runs.
+			sort.Strings(agentServices)
+			if len(agentServices) == 0 {
+				return Result{
+					Status:     StatusFail,
+					Message:    "no `azure.ai.agent` service found in azure.yaml",
+					Suggestion: "Run `azd ai agent init` to add an azure.ai.agent service to azure.yaml.",
+				}
+			}
+			return Result{
+				Status: StatusPass,
+				Message: fmt.Sprintf(
+					"%d agent service(s) in azure.yaml: %s",
+					len(agentServices), strings.Join(agentServices, ", ")),
+				Details: map[string]any{
+					"agentServices":     agentServices,
+					"agentServiceCount": len(agentServices),
+				},
+			}
+		},
+	}
+}
+
+// newCheckProjectEndpointSet produces Check `local.project-endpoint-set`.
+// It reads `AZURE_AI_PROJECT_ENDPOINT` from the currently-selected azd
+// environment via the EnvironmentService gRPC. An empty EnvName in
+// GetEnvRequest defaults to the current environment, so this check does
+// not need to re-resolve the environment name itself.
+//
+// Skips when the gRPC client is unavailable or when
+// `local.environment-selected` failed. Fails when the value is missing
+// or empty, telling users to run `azd provision` (the production path)
+// or `azd env set` (for pointing at an existing project).
+func newCheckProjectEndpointSet(deps Dependencies) Check {
+	return Check{
+		ID:   "local.project-endpoint-set",
+		Name: "AZURE_AI_PROJECT_ENDPOINT set",
+		Fn: func(ctx context.Context, _ Options, prior []Result) Result {
+			if deps.AzdClient == nil {
+				return Result{Status: StatusSkip, Message: "skipped: azd extension not reachable"}
+			}
+			if priorFailed(prior, "local.environment-selected") {
+				return Result{Status: StatusSkip, Message: "skipped: environment check failed"}
+			}
+
+			resp, err := deps.AzdClient.Environment().GetValue(ctx, &azdext.GetEnvRequest{
+				Key: projectEndpointVar,
+			})
+			if err != nil {
+				suggestion := fmt.Sprintf(
+					"Run `azd provision` to create the Foundry project, or `azd env set %s <https://...>` to point at an existing one.",
+					projectEndpointVar)
+				if isTransportFailure(err) {
+					suggestion = "Re-run via `azd ai agent doctor`; the extension cannot reach azd's gRPC channel."
+				}
+				return Result{
+					Status:     StatusFail,
+					Message:    fmt.Sprintf("failed to read %s: %v", projectEndpointVar, err),
+					Suggestion: suggestion,
+				}
+			}
+			if resp == nil || strings.TrimSpace(resp.Value) == "" {
+				return Result{
+					Status:  StatusFail,
+					Message: fmt.Sprintf("%s is not set in the current azd environment", projectEndpointVar),
+					Suggestion: fmt.Sprintf(
+						"Run `azd provision` to create the Foundry project, or `azd env set %s <https://...>` to point at an existing one.",
+						projectEndpointVar),
+				}
+			}
+			return Result{
+				Status:  StatusPass,
+				Message: fmt.Sprintf("%s = %s", projectEndpointVar, resp.Value),
+				Details: map[string]any{
+					"projectEndpoint": resp.Value,
+				},
+			}
+		},
+	}
+}
+
+// newCheckAgentYAMLValid produces Check `local.agent-yaml-valid`. For
+// each agent service in azure.yaml, it reads `<projectPath>/<svc.RelativePath>/agent.yaml`
+// and parses it as `agent_yaml.ContainerAgent`. Fails when any service's
+// file is missing, unreadable, or fails to parse — collecting all errors
+// rather than short-circuiting so multi-service projects get a single
+// actionable report.
+//
+// Skips when the gRPC client is unavailable or when
+// `local.agent-service-detected` failed (no services to validate). The
+// suggestion mirrors the spec's "fix YAML" guidance.
+func newCheckAgentYAMLValid(deps Dependencies) Check {
+	return Check{
+		ID:   "local.agent-yaml-valid",
+		Name: "agent.yaml valid (per service)",
+		Fn: func(ctx context.Context, _ Options, prior []Result) Result {
+			if deps.AzdClient == nil {
+				return Result{Status: StatusSkip, Message: "skipped: azd extension not reachable"}
+			}
+			if priorFailed(prior, "local.agent-service-detected") {
+				return Result{Status: StatusSkip, Message: "skipped: no agent services detected"}
+			}
+
+			resp, err := deps.AzdClient.Project().Get(ctx, &azdext.EmptyRequest{})
+			if err != nil {
+				suggestion := "Run from a directory containing `azure.yaml`, or initialize one with `azd init`."
+				if isTransportFailure(err) {
+					suggestion = "Re-run via `azd ai agent doctor`; the extension cannot reach azd's gRPC channel."
+				}
+				return Result{
+					Status:     StatusFail,
+					Message:    fmt.Sprintf("failed to get project config: %v", err),
+					Suggestion: suggestion,
+				}
+			}
+			if resp == nil || resp.Project == nil {
+				return Result{
+					Status:     StatusFail,
+					Message:    "failed to get project config (is there an azure.yaml?)",
+					Suggestion: "Run from a directory containing `azure.yaml`, or initialize one with `azd init`.",
+				}
+			}
+
+			projectPath := resp.Project.Path
+			// Collect agent service entries in a stable order. protobuf
+			// `Services` is a map, so iteration order is non-deterministic
+			// — sorting by service name keeps the failure list (and the
+			// validatedPaths Detail) reproducible.
+			type agentSvc struct {
+				name string
+				rel  string
+			}
+			var agents []agentSvc
+			for _, s := range resp.Project.Services {
+				if s == nil || s.Host != agentHost {
+					continue
+				}
+				agents = append(agents, agentSvc{name: s.Name, rel: s.RelativePath})
+			}
+			sort.Slice(agents, func(i, j int) bool { return agents[i].name < agents[j].name })
+
+			var validatedPaths []string
+			var failures []string
+			for _, a := range agents {
+				yamlPath := filepath.Join(projectPath, a.rel, "agent.yaml")
+				if pathErr := validateAgentYAML(yamlPath); pathErr != nil {
+					failures = append(failures, fmt.Sprintf("%s: %v", a.name, pathErr))
+					continue
+				}
+				validatedPaths = append(validatedPaths, yamlPath)
+			}
+
+			if len(failures) > 0 {
+				return Result{
+					Status: StatusFail,
+					Message: fmt.Sprintf(
+						"agent.yaml validation failed for %d service(s): %s",
+						len(failures), strings.Join(failures, "; ")),
+					Suggestion: "Fix the YAML syntax or ensure agent.yaml exists in each service directory.",
+					Details: map[string]any{
+						"failures":       failures,
+						"validatedPaths": validatedPaths,
+					},
+				}
+			}
+
+			return Result{
+				Status:  StatusPass,
+				Message: fmt.Sprintf("agent.yaml valid for %d service(s)", len(validatedPaths)),
+				Details: map[string]any{
+					"validatedPaths": validatedPaths,
+				},
+			}
+		},
+	}
+}
+
+// validateAgentYAML reads the file at path and ensures it parses as a
+// ContainerAgent. Returns the underlying read/parse error verbatim so
+// the caller can attribute it to the offending service.
+func validateAgentYAML(path string) error {
+	data, err := os.ReadFile(path) //nolint:gosec // G304: path is constructed from azd-resolved project root + service-relative path
+	if err != nil {
+		return fmt.Errorf("read %s: %w", path, err)
+	}
+	var parsed agent_yaml.ContainerAgent
+	if err := yaml.Unmarshal(data, &parsed); err != nil {
+		return fmt.Errorf("parse %s: %w", path, err)
+	}
+	return nil
+}
+
+// priorFailed reports whether the prior results contain a Fail entry
+// for the given check ID. Used for skip-cascade decisions across the
+// local-checks chain.
+func priorFailed(prior []Result, id string) bool {
+	for _, p := range prior {
+		if p.ID == id && p.Status == StatusFail {
+			return true
+		}
+	}
+	return false
+}
