Allow extensions to supply a catalog location scope for AI model queries (avoid all-region fan-out) #8518

@JeffreyCA

Summary

AiModelService.ListModels always fetches the AI model catalog across every AIServices-supported region in the subscription, regardless of any region filter the caller passes. Both extension-facing entry points — AiModelService.ListModels (internal/grpcserver/ai_model_service.go) and PromptService.PromptAiModel (internal/grpcserver/prompt_service.go) — hardcode nil locations into the backing fetch and apply the requested region filter only as a post-response narrowing.

This means an extension that only supports a known subset of regions still pays for a full all-region fan-out, and has no way to tell azd "only query these regions." We should let callers supply a catalog location scope that replaces azd's "search all locations" logic.

Background / current behavior

In both handlers:

// Always fetch canonical model data across subscription locations.
// Location scoping is applied as a filter so model.Locations remains canonical.
models, err := s.modelService.ListModels(ctx, subscriptionId, nil) // nil = all regions
...
models = ai.FilterModels(models, filterOpts) // region filter applied AFTER fetch

  • ListModels(ctx, sub, nil) resolves all AIServices regions via ListLocations -> GetResourceSkuLocations and fans out one GetAiModels call per region (fetchModelsForLocations).
  • AiModelFilterOptions.locations is documented as a pure inclusion filter that does not rewrite AiModel.locations (grpc/proto/ai_model.proto:103). It trims what's returned but does not reduce the number of API calls.
  • A per-process cache (catalogCache, keyed subscriptionId:location) means back-to-back calls reuse regions already fetched, but the first call still queries the entire region set.

The nil is intentional: it keeps AiModel.Locations canonical so downstream consumers can discover alternative regions for a model. The cost is that there's no knob to scope the fetch.
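To make the cost concrete, here is a minimal sketch of the fan-out plus per-process cache behavior described above. The type and method names (catalogCache as a struct, modelsFor, listAll) and the region list are hypothetical stand-ins, not azd's actual implementation; the only detail taken from this issue is the subscriptionId:location cache key.

```go
package main

import "fmt"

// catalogCache is a hypothetical per-process cache keyed "subscriptionId:location".
type catalogCache struct {
	entries map[string][]string
	fetches int // number of simulated per-region catalog calls
}

// modelsFor returns cached data for a region, or simulates one catalog
// call (standing in for GetAiModels) and caches the result.
func (c *catalogCache) modelsFor(sub, location string) []string {
	key := sub + ":" + location
	if models, ok := c.entries[key]; ok {
		return models
	}
	c.fetches++
	models := []string{"model-in-" + location} // placeholder payload
	c.entries[key] = models
	return models
}

// listAll fans out across every region, which is what passing nil
// locations amounts to today.
func (c *catalogCache) listAll(sub string, allRegions []string) {
	for _, region := range allRegions {
		c.modelsFor(sub, region)
	}
}

func main() {
	regions := []string{"eastus", "westus", "swedencentral"} // illustrative only
	cache := &catalogCache{entries: map[string][]string{}}

	cache.listAll("sub-1", regions) // first call: one fetch per region
	first := cache.fetches
	cache.listAll("sub-1", regions) // second call: served from cache

	fmt.Println(first, cache.fetches) // 3 3
}
```

The cache only helps repeat calls; the first call still pays for every region, however narrow the caller's filter is.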

Problem

Extensions (e.g. azure.ai.agents) operate against a fixed, known set of supported regions. They want to:

  1. Avoid spending time fetching catalog data for regions they'll never use.
  2. Still get full model info within their supported subset (including alternative locations).
  3. Have "alternative locations" naturally restricted to that subset.

None of this is achievable today — the region filter never reaches the fetch layer.

Proposed solution

Introduce a catalog location scope that is distinct from the existing inclusion filter:

  • Catalog scope (new): the universe of regions azd queries, replacing the ListLocations "all regions" default. It also bounds which regions can appear in AiModel.Locations.
  • Inclusion filter (existing filter.locations): post-response narrowing of what's displayed/returned. Continues to not rewrite AiModel.Locations.

API change

Add a field to the relevant requests:

message ListModelsRequest {
  AzureContext azure_context = 1;
  AiModelFilterOptions filter = 2;
  // Universe of regions to query. Empty = all AIServices-supported
  // subscription locations (current behavior). When set, AiModel.locations
  // in the response is canonical *within this set*.
  repeated string catalog_locations = 3;
}

Add the same catalog_locations field to PromptAiModelRequest, and (for consistency) ResolveModelDeploymentsRequest / PromptAiDeploymentRequest.

Handler change

Forward the scope into the already-supported locations parameter instead of nil:

models, err := s.modelService.ListModels(ctx, subscriptionId, req.CatalogLocations)

AiModelService.ListModels(ctx, sub, locations) already scopes the fetch when locations is non-empty (pkg/ai/model_service.go:49-58); only the gRPC layer never forwarded it.
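A rough sketch of how the forwarding could be covered by a test, using a recording fake. Everything here is a hand-written stand-in: ListModelsRequest models only the proposed field rather than the generated proto type, and modelLister, recordingLister, handleListModels, and forwardedScopes are hypothetical names chosen for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// ListModelsRequest is a stand-in for the generated proto type;
// only the proposed catalog_locations field is modeled.
type ListModelsRequest struct {
	CatalogLocations []string
}

// modelLister is a hypothetical interface with the
// ListModels(ctx, sub, locations) shape described in this issue.
type modelLister interface {
	ListModels(ctx context.Context, subscriptionID string, locations []string) ([]string, error)
}

// recordingLister records the locations argument of every call.
type recordingLister struct {
	scopes [][]string
}

func (r *recordingLister) ListModels(_ context.Context, _ string, locations []string) ([]string, error) {
	r.scopes = append(r.scopes, locations)
	return nil, nil
}

// handleListModels forwards the catalog scope instead of a hardcoded nil.
func handleListModels(ctx context.Context, svc modelLister, sub string, req *ListModelsRequest) ([]string, error) {
	return svc.ListModels(ctx, sub, req.CatalogLocations)
}

// forwardedScopes runs the handler for each request and returns what
// the backing service received.
func forwardedScopes(reqs ...*ListModelsRequest) [][]string {
	rec := &recordingLister{}
	for _, req := range reqs {
		if _, err := handleListModels(context.Background(), rec, "sub-1", req); err != nil {
			panic(err)
		}
	}
	return rec.scopes
}

func main() {
	scopes := forwardedScopes(
		&ListModelsRequest{CatalogLocations: []string{"eastus", "westus"}},
		&ListModelsRequest{}, // empty scope: same as today's nil
	)
	fmt.Println(scopes[0], len(scopes[1])) // [eastus westus] 0
}
```

An unset repeated field arrives as an empty slice, so the empty-scope case reaches the service exactly as the current nil does.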

azd-side details

  • Intersect catalog_locations with AIServices SKU locations before fanning out and skip the rest, mirroring ListLocationsWithQuota (pkg/ai/model_service.go:237), to avoid GetAiModels calls in regions where AIServices isn't offered.
  • Backward compatible: empty catalog_locations => current all-region behavior. Purely additive.
  • Update the "fetch canonical model data" comments (prompt_service.go, ai_model_service.go) to reflect "across the provided catalog scope, or all subscription locations if none provided."
  • Quota methods (ListLocationsWithQuota, ListModelLocationsWithQuota, FilterModelsByQuotaAcrossLocations) already accept allowed_locations; the gap is only the catalog-fetch path.
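The intersection step could look roughly like the sketch below. intersectLocations is a hypothetical helper, not an existing azd function, and the case-insensitive, whitespace-trimmed, de-duplicated matching is an assumption about how region names should be normalized.

```go
package main

import (
	"fmt"
	"strings"
)

// intersectLocations keeps the requested regions that also appear in
// supported, preserving request order and dropping duplicates.
// Matching is case-insensitive (an assumption). An empty request
// returns supported unchanged, i.e. the current "all regions" behavior.
func intersectLocations(requested, supported []string) []string {
	if len(requested) == 0 {
		return supported
	}

	known := make(map[string]struct{}, len(supported))
	for _, s := range supported {
		known[strings.ToLower(s)] = struct{}{}
	}

	seen := make(map[string]struct{}, len(requested))
	var out []string
	for _, r := range requested {
		key := strings.ToLower(strings.TrimSpace(r))
		if _, ok := known[key]; !ok {
			continue // AIServices is not offered here; skip the fetch
		}
		if _, dup := seen[key]; dup {
			continue
		}
		seen[key] = struct{}{}
		out = append(out, key)
	}
	return out
}

func main() {
	supported := []string{"eastus", "westus", "swedencentral"} // illustrative SKU locations
	requested := []string{"EastUS", "mars", "eastus", "westus"}
	fmt.Println(intersectLocations(requested, supported)) // [eastus westus]
}
```

One case to decide explicitly: a non-empty scope whose intersection is empty. Passing that empty result straight to ListModels would silently fall back to all regions, so the handler probably needs to return no models (or an error) in that case instead.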

Why separate scope vs. filter (important)

Overloading the existing filter.locations to also scope the fetch would truncate AiModel.Locations to the filtered regions. In azure.ai.agents, several prompts pass a single currentLocation (agentModelFilter([]string{currentLocation})), and the recovery flow later calls supportedModelLocations(model.Locations) (init_models.go:796,858) to discover alternative regions. A single-region fetch would collapse that list to one entry and break alternative-location recovery.

Keeping catalog scope (fetch universe) separate from the inclusion filter (post-response display) avoids this: a model selected with filter.locations = [eastus] still carries its full canonical-within-subset Locations.
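The difference can be shown with a small sketch. Model, regionCatalog, fetch, and filterByLocation are simplified stand-ins for AiModel, the per-region catalog, the scoped fetch, and ai.FilterModels; the model name and region data are made up for illustration.

```go
package main

import "fmt"

// Model is a simplified stand-in for AiModel.
type Model struct {
	Name      string
	Locations []string
}

// regionCatalog is made-up data: one model offered in three regions.
var regionCatalog = map[string][]string{
	"eastus":        {"model-a"},
	"westus":        {"model-a"},
	"swedencentral": {"model-a"},
}

// fetch queries only the regions in scope, so Locations can only
// contain regions that were actually queried.
func fetch(scope []string) []Model {
	index := map[string]int{}
	var models []Model
	for _, region := range scope {
		for _, name := range regionCatalog[region] {
			i, ok := index[name]
			if !ok {
				i = len(models)
				index[name] = i
				models = append(models, Model{Name: name})
			}
			models[i].Locations = append(models[i].Locations, region)
		}
	}
	return models
}

// filterByLocation keeps models offered in any of the given regions
// and leaves Locations untouched (pure inclusion filter).
func filterByLocation(models []Model, regions []string) []Model {
	want := map[string]bool{}
	for _, r := range regions {
		want[r] = true
	}
	var out []Model
	for _, m := range models {
		for _, loc := range m.Locations {
			if want[loc] {
				out = append(out, m)
				break
			}
		}
	}
	return out
}

func main() {
	current := "eastus"

	// Overloading the filter as the fetch scope collapses Locations.
	overloaded := fetch([]string{current})
	fmt.Println(overloaded[0].Locations) // [eastus]

	// Separate scope and filter: alternatives within the scope survive.
	scope := []string{"eastus", "westus"}
	separate := filterByLocation(fetch(scope), []string{current})
	fmt.Println(separate[0].Locations) // [eastus westus]
}
```

In the second case the model still carries westus as an alternative, which is what the recovery flow relies on, while swedencentral is never queried.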

Acceptance criteria

  • catalog_locations added to ListModelsRequest and PromptAiModelRequest (and deployment requests for consistency); proto regenerated.
  • Handlers forward catalog_locations to AiModelService.ListModels.
  • Provided scope is intersected with AIServices SKU locations before fan-out.
  • Empty catalog_locations preserves current all-region behavior (backward compatible).
  • AiModel.Locations is canonical within the provided scope; filter.locations remains post-response and does not rewrite Locations.
  • Tests covering: scoped fetch issues calls only for the scope; filter.locations still narrows display without truncating Locations; empty scope = unchanged behavior.
  • Doc comments / proto comments updated.

Follow-up (separate extension PR)

Once the core change lands, azure.ai.agents passes its supported-region set (from supportedRegionsForInit / the hosted-agent regions manifest) as catalog_locations on AI model requests. supportedModelLocations becomes largely a safety net since AiModel.Locations is already scoped to supported regions.

Notes

  • Two-PR sequence per the extension's contribution guide: land core (cli/azd) first, then update the extension to the newer azd dependency.

Metadata

Labels

  • ai: AI
  • area/ext-framework: Extension SDK, gRPC, runner
  • enhancement: New feature or improvement
  • ext-agents: azure.ai.{agents,connections,inspector,projects,routines,skills,toolboxes} extensions
