| 1 | +--- |
| 2 | +name: staged-discovery |
| 3 | +description: Add staged discovery support to a provider. Use when the user wants to implement staged/phased discovery, break down discovery into stages, add OptionStagedDiscovery support, or optimize a provider's memory usage during discovery. Triggers on requests like "add staged discovery to gcp", "implement staged discovery for aws", "break down discovery for <provider>", or "optimize <provider> discovery". |
| 4 | +argument-hint: "<provider-name> (e.g., aws, gcp, k8s, azure)" |
| 5 | +--- |
| 6 | + |
| 7 | +# Add Staged Discovery to a Provider |
| 8 | + |
| 9 | +Implement staged (multi-phase) discovery for a provider so that `AssetExplorer` can traverse the provider's resource hierarchy one level at a time, releasing memory after each scope is closed. |
| 10 | + |
| 11 | +**Background:** See `docs/adr/002-staged-discovery.md` for the full design rationale and `docs/adr/001-asset-explorer-lazy-discovery.md` for how `AssetExplorer` drives the traversal. |
| 12 | + |
| 13 | +**Reference implementation:** The K8s provider in `providers/k8s/resources/discovery.go` — study `Discover()`, `discoverClusterStage()`, and `discoverNamespaceStage()` as the canonical example. |
| 14 | + |
| 15 | +## Prerequisites |
| 16 | + |
| 17 | +Before starting, understand the provider's resource hierarchy. Every provider has a natural tree: |
| 18 | +- **K8s:** cluster → namespaces → workloads (pods, deployments, etc.) |
| 19 | +- **GCP:** organization → projects → services → resources |
| 20 | +- **AWS:** organization → accounts → regions → resources |
| 21 | +- **Azure:** tenant → subscriptions → resource groups → resources |
| 22 | + |
| 23 | +Each level of this tree becomes a discovery stage. Ask the user to confirm the hierarchy if it's not obvious. |
| 24 | + |
| 25 | +## Step-by-Step Implementation |
| 26 | + |
| 27 | +### Step 1: Identify the discovery entry point |
| 28 | + |
| 29 | +Find the provider's `Discover()` function. It is typically in `providers/<name>/resources/discovery.go` or called from `providers/<name>/provider/provider.go` during connection setup. |
| 30 | + |
| 31 | +```bash |
| 32 | +# Find the discovery function |
| 33 | +grep -rn "func.*Discover" providers/<name>/resources/ providers/<name>/provider/ |
| 34 | +``` |
| 35 | + |
| 36 | +Read the existing discovery logic thoroughly. Understand: |
| 37 | +- What assets are currently returned (the full set) |
| 38 | +- How platform IDs are constructed |
| 39 | +- How connection configs are set on child assets |
| 40 | +- Whether `WithParentConnectionId` is used and where |
| 41 | + |
| 42 | +### Step 2: Add the staged discovery router |
| 43 | + |
| 44 | +Modify the `Discover()` function to check for `OptionStagedDiscovery` and route to stage-specific functions. The legacy path MUST remain unchanged — older clients that don't set the flag must continue working. |
| 45 | + |
| 46 | +```go |
| 47 | +import "go.mondoo.com/mql/v13/providers-sdk/v1/plugin" |
| 48 | + |
| 49 | +func Discover(runtime *plugin.Runtime, ...) (*inventory.Inventory, error) { |
| 50 | +	conn := runtime.Connection.(YourConnection) |
| 51 | +	invConfig := conn.InventoryConfig() |
| 52 | + |
| 53 | +	if _, ok := invConfig.Options[plugin.OptionStagedDiscovery]; ok { |
| 54 | +		// Route based on which stage we're in. |
| 55 | +		// Use a provider-specific option to determine the current scope. |
| 56 | +		if invConfig.Options["your-scope-option"] != "" { |
| 57 | +			return discoverScopedStage(runtime, conn, invConfig) |
| 58 | +		} |
| 59 | +		return discoverRootStage(runtime, conn, invConfig) |
| 60 | +	} |
| 61 | + |
| 62 | +	// Legacy single-pass discovery: DO NOT MODIFY |
| 63 | +	// TODO(v15): remove this once all clients use staged discovery |
| 64 | +	return discoverLegacy(runtime, conn, invConfig) |
| 65 | +} |
| 66 | +``` |
| 67 | + |
| 68 | +**Important:** Rename the existing discovery function to `discoverLegacy` (or similar) and add a `TODO(v15)` comment. Do not delete it. |
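
The routing decision can be exercised in isolation before it is wired into a provider. The following is a minimal sketch, not the SDK's code: `optionStagedDiscovery` stands in for `plugin.OptionStagedDiscovery` (its string value here is made up), `scopeOption` is the placeholder option name used above, and `selectStage` and the `stage` type are hypothetical helpers. It assumes `Options` is a `map[string]string`, as the string comparison in the router suggests.

```go
package main

import "fmt"

// optionStagedDiscovery stands in for plugin.OptionStagedDiscovery.
// Assumption: the real constant's value is not shown in this guide.
const optionStagedDiscovery = "staged-discovery"

// scopeOption is the placeholder provider-specific option from the router.
const scopeOption = "your-scope-option"

// stage names the discovery path that a connection config selects.
type stage string

const (
	stageLegacy stage = "legacy"
	stageRoot   stage = "root"
	stageScoped stage = "scoped"
)

// selectStage mirrors the router: the flag is detected by key presence
// (its value is ignored), and a non-empty scope option selects the
// scoped stage. Without the flag, the scope option has no effect.
func selectStage(options map[string]string) stage {
	if _, ok := options[optionStagedDiscovery]; !ok {
		return stageLegacy
	}
	if options[scopeOption] != "" {
		return stageScoped
	}
	return stageRoot
}

func main() {
	fmt.Println(selectStage(nil))
	fmt.Println(selectStage(map[string]string{optionStagedDiscovery: ""}))
	fmt.Println(selectStage(map[string]string{optionStagedDiscovery: "", scopeOption: "ns-1"}))
}
```

Keeping the decision in a pure function like this makes the "older clients stay on the legacy path" guarantee easy to cover with a table-driven test.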
| 69 | + |
| 70 | +### Step 3: Implement Stage 1 (root/top-level scope) |
| 71 | + |
| 72 | +Stage 1 discovers the top-level asset and its immediate children. Children are returned as assets with connection configs that trigger Stage 2 when connected. |
| 73 | + |
| 74 | +**Critical rules:** |
| 75 | +- Child assets that represent a new scope (e.g., namespaces, projects, regions) must NOT use `WithParentConnectionId`. They need their own independent runtime so their MQL resource cache is isolated and released when the scope is closed. |
| 76 | +- Clone the parent's connection config for each child, adding a scope option that identifies the next stage. |
| 77 | +- The `OptionStagedDiscovery` flag is propagated automatically by `Clone()`. |
| 78 | + |
| 79 | +```go |
| 80 | +func discoverRootStage(runtime *plugin.Runtime, conn YourConnection, invConfig *inventory.Config) (*inventory.Inventory, error) { |
| 81 | +	in := &inventory.Inventory{Spec: &inventory.InventorySpec{ |
| 82 | +		Assets: []*inventory.Asset{}, |
| 83 | +	}} |
| 84 | + |
| 85 | +	// 1. Discover the root asset itself (with platform IDs) |
| 86 | +	rootAsset := &inventory.Asset{ |
| 87 | +		PlatformIds: []string{rootPlatformId}, |
| 88 | +		Name:        conn.Name(), |
| 89 | +		Platform:    conn.Platform(), |
| 90 | +		Connections: []*inventory.Config{invConfig.Clone(inventory.WithoutDiscovery())}, |
| 91 | +	} |
| 92 | +	in.Spec.Assets = append(in.Spec.Assets, rootAsset) |
| 93 | + |
| 94 | +	// 2. Discover child scopes (e.g., projects, namespaces, regions) |
| 95 | +	children, err := listChildScopes(conn) |
| 96 | +	if err != nil { |
| 97 | +		return nil, err |
| 98 | +	} |
| 99 | + |
| 100 | +	for _, child := range children { |
| 101 | +		// Clone WITHOUT WithParentConnectionId: each child gets its own |
| 102 | +		// runtime and MQL resource cache, released when the child is closed. |
| 103 | +		childConfig := invConfig.Clone() |
| 104 | +		childConfig.Options["your-scope-option"] = child.ID |
| 105 | + |
| 106 | +		childAsset := &inventory.Asset{ |
| 107 | +			PlatformIds: []string{child.PlatformId}, |
| 108 | +			Name:        child.Name, |
| 109 | +			Platform:    child.Platform, |
| 110 | +			Connections: []*inventory.Config{childConfig}, |
| 111 | +		} |
| 112 | +		in.Spec.Assets = append(in.Spec.Assets, childAsset) |
| 113 | +	} |
| 114 | + |
| 115 | +	return in, nil |
| 116 | +} |
| 117 | +``` |
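
The child loop above only works if every cloned config owns its own options map. Otherwise each child would overwrite the scope option of its siblings and of the parent. The sketch below illustrates that property with a hypothetical stand-in type: `config`, `clone`, and `childConfigs` are not SDK names, and whether the real `Clone()` deep-copies `Options` is an assumption worth confirming against the SDK before relying on it.

```go
package main

import "fmt"

const scopeOption = "your-scope-option"

// config is a hypothetical stand-in for inventory.Config; only the
// options map is modeled.
type config struct {
	Options map[string]string
}

// clone copies the options map so that the copy can be modified freely.
// Assumption: the real Clone() behaves the same way for Options.
func (c *config) clone() *config {
	opts := make(map[string]string, len(c.Options))
	for k, v := range c.Options {
		opts[k] = v
	}
	return &config{Options: opts}
}

// childConfigs builds one config per child scope. Every option of the
// parent (including the staged discovery flag) is carried over, and the
// scope option is set on the copy only.
func childConfigs(parent *config, scopeIDs []string) []*config {
	out := make([]*config, 0, len(scopeIDs))
	for _, id := range scopeIDs {
		child := parent.clone()
		child.Options[scopeOption] = id
		out = append(out, child)
	}
	return out
}

func main() {
	parent := &config{Options: map[string]string{"staged-discovery": ""}}
	for _, c := range childConfigs(parent, []string{"ns-a", "ns-b"}) {
		fmt.Println(c.Options[scopeOption])
	}
	_, leaked := parent.Options[scopeOption]
	fmt.Println("parent has scope option:", leaked)
}
```

If the parent config ended up carrying the scope option, the root connection would be routed to the scoped stage on its next discovery, so a check like the last two lines of `main` is a cheap regression test.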
| 118 | + |
| 119 | +### Step 4: Implement Stage 2+ (scoped discovery) |
| 120 | + |
| 121 | +Each subsequent stage reads its scope from the connection config, discovers resources within that scope, and optionally emits further children for deeper stages. |
| 122 | + |
| 123 | +**Leaf assets within a scope SHOULD use `WithParentConnectionId`** to share the scope's API client cache. This avoids redundant API calls while keeping the cache scoped to the parent (not the root). |
| 124 | + |
| 125 | +```go |
| 126 | +func discoverScopedStage(runtime *plugin.Runtime, conn YourConnection, invConfig *inventory.Config) (*inventory.Inventory, error) { |
| 127 | +	scopeId := invConfig.Options["your-scope-option"] |
| 128 | + |
| 129 | +	in := &inventory.Inventory{Spec: &inventory.InventorySpec{ |
| 130 | +		Assets: []*inventory.Asset{}, |
| 131 | +	}} |
| 132 | + |
| 133 | +	// Discover resources within this scope |
| 134 | +	resources, err := listResourcesInScope(conn, scopeId) |
| 135 | +	if err != nil { |
| 136 | +		return nil, err |
| 137 | +	} |
| 138 | + |
| 139 | +	for _, res := range resources { |
| 140 | +		resAsset := &inventory.Asset{ |
| 141 | +			PlatformIds: []string{res.PlatformId}, |
| 142 | +			Name:        res.Name, |
| 143 | +			Platform:    res.Platform, |
| 144 | +			// Leaf assets share the scope's API cache |
| 145 | +			Connections: []*inventory.Config{invConfig.Clone( |
| 146 | +				inventory.WithParentConnectionId(invConfig.Id), |
| 147 | +			)}, |
| 148 | +		} |
| 149 | +		in.Spec.Assets = append(in.Spec.Assets, resAsset) |
| 150 | +	} |
| 151 | + |
| 152 | +	return in, nil |
| 153 | +} |
| 154 | +``` |
| 155 | + |
| 156 | +### Step 5: Gate resource methods at higher scopes (if needed) |
| 157 | + |
| 158 | +When the root scope is scanned, resource methods that load lower-scope data should return empty results to avoid loading everything into the root's cache. This is optional but important for large providers. |
| 159 | + |
| 160 | +```go |
| 161 | +func isRootScopedConnection(r *plugin.Runtime) bool { |
| 162 | +	conn := r.Connection.(YourConnection) |
| 163 | +	cfg := conn.InventoryConfig() |
| 164 | +	if _, ok := cfg.Options[plugin.OptionStagedDiscovery]; !ok { |
| 165 | +		return false // Legacy path: don't gate anything |
| 166 | +	} |
| 167 | +	return cfg.Options["your-scope-option"] == "" // Root scope = no child scope set |
| 168 | +} |
| 169 | + |
| 170 | +// In a resource method that should only run at child scope: |
| 171 | +func (r *mqlYourProvider) childScopedResources() ([]interface{}, error) { |
| 172 | +	if isRootScopedConnection(r.MqlRuntime) { |
| 173 | +		return []interface{}{}, nil // Empty at root scope; loaded per child instead |
| 174 | +	} |
| 175 | +	// ... normal implementation |
| 176 | +} |
| 177 | +``` |
| 178 | + |
| 179 | +### Step 6: Verify both paths produce the same assets |
| 180 | + |
| 181 | +Both the legacy and staged paths must discover the same final set of assets (same platform IDs, same names). They differ only in how discovery is chunked. |
| 182 | + |
| 183 | +```bash |
| 184 | +# Build and install |
| 185 | +make providers/build/<name> && make providers/install/<name> |
| 186 | + |
| 187 | +# Test legacy path (staged discovery flag not set; this simulates an old client) |
| 188 | +# This should work exactly as before |
| 189 | +mql shell <provider-args> |
| 190 | + |
| 191 | +# Test staged path (AssetExplorer sets the flag automatically) |
| 192 | +# Verify that the same assets appear as in the legacy run |
| 193 | +mql shell <provider-args> |
| 194 | + |
| 195 | +# Run existing tests |
| 196 | +go test ./providers/<name>/... |
| 197 | +``` |
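
Comparing the two paths by eye gets unreliable once a provider returns more than a handful of assets. One way to make the check mechanical is to diff the platform IDs collected from each path. This is a minimal sketch under stated assumptions: `asset` is a hypothetical stand-in for `inventory.Asset`, `diffPlatformIDs` is not an SDK function, and the staged input is assumed to be the assets from all stages already flattened into one slice.

```go
package main

import (
	"fmt"
	"sort"
)

// asset is a hypothetical stand-in for inventory.Asset.
type asset struct {
	Name        string
	PlatformIds []string
}

// platformIDSet collects every platform ID across the given assets.
func platformIDSet(assets []asset) map[string]struct{} {
	set := make(map[string]struct{})
	for _, a := range assets {
		for _, id := range a.PlatformIds {
			set[id] = struct{}{}
		}
	}
	return set
}

// diffPlatformIDs returns the IDs found only in the legacy result and
// the IDs found only in the staged result, each sorted for stable output.
func diffPlatformIDs(legacy, staged []asset) (onlyLegacy, onlyStaged []string) {
	l, s := platformIDSet(legacy), platformIDSet(staged)
	for id := range l {
		if _, ok := s[id]; !ok {
			onlyLegacy = append(onlyLegacy, id)
		}
	}
	for id := range s {
		if _, ok := l[id]; !ok {
			onlyStaged = append(onlyStaged, id)
		}
	}
	sort.Strings(onlyLegacy)
	sort.Strings(onlyStaged)
	return onlyLegacy, onlyStaged
}

func main() {
	legacy := []asset{
		{Name: "cluster", PlatformIds: []string{"//example/cluster"}},
		{Name: "ns-a", PlatformIds: []string{"//example/ns-a"}},
	}
	staged := []asset{
		{Name: "cluster", PlatformIds: []string{"//example/cluster"}},
	}
	missing, extra := diffPlatformIDs(legacy, staged)
	fmt.Println("missing from staged:", missing)
	fmt.Println("extra in staged:", extra)
}
```

Both returned slices being empty means the two paths agree on platform IDs. Asset names could be compared the same way by keying the set on `Name`.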
| 198 | + |
| 199 | +### Step 7: Update .lr.versions if new resources were added |
| 200 | + |
| 201 | +If you added any new resources or fields to support staged discovery, update the `.lr.versions` file: |
| 202 | + |
| 203 | +```bash |
| 204 | +make providers/mqlr |
| 205 | +./mqlr generate providers/<name>/resources/<name>.lr --dist providers/<name>/resources |
| 206 | +``` |
| 207 | + |
| 208 | +## Checklist |
| 209 | + |
| 210 | +- [ ] `Discover()` routes to staged vs legacy based on `OptionStagedDiscovery` |
| 211 | +- [ ] Legacy path is preserved unchanged with `TODO(v15)` comment |
| 212 | +- [ ] Stage 1 returns child scope assets WITHOUT `WithParentConnectionId` (cache isolation) |
| 213 | +- [ ] Stage 2+ returns leaf assets WITH `WithParentConnectionId` (cache sharing within scope) |
| 214 | +- [ ] Child connection configs include the scope option that triggers the next stage |
| 215 | +- [ ] `OptionStagedDiscovery` is propagated via `Clone()` to all child configs |
| 216 | +- [ ] Resource methods at root scope are gated to avoid loading child-scope data into the root cache (optional, see Step 5) |
| 217 | +- [ ] Both legacy and staged paths produce the same set of assets |
| 218 | +- [ ] `go build ./providers/<name>/...` compiles |
| 219 | +- [ ] `go test ./providers/<name>/...` passes |
| 220 | +- [ ] `make test/lint` passes |