ADX operator maintain named entity-groups #1027
Implement Stage 1 of named entity groups migration for federated ADX clusters.
This creates named entity groups based on spoke database inventory, replicated
across all hub databases.
Changes:
- Add mapSpokeDatabases() to map spoke databases to their cluster endpoints
- Add generateEntityGroupDefinitions() to create entity group KQL statements
- Integrate entity group generation into FederateClusters before function creation
- Use drop-then-create pattern since ADX doesn't support .create-or-alter for entity groups
Entity groups follow the naming pattern {SpokeDatabaseName}Spoke (e.g., MetricsSpoke)
and contain all cluster endpoints that have that database.
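The naming pattern and drop-then-create workflow described above can be sketched roughly as follows. This is an illustrative sketch only, not the PR's implementation: `entityGroupStatements` and its map-based input are hypothetical stand-ins for `generateEntityGroupDefinitions`, and the exact form of the `.drop entity_group` / `.create entity_group` statements is assumed here and should be checked against the Kusto documentation. How a drop of a not-yet-existing group is handled is not shown.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// entityGroupStatements is a hypothetical helper (not the PR's code).
// spokes maps a spoke database name to the cluster endpoints that host it.
// For each database it emits a drop statement followed by a create statement,
// since entity groups have no create-or-alter form.
func entityGroupStatements(spokes map[string][]string) []string {
	names := make([]string, 0, len(spokes))
	for name := range spokes {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic ordering across reconciles

	var stmts []string
	for _, db := range names {
		// Copy before sorting so the caller's slice is left untouched.
		endpoints := append([]string(nil), spokes[db]...)
		sort.Strings(endpoints)

		refs := make([]string, 0, len(endpoints))
		for _, ep := range endpoints {
			refs = append(refs, fmt.Sprintf("cluster('%s').database('%s')", ep, db))
		}

		group := db + "Spoke" // naming pattern: {SpokeDatabaseName}Spoke
		stmts = append(stmts,
			fmt.Sprintf(".drop entity_group %s", group),
			fmt.Sprintf(".create entity_group %s (%s)", group, strings.Join(refs, ", ")),
		)
	}
	return stmts
}
```

Sorting both the database names and the endpoints keeps the generated statements stable between reconciles, which is what the "deterministic ordering" tests would rely on.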
Includes comprehensive unit tests for:
- Spoke database mapping
- Entity group generation and naming
- Multi-endpoint scenarios
- Deterministic ordering
Documentation updated in concepts.md and designs/operator.md.
Resolved conflicts in operator/adx.go and operator/adx_test.go:
- Integrated entity group definitions (from branch) with hub tables ensuring (from main)
- Combined view support tests from main with entity group tests from branch
- Updated step numbering in FederateClusters to accommodate both features
Add nil check for HeartbeatTTL in the guard clause at the beginning of FederateClusters to prevent a panic when dereferencing the pointer later in queryHeartbeatTable.
Remove the following unused functions and their tests:
- getBestAvailableSKU: queries Azure for SKUs but never called
- collectInventoryByDatabase: aggregates schemas but never called
- databaseExists: checks database existence but never called
- recommendedSKUs variable: only used by getBestAvailableSKU
These functions were defined but not used anywhere in production code.
Add named constants for requeue durations and script size limits:
- requeueShort (1 minute): for in-progress operations
- requeueMedium (5 minutes): for provider registration
- requeueLong (10 minutes): for heartbeat/federation cycles
- maxKustoScriptSz (1MB): max size for Kusto script batches
This improves maintainability by centralizing these values and making their purpose clear through documentation.
Previously, status update errors in non-critical code paths were silently discarded. This adds logging for these cases to improve observability while still not blocking reconciliation on status update failures.
- Add logSetClusterCondition helper method that wraps setClusterCondition and logs any errors at WARN level
- Replace all '_ = r.setClusterCondition' calls with logSetClusterCondition
- Add explicit error logging for direct r.Status().Update calls in the CriteriaExpression handling code
Add Go doc comments for the following functions to improve code readability and maintainability:
- diffSkus: Explains the SKU comparison logic and when updates are triggered
- diffIdentities: Describes identity reconciliation and preservation of unmanaged identities
- toSku: Documents string to AzureSKUName conversion with validation
- toTier: Documents string to AzureSKUTier conversion with validation
- toDatabase: Describes the Azure resource ID construction for databases
Pull request overview
This PR enhances the ADX operator's federation capabilities by introducing named entity groups for better manageability of cross-cluster queries, while also addressing several bugs and code quality improvements.
Key Changes
- Named Entity Groups: Replaces inline cluster lists in KQL functions with persistent named entity groups (e.g., MetricsSpoke, LogsSpoke) that are replicated across all hub databases
- Critical Bug Fix: Adds nil pointer check for HeartbeatTTL to prevent panic during federation reconciliation
- Code Quality: Removes unused functions, extracts magic numbers into named constants, and adds comprehensive error logging for status updates
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| operator/adx.go | Implements named entity group generation and replication, adds HeartbeatTTL nil check, extracts constants (requeueShort/Medium/Long, maxKustoScriptSz), adds logSetClusterCondition helper, removes dead code (getBestAvailableSKU, collectInventoryByDatabase, databaseExists), adds function documentation |
| operator/adx_test.go | Removes tests for deleted functions, adds comprehensive tests for mapSpokeDatabases, generateEntityGroupDefinitions (covering replication, naming, multiple endpoints, deterministic ordering), updates TestGenerateKustoFunctionDefinitions to verify entity group references |
| docs/designs/operator.md | Documents named entity groups feature including naming pattern, drop-then-create workflow, and manageability benefits |
| docs/concepts.md | Updates federation sections to mention named entity groups alongside existing capabilities |
When referencing a stored entity group in the macro-expand operator, the 'entity_group' keyword should NOT be used.
The correct syntax is:
    macro-expand MyEntityGroup as X ( X.TableName )
Not:
    macro-expand entity_group MyEntityGroup as X ( X.TableName )
The 'entity_group' keyword is only used in management commands (.create/.drop/.show entity_group) and inline entity group definitions with brackets.
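As a small illustration of the corrected form, a generator for the query text could look like the sketch below. `macroExpandQuery` is a hypothetical helper, not a function from this PR; it only combines the {DatabaseName}Spoke naming pattern with the macro-expand syntax quoted above.

```go
package main

import "fmt"

// macroExpandQuery is a hypothetical helper that builds a macro-expand
// expression over a stored entity group. The group is referenced by name
// only, without the entity_group keyword.
func macroExpandQuery(database, table string) string {
	group := database + "Spoke" // naming pattern: {DatabaseName}Spoke
	return fmt.Sprintf("macro-expand %s as X ( X.%s )", group, table)
}
```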
Changed 'executing %d entity groups' to 'executing %d entity group statements' since each entity group generates 2 statements (drop + create).
Resolved conflict in operator/adx.go by keeping the PR's deletion of unused functions (collectInventoryByDatabase, databaseExists). Main had a fix for databaseExists (Name -> DatabaseName), but since the PR intentionally removes these unused functions, the fix is not needed.
Summary
This PR modifies the operator to maintain named entity-groups in the form of {DatabaseName}Spoke and then utilize them when constructing our alias functions instead of inlining the full cluster list.

There are also several bugs that the LLM identified and fixed, each in its own commit to aid reviewing.
Changes
Bug Fixes
- Added nil check for HeartbeatTTL in FederateClusters guard clause to prevent panic when the pointer is dereferenced in queryHeartbeatTable

Code Quality
- Removed unused getBestAvailableSKU, collectInventoryByDatabase, databaseExists, and recommendedSKUs variable
- Added requeueShort (1 min), requeueMedium (5 min), requeueLong (10 min), and maxKustoScriptSz (1MB) constants for better maintainability
- Added logSetClusterCondition helper and updated all ignored status update calls to log errors at WARN level
- Added Go doc comments for diffSkus, diffIdentities, toSku, toTier, and toDatabase

Testing