Skip to content

Conversation

@pjdotson
Copy link
Contributor

@pjdotson pjdotson commented Jan 12, 2026

Issue Pull Request

Linear Issue

SY-3039

Description

Fix ontology relationships with node metrics channels by creating a metrics and a node group to group all channels within.

Basic Readiness

  • I have performed a self-review of my code.
  • I have added relevant tests to cover the changes to CI.
  • I have added needed QA steps to the release candidate template that cover these changes.
  • I have updated in-code documentation to reflect the changes.
  • I have updated user-facing documentation to reflect the changes.

Greptile Overview

Greptile Summary

This PR fixes the ontology relationships for node metrics channels by introducing a proper hierarchical structure. Previously, metrics channels were created as direct children of the root "Channels" group, making them difficult to organize and navigate.

Key Changes

Ontology Hierarchy: Introduces a two-level group structure:

  • ChannelsMetricsNode {hostKey} → [individual metric channels]

This provides better organization by grouping all metrics together, then separating them by node.

Channel Creation Control: Adds CreateWithoutGroupRelationship option to channel creation, allowing channels to be created without automatically linking to the default Channels group. This is essential for metrics channels that need custom parent relationships.

Relationship Cleanup: After establishing the new hierarchy, the code explicitly removes any existing relationships between the Channels group and metric channels (lines 186-192 in service.go). This handles the migration case where metrics channels may have been previously linked to the Channels group.

Error Handling Improvement: The group.Service.CreateOrRetrieve method now properly handles non-NotFound errors using errors.Skip, preventing the function from continuing with invalid state.

Implementation Quality

The implementation follows established patterns in the codebase:

  • Uses RetrieveIfNameExists for idempotent channel creation
  • Properly threads CreateOptions through the channel creation stack
  • Test coverage includes verification of the new ontology relationships (metrics_test.go)
  • The relationship deletion loop is safe because gorp.Delete internally ignores NotFound errors

Testing

Comprehensive test coverage includes:

  • Service creation with new ontology requirements
  • Channel retrieval after service restart (idempotency testing)
  • Metric collection and data writing
  • Size metric calculations using new telem.Size improvements

Confidence Score: 4/5

  • This PR is safe to merge with minor considerations around idempotency
  • The changes are well-structured and solve the stated problem of organizing metrics channels in the ontology. The implementation follows existing patterns, includes comprehensive tests, and handles edge cases like idempotent group creation and relationship cleanup. Score is 4 instead of 5 due to the complexity of relationship deletion in a loop during service startup, which could theoretically cause issues if there are pre-existing orphaned relationships, though the code handles this correctly.
  • core/pkg/service/metrics/service.go - Pay attention to the relationship deletion loop (lines 186-192) during deployment to ensure it handles all edge cases correctly

Important Files Changed

File Analysis

Filename Score Overview
core/pkg/service/metrics/service.go 4/5 Added ontology group hierarchy (Metrics → Node) for metrics channels with proper relationship cleanup
core/pkg/distribution/group/service.go 5/5 Improved error handling in CreateOrRetrieve to properly skip non-NotFound errors
core/pkg/distribution/channel/writer.go 5/5 Added CreateWithoutGroupRelationship option for channel creation without auto-linking to parent group
core/pkg/distribution/channel/lease_proxy.go 5/5 Threaded CreateOptions through to maybeSetResources to support CreateWithoutGroupRelationship flag

Sequence Diagram

sequenceDiagram
    participant MS as MetricsService
    participant GS as GroupService
    participant CS as ChannelService
    participant OS as OntologyService
    
    MS->>GS: CreateOrRetrieve("Metrics", ChannelsGroupID)
    GS->>GS: Retrieve existing or create new
    GS-->>MS: metricsGroup
    
    MS->>GS: CreateOrRetrieve("Node {hostKey}", metricsGroupID)
    GS->>GS: Retrieve existing or create new
    GS-->>MS: nodeGroup
    
    MS->>CS: Create(indexChannel, CreateWithoutGroupRelationship)
    CS->>CS: Skip default group linking
    CS-->>MS: Index channel created
    
    MS->>OS: DefineRelationship(nodeGroup, ParentOf, indexChannel)
    OS-->>MS: Relationship created
    
    MS->>CS: CreateMany(metricChannels, CreateWithoutGroupRelationship)
    CS->>CS: Skip default group linking for all
    CS-->>MS: Metric channels created
    
    MS->>OS: DefineFromOneToManyRelationships(nodeGroup, ParentOf, metricChannels)
    OS-->>MS: Relationships created
    
    loop For each metric channel
        MS->>OS: DeleteRelationship(ChannelsGroup, ParentOf, metricChannel)
        OS-->>MS: Old relationship removed
    end
    
    Note over MS,OS: Hierarchy: Channels → Metrics → Node {hostKey} → [metric channels]
Loading

@pjdotson pjdotson self-assigned this Jan 12, 2026
@pjdotson pjdotson marked this pull request as ready for review January 12, 2026 23:44
@pjdotson pjdotson requested a review from emilbon99 January 12, 2026 23:44
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No files reviewed, no comments

Edit Code Review Agent Settings | Greptile

…9-fix-ontology-relationship-with-node-metrics-channels
ctx,
&c.idx,
channel.RetrieveIfNameExists(),
channel.CreateWithoutGroupRelationship(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason you chose this API over a CreateWithParent style API like we have in other services? Overall ok with this but we need to decide on a consistent strategy at some point for how we decide on ontology parents

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It felt like it fit a little bit more with how the creation of channels is already done
with the create options. I'm not married to this API but I thought it was the simplest
to get in and not worth overthinking until we put more effort to standardizing how the
ontology is used throughout all services.

// delete any existing relationships between the parent Channels group and the
// metrics channels
for _, ch := range metricChannels {
if err := cfg.Ontology.NewWriter(nil).DeleteRelationship(ctx, cfg.Channel.Group().OntologyID(), ontology.ParentOf, ch.OntologyID()); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would I be right in thinking that This overflows 88 chars?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch

close(s.stopCollector)
return s.shutdown.Close()
}
func (s *Service) Close() error { close(s.stopCollector); return s.shutdown.Close() }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought you didn't like these?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed my mind and am good on these as long as they don't overflow the 88 character limit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants