Skip to content

Conversation

@stack72
Copy link
Contributor

@stack72 stack72 commented Nov 24, 2025

Summary

Adds a new GET endpoint to Luminork that provides a comprehensive,
pre-processed review of all changes in a change set. This endpoint
aggregates and processes data from multiple materialized views to
deliver a clean, ready-to-consume change summary optimized for the
Luminork API. ## Endpoint
GET /v1/w/{workspace_id}/change-sets/{change_set_id}/review
Response:

{
"components": [
    {
    "id": "01FXNV4P...",
    "name": "My EC2 Instance",
    "schemaName": "AWS EC2 Instance",
    "diffStatus": "Modified",
    "attributeDiffTrees": [
        {
        "path": "/domain/Region",
        "diff": {
            "old": { "$source": {...}, "$value": "us-east-1" },
            "new": { "$source": {...}, "$value": "us-west-2" }
        }
        }
    ],
    "actionDiffs": []
    }
]
}

Key Features

🔄 On-Demand Building

  • Not part of incremental MV index - Built only when requested via API
  • Uses SlowRT to avoid blocking async runtime during expensive
    operations - No caching (data changes frequently with every component
  • modification)

🎯 Smart Data Aggregation

Combines data from multiple sources:

  • ComponentList - Basic component info and metadata
  • ComponentDiff - Attribute-by-attribute differences vs HEAD
  • ActionDiffList - Action changes per component
  • ErasedComponents - Components removed from HEAD

🧹 Frontend Processing Applied

Replicates frontend logic for clean, useful output:

  • Filters uninteresting diffs:
  • Internal attributes (/si/type, /si/color)
  • Empty schema defaults (empty objects/arrays, null, "")
  • Identical old/new values (can happen on upgrades)
  • Object field placeholders at top levels
  • Corrects diff status:
  • Sets to None if no meaningful changes after filtering
  • Sets to Modified if action diffs exist even without attribute changes
  • Properly handles toDelete + Removed combinations
  • Flattened structure:
  • Returns flat list of { path, diff } pairs
  • Sorted alphabetically for consistent ordering
  • No nested tree complexity

📊 Component Ordering

Components sorted by status: Added → Modified → Removed

Design Decisions

Why not a traditional MV?

  • Review data changes on every component modification
  • Caching would cause stale data
  • On-demand building is more appropriate

Why no ResourceDiff?

  • Would make payload massive for large change sets
  • Clients can fetch per-component via existing endpoints when needed

Why flatten attribute trees?

  • Simpler structure: [{ path, diff }] vs nested trees
  • Easier for clients to consume
  • Full paths already provide hierarchy context

@github-actions
Copy link

github-actions bot commented Nov 24, 2025

Dependency Review

✅ No vulnerabilities or OpenSSF Scorecard issues found.

Scanned Files

None

@stack72 stack72 requested a review from nickgerace November 24, 2025 19:06
Comment on lines +68 to +71
for &component_id in &component_ids {
let component_in_list =
crate::component::assemble_in_list(ctx.clone(), component_id).await?;
let component_diff =
crate::component::component_diff::assemble(ctx.clone(), component_id).await?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could get very expensive in a big workspace. Maybe the MV itself would become huge too. We could avoid paginating for now, but I'm curious how big these will be in frigg.

cc @jhelwig do you have thoughts since you worked on the Deployment MV?
cc @fnichol as well for frigg knowledge

Comment on lines +126 to +128
Ok(LuminorkChangeSetReview {
id: workspace_pk,
components,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logically though, this looks great to me

@jhelwig
Copy link
Contributor

jhelwig commented Nov 24, 2025

This endpoint aggregates and processes data from multiple materialized views to deliver a clean, ready-to-consume change summary optimized for the Luminork API.

What's the intended use of the API endpoint? It's not clear to me from this description. If this is intended for MCP use, I think a better pattern would be for there to be a "here are the things with differences" list, and a multi-get API endpoint where agents can retrieve multiple individual diffs at once. As @nickgerace pointed out, there are definitely concerns around the size of the overall diff (both in our ability to store them, but also in whether they'll blow through the token limit of any agent for the return size of a tool call).

If this is limited to a minimal overview, then the size concern should no longer be an issue (it would have to be a truly massive amount of changes in a change set to exceed the tool use token limit for responses). If there is a way to multi-get the details for individual components, then agents can dynamically size their queries, and can group things more intelligently into batches (review SSM param changes, then bucket changes, then..., etc).

Was there a different intent behind the endpoint from what I'm thinking here?

@stack72
Copy link
Contributor Author

stack72 commented Nov 24, 2025

This endpoint aggregates and processes data from multiple materialized views to deliver a clean, ready-to-consume change summary optimized for the Luminork API.

What's the intended use of the API endpoint? It's not clear to me from this description. If this is intended for MCP use, I think a better pattern would be for there to be a "here are the things with differences" list, and a multi-get API endpoint where agents can retrieve multiple individual diffs at once. As @nickgerace pointed out, there are definitely concerns around the size of the overall diff (both in our ability to store them, but also in whether they'll blow through the token limit of any agent for the return size of a tool call).

If this is limited to a minimal overview, then the size concern should no longer be an issue (it would have to be a truly massive amount of changes in a change set to exceed the tool use token limit for responses). If there is a way to multi-get the details for individual components, then agents can dynamically size their queries, and can group things more intelligently into batches (review SSM param changes, then bucket changes, then..., etc).

Was there a different intent behind the endpoint from what I'm thinking here?

The diff here is already slimmed down - it doesn't return code diffs, it just returns the components and their AV changes. This isn't just for AI Agent, it's also for the CLI. I ran a diff of a change set with 30 component changes and the output was very readable!

@stack72 stack72 force-pushed the luminork-reviews-endpoint branch from a5cf35c to d8d00eb Compare November 24, 2025 23:43
Adds a new GET endpoint to Luminork that provides a comprehensive,
pre-processed review of all changes in a change set. This endpoint
aggregates and processes data from multiple materialized views to
deliver a clean, ready-to-consume change summary optimized for the
Luminork API. ## Endpoint
GET /v1/w/{workspace_id}/change-sets/{change_set_id}/review
**Response:**
```json
{
"components": [
    {
    "id": "01FXNV4P...",
    "name": "My EC2 Instance",
    "schemaName": "AWS EC2 Instance",
    "diffStatus": "Modified",
    "attributeDiffTrees": [
        {
        "path": "/domain/Region",
        "diff": {
            "old": { "$source": {...}, "$value": "us-east-1" },
            "new": { "$source": {...}, "$value": "us-west-2" }
        }
        }
    ],
    "actionDiffs": []
    }
]
}

🔄 On-Demand Building

- Not part of incremental MV index - Built only when requested via API
- Uses SlowRT to avoid blocking async runtime during expensive
operations - No caching (data changes frequently with every component
- modification)

🎯 Smart Data Aggregation

Combines data from multiple sources:
- ComponentList - Basic component info and metadata
- ComponentDiff - Attribute-by-attribute differences vs HEAD
- ActionDiffList - Action changes per component
- ErasedComponents - Components removed from HEAD

🧹 Frontend Processing Applied

Replicates frontend logic for clean, useful output:
- Filters uninteresting diffs:
- Internal attributes (/si/type, /si/color)
- Empty schema defaults (empty objects/arrays, null, "")
- Identical old/new values (can happen on upgrades)
- Object field placeholders at top levels
- Corrects diff status:
- Sets to None if no meaningful changes after filtering
- Sets to Modified if action diffs exist even without attribute changes
- Properly handles toDelete + Removed combinations
- Flattened structure:
- Returns flat list of { path, diff } pairs
- Sorted alphabetically for consistent ordering
- No nested tree complexity

📊 Component Ordering

Components sorted by status: Added → Modified → Removed

Why not a traditional MV?
- Review data changes on every component modification
- Caching would cause stale data
- On-demand building is more appropriate

Why no ResourceDiff?
- Would make payload massive for large change sets
- Clients can fetch per-component via existing endpoints when needed

Why flatten attribute trees?
- Simpler structure: [{ path, diff }] vs nested trees
- Easier for clients to consume
- Full paths already provide hierarchy context
```
@stack72 stack72 force-pushed the luminork-reviews-endpoint branch from d8d00eb to 9295911 Compare November 26, 2025 14:47
@stack72 stack72 marked this pull request as draft November 26, 2025 18:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants