[gql] latestEventSortKey resolver for GrapheneAsset so that we can order by id #29264

Open
wants to merge 3 commits into
base: master

jamiedemaria
Contributor

@jamiedemaria jamiedemaria commented Apr 14, 2025

Summary & Motivation

Adds a resolver on GrapheneAsset that returns the latest storage id of any event associated with that asset. This lets us sort assets by when they were last "modified" (planned, materialized, failed, observed).

Figured I'd use the storage id since it's 1) monotonically increasing and 2) already available for planned events on AssetEntry (we don't store the full event or the event timestamp for planned events).
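
A minimal sketch of the idea, not the code in this PR: take the largest storage id across the event types that count as a modification. `AssetEntrySketch` and its field names are hypothetical stand-ins for whatever AssetEntry actually exposes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssetEntrySketch:
    """Hypothetical stand-in for an asset entry; every field name is an assumption."""

    last_materialization_storage_id: Optional[int] = None
    last_observation_storage_id: Optional[int] = None
    last_planned_materialization_storage_id: Optional[int] = None
    last_failed_to_materialize_storage_id: Optional[int] = None


def latest_event_storage_id(entry: AssetEntrySketch) -> Optional[int]:
    """Return the largest storage id recorded on the entry, or None if there is none."""
    candidates = [
        entry.last_materialization_storage_id,
        entry.last_observation_storage_id,
        entry.last_planned_materialization_storage_id,
        entry.last_failed_to_materialize_storage_id,
    ]
    present = [storage_id for storage_id in candidates if storage_id is not None]
    # Storage ids increase monotonically, so the max is the most recent event.
    return max(present) if present else None
```

Because only the ids are compared, this works for planned events even though no timestamp is stored for them.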

How I Tested These Changes

Changelog

Insert changelog entry or delete this section.

@jamiedemaria jamiedemaria marked this pull request as ready for review April 14, 2025 19:50
@jamiedemaria
Contributor Author

@gibsondan @salazarm lmk what you think. i'll add tests if this seems like the right approach

Member

@gibsondan gibsondan left a comment

The high-level strategy here of fetching from the asset record makes a lot of sense to me. I'm surprised that this is the first time this class has needed to do that; I guess most of the other places we do it are in GrapheneAssetNode?

We might want to abstract away the fact that it's a storage ID under the hood as I know we've been trying to move away from having that in our public API (to give us some more flexibility to change how things are stored in the future). That could be like a latestEventSortKey or something? @shalabhc might have thoughts there as I know he's been thinking about this class of issues
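
One hypothetical way to hide the storage id behind a sort key is to return an opaque string whose ordering matches the underlying id. The function name, the string encoding, and the fixed width below are all assumptions for illustration; nothing in this thread says the resolver is implemented this way.

```python
from typing import Optional

# Assumed fixed width: 20 digits is enough for any unsigned 64-bit integer.
_SORT_KEY_WIDTH = 20


def to_event_sort_key(storage_id: Optional[int]) -> Optional[str]:
    """Encode a storage id as an opaque key that sorts in the same order as the id."""
    if storage_id is None:
        return None
    if storage_id < 0:
        raise ValueError("storage ids are expected to be non-negative")
    # Zero-padding makes lexicographic order agree with numeric order.
    return f"{storage_id:0{_SORT_KEY_WIDTH}d}"
```

Callers would only compare or sort these keys, which leaves room to change what they are derived from later.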

I think the main thing would be to double check that this doesn't result in additional surprise data fetching of the asset record in the queries where we are going to use it - I think AssetRecord loader caching should prevent that from happening?

@jamiedemaria jamiedemaria requested a review from salazarm April 14, 2025 21:54
@jamiedemaria
Contributor Author

> That could be like a latestEventSortKey or something?

Yeah, happy to rename, pinning us to storage id is not ideal

> I think the main thing would be to double check that this doesn't result in additional surprise data fetching of the asset record in the queries where we are going to use it - I think AssetRecord loader caching should prevent that from happening?

I'm not 100% sure how the AssetRecord loader works wrt caching. Do you know who the right person to ask about it is? Maybe @alangenfeld?

@jamiedemaria jamiedemaria force-pushed the jamie/order-by-resolver branch from d8ea8c9 to a3aff0c Compare April 15, 2025 14:54

github-actions bot commented Apr 15, 2025

Deploy preview for dagit-core-storybook ready!

✅ Preview
https://dagit-core-storybook-4t337ikvo-elementl.vercel.app
https://jamie-order-by-resolver.core-storybook.dagster-docs.io

Built with commit 713d737.
This pull request is being automatically deployed with vercel-action

@jamiedemaria jamiedemaria changed the title from "[gql] latest event id resolver for Asset so that we can order by id" to "[gql] latestEventSortKey resolver for GrapheneAsset so that we can order by id" Apr 15, 2025
@jamiedemaria jamiedemaria force-pushed the jamie/order-by-resolver branch from e79234a to 713d737 Compare April 15, 2025 16:19
@jamiedemaria jamiedemaria requested a review from gibsondan April 15, 2025 16:46
@jamiedemaria
Contributor Author

@gibsondan - i added tests and did the rename. this is good for another round of review while i figure out the caching thing

@alangenfeld
Member

So the caching will make it so that any calls to AssetRecord.gen() with the same request context and key won't result in multiple fetches. If we are creating AssetRecords in the request in some other way, they won't be in the cache unless we put them there. For example, direct instance.get_asset_record calls currently don't populate the cache and would need to be updated to fetch via AssetRecord.gen_many.

@alangenfeld
Member

To put it another way, the current setup will ensure that the calls get batched into one if they happen across a list of assets.

But if we are already fetching the records directly from the instance in the request, it won't dedupe against those without updating those callsites. I think it shouldn't be hard to update graphql callsites to gen_many since it's effectively the same signature and resolvers can be made async.
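
The batching and dedupe behavior described above can be sketched with a toy per-request loader. This is not the AssetRecord loader itself: `BatchingRecordLoader`, `fetch_many`, and `run_demo` are hypothetical names, and the real implementation may work quite differently.

```python
import asyncio
from typing import Callable, Dict, List


class BatchingRecordLoader:
    """Hypothetical per-request loader: caches by key and batches concurrent loads."""

    def __init__(self, fetch_many: Callable[[List[str]], Dict[str, str]]):
        self._fetch_many = fetch_many
        self._futures: Dict[str, asyncio.Future] = {}  # per-request cache
        self._pending: List[str] = []
        self._flush_scheduled = False

    async def gen(self, key: str) -> str:
        if key not in self._futures:
            loop = asyncio.get_running_loop()
            self._futures[key] = loop.create_future()
            self._pending.append(key)
            if not self._flush_scheduled:
                # Defer the fetch so other resolvers can add their keys first.
                self._flush_scheduled = True
                loop.call_soon(self._flush)
        return await self._futures[key]

    def _flush(self) -> None:
        keys, self._pending = self._pending, []
        self._flush_scheduled = False
        results = self._fetch_many(keys)  # one batched fetch for all pending keys
        for key in keys:
            self._futures[key].set_result(results[key])


def run_demo():
    calls: List[List[str]] = []

    def fetch_many(keys: List[str]) -> Dict[str, str]:
        calls.append(list(keys))  # record each round trip to storage
        return {key: f"record:{key}" for key in keys}

    async def main():
        loader = BatchingRecordLoader(fetch_many)
        records = await asyncio.gather(
            loader.gen("a"), loader.gen("b"), loader.gen("a")
        )
        # A direct fetch bypasses the loader, so it is not deduped.
        fetch_many(["a"])
        return records

    return asyncio.run(main()), calls
```

In `run_demo`, the three `gen` calls share a single batched fetch, while the direct `fetch_many(["a"])` call at the end triggers a second one, which is the case that would need its callsite updated.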

@jamiedemaria
Contributor Author

ok - we aren't fetching asset records via the instance in GrapheneAsset so i think that means we're good right? no risk of fetching the same thing twice if we weren't fetching it before this PR

Member

were we fetching asset records at all though, via any method? I don't think they're super expensive but i could imagine some perf difference from going from zero asset records to one record per asset in the query

@jamiedemaria
Contributor Author

Like in any gql resolver at all?

3 participants