-
Notifications
You must be signed in to change notification settings - Fork 1.7k
storage endpoint to order assets by materialization timestamp #29173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
7449d67
to
65e9a79
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you already have the 10k nodes on the frontend, i'm confused why you would need to re-order them on the backend? Could the sorting be done on the frontend instead using information gained from the original query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i might be missing something. But it seems like if this information is already available on the asset record, we could get it the first time it was queried and keep it ready if the user reorders things
If we are going to order by on the backend, I think we will want an index for that, even if we are passing in the list of keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe a path forward here is to start including the last_materialization_timestamp information on the query that returns the original asset nodes? so that it is immediately available to use for sorting on the frontend?
Chatted with @salazarm about having the timestamp on the |
i think even in that world it would still make more sense to add this timestamp as a field on AssetNode that gets refetched periodically using the existing resolvers, rather than have a custom endpoint for re-ordering a known set of existing asset keys |
|
||
query = ( | ||
db_select([AssetKeyTable.c.asset_key]) | ||
.order_by(order_by) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the key thing is that doing this order by on the backend/in postgres only makes sense if postgres knows more about how to order it or paginate it efficiently than the frontend or caller does, which isn't the case here since we don't have an index
Yeah so we have a mechanism to watch in progress runs for runs that are launched by the user, but it requires also fetching inProgressRunIds. But yeah I agree with Daniel this doesn't need to be a new resolver. But also GrapheneAssetNode doesn't work for external assets to I think we probably want this on GrapheneAsset |
Summary & Motivation
From the FE we want to be able to order the assets in the Asset Catalog by the time they were most recently materialized/observed/failed to materialize.
Existing query pattern in the FE:
Rather than modifying the query that fetches all assets from the DB to return in time-sorted order (which would require updating cursor logic, etc) this PR adds an endpoint that takes a list of asset keys and returns that list of asset keys in sorted order. This way we don't have to refetch the asset nodes when a user switches from time-ordered to alphabetical sorting.
New query pattern would be:
The endpoint also has an option to sort all asset nodes to cover the case when no catalog view is selected.
Sorting caveat:
latest_materialization_timestamp
doesn't get updated onFAILED_TO_MATERIALIZE
events, which means the asset would be sorted by the correspondingPLANNED
event. We'll likely want to change this in the future so that the failure events are used in the sorting, but are fine with this inconsistency for the internal observe releaseinternal pr https://github.com/dagster-io/internal/pull/14946
How I Tested These Changes
Changelog