Description
What's the issue or suggestion?
When I use the GraphQL API to fetch assets, the response includes many assets that were defined at some point but have since been removed from the asset definitions (the code that defined them was deleted).
I don't see a way to filter them out, or a process to clean them up.
The only way I can infer that they are gone is that their "dependencies" field is null instead of an empty list.
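A minimal sketch of that workaround, for anyone hitting the same problem. It assumes each node is a plain dict shaped like an `assetsOrError` response with `definition { dependencies { ... } }` selected, and that a removed asset comes back either with a null `definition` or with a null `dependencies` inside it. The helper names and the sample data are hypothetical, and this is a heuristic rather than a documented Dagster behavior.

```python
from typing import Any


def is_stale(node: dict[str, Any]) -> bool:
    """Heuristic: a node with no usable definition is treated as removed."""
    definition = node.get("definition")
    # Assumption: removed assets have `definition` set to null, or a
    # definition whose `dependencies` is null rather than an empty list.
    return definition is None or definition.get("dependencies") is None


def split_assets(
    nodes: list[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split nodes into (live, stale) lists using the heuristic above."""
    live = [n for n in nodes if not is_stale(n)]
    stale = [n for n in nodes if is_stale(n)]
    return live, stale


# Hypothetical sample data, not real API output.
sample_nodes = [
    {"key": {"path": ["orders"]}, "definition": {"dependencies": []}},
    {"key": {"path": ["old_table"]}, "definition": None},
    {"key": {"path": ["legacy"]}, "definition": {"dependencies": None}},
]
live, stale = split_assets(sample_nodes)
```

An asset with no upstream dependencies still has an empty list, so it stays in `live`; only nodes with a null value land in `stale`.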
Additional information
This is how I'm fetching the data:
import asyncio

import polars as pl
from gql import Client, gql
from gql.transport.aiohttp import AIOHTTPTransport

# DAGSTER_URL (the Dagster GraphQL endpoint) is assumed to be defined elsewhere.


async def query_dagster(query: str, timeout: int = 10):
    transport = AIOHTTPTransport(url=DAGSTER_URL)
    async with Client(
        transport=transport,
        fetch_schema_from_transport=True,
        execute_timeout=timeout,
    ) as session:
        gql_query = gql(query)
        result = await session.execute(gql_query)
        return result
def get_all_dagster_assets(timeout: int = 60) -> pl.DataFrame:
    '''Return a dataframe with all Dagster assets'''
    def fetch_raw_data(timeout: int = timeout) -> pl.DataFrame:
        q = """
        query {
          assetsOrError {
            __typename
            ... on AssetConnection {
              nodes {
                key { path }
                definition {
                  description
                  dataVersion
                  groupName
                  type {
                    ... on RegularDagsterType { displayName }
                  }
                  dependedBy {
                    asset {
                      assetKey { path }
                    }
                  }
                  dependencies {
                    asset {
                      assetKey { path }
                    }
                  }
                }
                assetMaterializations(limit: 1) {
                  runId
                  timestamp
                  metadataEntries {
                    label
                  }
                }
              }
            }
            ... on PythonError {
              message
            }
          }
        }
        """
        assetInfo = asyncio.run(query_dagster(q, timeout=timeout))
        node_list: list[dict] = assetInfo["assetsOrError"]["nodes"]
        df = pl.DataFrame(node_list)
        return df

    # Without this call the outer function would return None.
    return fetch_raw_data()
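One side note on the snippet: the query selects `__typename` and a `PythonError` fragment, but the code reads `["nodes"]` directly, which would raise a `KeyError` if the server answered with an error. A possible guard is sketched below. `DagsterQueryError` and `extract_nodes` are hypothetical names introduced for illustration, and the sketch assumes the result is the plain dict returned by `session.execute`.

```python
from typing import Any


class DagsterQueryError(RuntimeError):
    """Hypothetical exception for a non-AssetConnection response."""


def extract_nodes(result: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the asset nodes, or raise if the union resolved to an error."""
    payload = result["assetsOrError"]
    typename = payload.get("__typename")
    if typename == "PythonError":
        # The query only selects `message` on PythonError.
        raise DagsterQueryError(payload.get("message", "unknown error"))
    if typename != "AssetConnection":
        raise DagsterQueryError(f"unexpected response type: {typename!r}")
    return payload["nodes"]
```

With this in place, `node_list = extract_nodes(assetInfo)` could stand in for the direct indexing.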
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.