-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prototype of Dev/catalog interface #499
Conversation
namespace: Optional[Namespace] = get_namespace( | ||
*args, | ||
namespace=namespace, | ||
**kwargs, | ||
) | ||
|
||
if not namespace: | ||
raise ValueError(f"Namespace `{namespace}` does not exist.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove this get_namespace
call and just catch/reraise a more clear ValueError from the commit itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the transaction expects a metafile so we would need to do a get_namespace using the id to get the metafile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... can we just create the namespace to delete in-memory, since it will already contain the unique locator that the transaction needs to find the delete target?
namespace = Namespace.of(NamespaceLocator.of(namespace))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed this comment, going to see if I can get it to work.
|
||
table_version = storage_impl.get_table_version(namespace, table_name, table.latest_table_version, catalog=catalog) | ||
|
||
stream = storage_impl.get_stream( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future when we have multiple streams, how are users going to specify which stream they want in get / list / etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, for example, I'm imagining they'll explicitly need to ask for the iceberg
stream if they want to query the Iceberg variant, so it seems like we'll need a way for them to explicitly pass in at least StreamFormat
for all table-level catalog APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to leave this out for my PR but will add a TODO to address this.
deltacat/tests/catalog/main/test_catalog_impl_table_operations.py
Outdated
Show resolved
Hide resolved
) | ||
|
||
# Verify the error message | ||
assert f"Table {self.test_namespace}.{table_name} already exists" in str(excinfo.value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you could put the expected error string in the with pytest.raises
line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, we have a lot of better custom error types in exceptions.py
that we should start making better use of (and adding new exception types as needed) to avoid any cases that otherwise require string inspection like this (we probably need a corresponding high-level issue to comb through all of the high-level base Python errors we're throwing in the native DeltaCAT catalog/storage impl, and turn them into more appropriately typed DeltaCAT exceptions).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the ones directly included in my changes but there are still some errors thrown in the storage level that need to be updated. Don't want to dilute this cr will track a separate ticket for it.
f0788b0
to
6e43c96
Compare
table: Optional[Table] = get_table( | ||
*args, | ||
namespace=namespace, | ||
table_name=name, | ||
**kwargs, | ||
) | ||
|
||
if not table: | ||
raise TableNotFoundError(f"Table `{namespace}.{name}` does not exist.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to delete_namespace
below, can we remove this get_table
call and just catch/reraise a more clear ValueError from the commit itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Remaining comments are relatively low-priority, but would be great if they can also be addressed prior to merge (or, if not, feel free to submit a small fast-follow PR). Also, let's just ensure that all failing approval workflows have succeeded prior to merge.
Summary
This is a WIP progress implementation of the v2 catalog interface table functions.
Checklist
Unit tests covering the changes have been added
E2E testing has been performed
Additional Notes
Any additional information or context relevant to this PR.