Path Relativization: Metafile and Locator Write Paths #502
This looks like a pretty good start on the problem - thanks! I think this should be good to merge after addressing the open comments (which I expect to also trickle into updates to the included test suite).
Co-authored-by: Patrick Ames <[email protected]> Signed-off-by: David Martinez <[email protected]>
LGTM - thanks! Please address your newly added TODO in a fast-follow PR. If some of our existing code paths are already incorrectly using relative locator and/or metafile paths before calling your path relativization method, then they'll probably need to be updated to only use absolute paths.
* path relativization algo, helpers, new signature for to_serializable IMPLEMENTED
* relativization done and tested. Need to debug failing test_metafile_io.py tests
* relativization done with tests and logging.debug calls
* path relativization completed. Tests added, debug output removed
* fixed formatting and ensured pylint passes
* refactored relativization code, added additional tests to test with large workloads and multiple operation types per operation
* refactored relativization code, cleaned up setter logic, added testing for large workloads and multiple operation types per transaction
* Update deltacat/storage/model/transaction.py (applied three times from review suggestions)
  Co-authored-by: Patrick Ames <[email protected]> Signed-off-by: David Martinez <[email protected]>
* Added PR review changes, updated test_abs_to_relative.py to be new test_transaction.py file
* fixed delta file issue with absolute paths
* cleaned code, lint
---------
Signed-off-by: David Martinez <[email protected]> Co-authored-by: Patrick Ames <[email protected]>
Summary
Implemented path relativization for paths embedded within the metafile and locator metadata of individual transaction operations. This occurs prior to serializing the transaction object to ensure relative paths are used throughout.
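As a rough illustration of the idea, the sketch below strips a catalog root from the write paths of each operation before serialization. It is not the code from this PR: `relativize_path`, `relativize_operation_paths`, and the dictionary keys `metafile_write_paths` and `locator_write_paths` are hypothetical names, and the sketch assumes POSIX-style paths with operations modeled as plain dictionaries.

```python
import posixpath
from typing import Any, Dict, List


def relativize_path(absolute_path: str, catalog_root: str) -> str:
    """Hypothetical helper: return absolute_path relative to catalog_root."""
    root = posixpath.normpath(catalog_root)
    path = posixpath.normpath(absolute_path)
    # Compare against "root/" so "/tmp/catalog2" is not treated as a child
    # of "/tmp/catalog".
    prefix = root if root.endswith("/") else root + "/"
    if not path.startswith(prefix):
        raise ValueError(
            f"{absolute_path!r} is not under catalog root {catalog_root!r}"
        )
    return path[len(prefix):]


def relativize_operation_paths(
    operations: List[Dict[str, Any]], catalog_root: str
) -> List[Dict[str, Any]]:
    """Return copies of the operations with both path lists relativized.

    The key names below are assumptions made for this sketch only.
    """
    relativized = []
    for op in operations:
        new_op = dict(op)  # shallow copy so the caller's data is untouched
        for key in ("metafile_write_paths", "locator_write_paths"):
            new_op[key] = [
                relativize_path(p, catalog_root) for p in op.get(key, [])
            ]
        relativized.append(new_op)
    return relativized
```

Raising on paths outside the root, rather than passing them through, matches the reviewer's expectation that callers hand over absolute paths under the catalog root.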
Rationale
The introduction of path relativization improves system portability by enabling dynamic resolution of paths relative to the catalog root. This simplifies user-specific configurations and enhances flexibility in permission management across different environments.
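To make the portability argument concrete, here is a small sketch of the read side: a stored relative path is resolved against whichever catalog root the current environment uses. The function name `resolve_path` and the example roots are hypothetical and are not taken from the PR.

```python
import posixpath


def resolve_path(relative_path: str, catalog_root: str) -> str:
    """Hypothetical helper: join a stored relative path onto a catalog root."""
    if posixpath.isabs(relative_path):
        raise ValueError(f"expected a relative path, got {relative_path!r}")
    return posixpath.join(catalog_root, relative_path)


# The same serialized path resolves under two different roots.
stored = "ns/table/00001.mpk"
local_copy = resolve_path(stored, "/home/alice/catalog")
shared_copy = resolve_path(stored, "/mnt/shared/catalog")
```

Because only the root differs between environments, the serialized transaction does not need to be rewritten when a catalog is copied or mounted elsewhere.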
Changes
Modified deltacat/storage/model/transaction.py by implementing the relativization method. This includes the addition of operation-level logic and comprehensive error handling. The to_serializable() method has been refactored to accept the catalog root path, providing the necessary context for the relativization process.
Added a new test file, deltacat/tests/storage/model/test_abs_to_relative.py, to verify the correctness of the relativization logic. The tests include both unit tests for the helper functions and integration tests that simulate real-world transactions and operations.
Impact
All tests have passed successfully, indicating that the changes do not break existing functionality. Further testing is recommended to ensure compatibility with all potential use cases and that no unintended regressions have been introduced.
Testing
deltacat/tests/storage/model/test_abs_to_relative.py (turned into test_transaction.py during review, per the commit history) was added to validate the new relativization logic. The tests cover both the functionality of individual helper methods and full integration via mock transactions and operations.
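The shape of such tests might look like the following sketch, which checks a round-trip property over a larger batch of paths and the rejection of paths outside the root. It does not reproduce the PR's test file: `to_relative` is a stand-in built on `posixpath.relpath`, and the root and file names are made up for illustration.

```python
import posixpath
import unittest


def to_relative(path: str, root: str) -> str:
    """Stand-in for the helper under test (hypothetical, not the PR's code)."""
    rel = posixpath.relpath(path, root)
    # relpath climbs out with ".." segments when path is not under root.
    if rel == ".." or rel.startswith("../"):
        raise ValueError(f"{path!r} is not under catalog root {root!r}")
    return rel


class RelativizationTest(unittest.TestCase):
    ROOT = "/tmp/catalog"

    def test_round_trip_large_workload(self):
        # Relativizing and re-joining should give back the original path.
        for i in range(1000):
            absolute = f"{self.ROOT}/ns/table/{i:05d}.mpk"
            rel = to_relative(absolute, self.ROOT)
            self.assertFalse(posixpath.isabs(rel))
            self.assertEqual(posixpath.join(self.ROOT, rel), absolute)

    def test_rejects_path_outside_root(self):
        with self.assertRaises(ValueError):
            to_relative("/elsewhere/file.mpk", self.ROOT)
```

Saved as a module, the cases can be run with `python -m unittest`.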