Skip to content

[ENH]: add orchestrator to construct version graph for garbage collection #4463

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: feat-sysdb-batch-get-version-file-paths-method
Choose a base branch
from

Conversation

codetheweb
Copy link
Contributor

@codetheweb codetheweb commented May 6, 2025

Description of changes

Adds an orchestrator to construct the version graph for all collections in a fork tree to be used by garbage collection.

Test plan

How are these changes tested?

  • Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Added tests for new orchestrator.

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?

n/a

Copy link

github-actions bot commented May 6, 2025

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

if tracing::level_enabled!(tracing::Level::DEBUG) {
let dot_viz = Dot::with_config(&self.graph, &[]);
let encoded = BASE64_STANDARD.encode(format!("{:?}", dot_viz));
tracing::debug!(base64_encoded_dot_graph = ?encoded, "Constructed graph.");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logged graph repr can be pasted into any graphviz-compatible viewer for debugging (e.g. https://graph.flyte.org)

@codetheweb codetheweb force-pushed the feat-sysdb-batch-get-version-file-paths-method branch 2 times, most recently from 0d136b9 to 6ad39d0 Compare May 13, 2025 17:35
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from 2d8c1b7 to fafe998 Compare May 13, 2025 17:35
Copy link
Contributor

propel-code-bot bot commented May 13, 2025

Add Orchestrator for Constructing Collection Version Graph for GC

This PR introduces a new orchestrator subsystem in the Rust garbage collector service that constructs a version graph representing versions and dependencies across all collections in a collection fork tree. This graph structure enables more robust and accurate garbage collection by tracking version histories and lineage dependencies, leveraging Petgraph for graph representation and the new fetch_version_file, fetch_lineage_file, and get_version_file_paths operators. The orchestrator coordinates asynchronous tasks to fetch version and lineage files from storage and sysdb, handling errors, missing data, and generating a directed acyclic graph suitable for GC analysis, with thorough integration and unit tests for various graph topologies.

Key Changes:
• Added construct_version_graph_orchestrator.rs implementing the core orchestrator to assemble a version graph using Petgraph.
• Introduced new operators: fetch_lineage_file, fetch_version_file (refactored for structured output), and get_version_file_paths for modular file retrieval and dependency extraction.
• Refactored FetchVersionFileOperator/Output and usages to parse/return decoded protobuf rather than opaque bytes, and improved error reporting (including strict UUID and proto decode validation).
• Added base64 and petgraph crates as dependencies, updated Cargo.toml and Cargo.lock accordingly.
• Refactored associated tests and added comprehensive coverage for version graph construction including scenarios with/without lineage files.
• Updated storage and orchestrator module interfaces for clarity and better debugging (e.g. implementing Debug for storage, improved error enums).

Affected Areas:
• rust/garbage_collector/src/construct_version_graph_orchestrator.rs (new)
• rust/garbage_collector/src/operators/fetch_version_file.rs
• rust/garbage_collector/src/operators/fetch_lineage_file.rs (new)
• rust/garbage_collector/src/operators/get_version_file_paths.rs (new)
• rust/garbage_collector/src/garbage_collector_orchestrator.rs (fetch usage)
• rust/garbage_collector/src/operators/mod.rs
• rust/storage/src/lib.rs (Debug impl)
• Cargo.toml
• Cargo.lock
• rust/garbage_collector/Cargo.toml

Potential Impact:

Functionality: Introduces a new capability to gather and represent complete collection/version dependency state for improved garbage collection logic; affects how future GC tasks understand which data is safe to delete.

Performance: Performs multiple storage and sysdb I/O tasks; may increase job execution time for complex collection trees, but uses async patterns.

Security: No new network or privilege escalation risks introduced, but ensures UUID validation and error propagation for malformed/missing lineage/version files.

Scalability: Designed to build graphs for fork trees of arbitrary size; Petgraph usage should scale acceptably unless the collection graph is extremely large.

Review Focus:
• Correctness and completeness of the constructed version graph, especially edge/node mapping and status handling.
• Robustness of async task orchestration and error propagation within the orchestrator, including early termination and channel handling.
• Test coverage adequacy for corner cases, e.g., missing lineage/version files, invalid data.
• Potential for excessive I/O calls in dense version/lineage graphs.

Testing Needed

• Run new and existing orchestrator and operator tests (cargo test in garbage_collector).
• Test with collections (in local/dev setups) with and without lineage files, and with dependencies across multiple collections.
• Verify error-handling paths by removing/mangling files or providing invalid UUIDs.
• Check integration with existing GC jobs to confirm no breakage.

Code Quality Assessment

rust/garbage_collector/src/operators/fetch_version_file.rs: Refactored for clarity and stronger typing; structured error handling and output.

rust/garbage_collector/src/operators/fetch_lineage_file.rs: Clean, idiomatic, and easy to follow.

rust/garbage_collector/src/operators/get_version_file_paths.rs: Straightforward, adheres to async/trait design.

Cargo. and TOML files*: Updated dependencies accurately.

rust/garbage_collector/src/construct_version_graph_orchestrator.rs: Well-structured, modular, with clearly documented error handling and tracing, but reviewer feedback suggests possible node reuse optimizations and comments needed for initial version semantics.

Best Practices

Modularity:
• Operators and orchestrator logic separated cleanly
• Error enums for each module

Error Handling:
• Comprehensive error types, proper context propagation, UUID and decode checks

Testing:
• Rich unit/integration test coverage for orchestrator and operators

Async Patterns:
• Follows idiomatic Rust async/await and component abstractions

Potential Issues

• Possible redundant node creations in Petgraph (see review), may impact very large version/lineage sets.
• Initial version number correctness should be commented or double-checked for all backends.
• Channel/async task management could mask subtle errors if not carefully tested with cancellation/aborted runs.
• Potential for slow performance if many remote file requests are triggered for deep/wide fork trees.

This summary was automatically generated by @propel-code-bot

@codetheweb codetheweb force-pushed the feat-sysdb-batch-get-version-file-paths-method branch from 6ad39d0 to 93be17d Compare May 15, 2025 16:52
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from fafe998 to ef973c9 Compare May 15, 2025 16:52
@codetheweb codetheweb force-pushed the feat-sysdb-batch-get-version-file-paths-method branch from 93be17d to ea366a5 Compare May 15, 2025 20:11
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from ef973c9 to bdc98f5 Compare May 15, 2025 20:11
@codetheweb codetheweb force-pushed the feat-sysdb-batch-get-version-file-paths-method branch from ea366a5 to edf6743 Compare May 15, 2025 22:38
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from bdc98f5 to 8137f4d Compare May 15, 2025 22:38
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from 8137f4d to 14eb5e7 Compare May 15, 2025 22:43
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from 14eb5e7 to 721349a Compare May 15, 2025 23:26
@codetheweb codetheweb force-pushed the feat-sysdb-batch-get-version-file-paths-method branch from edf6743 to 85adb8c Compare May 15, 2025 23:34
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from 721349a to 8f8a50b Compare May 15, 2025 23:34
@codetheweb codetheweb force-pushed the feat-sysdb-batch-get-version-file-paths-method branch from 85adb8c to 372edaf Compare May 15, 2025 23:36
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from 8f8a50b to 45253b9 Compare May 15, 2025 23:36
@codetheweb codetheweb force-pushed the feat-sysdb-batch-get-version-file-paths-method branch from 372edaf to 505c4ae Compare May 15, 2025 23:37
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from 45253b9 to 64df4b0 Compare May 15, 2025 23:38
@codetheweb codetheweb force-pushed the feat-sysdb-batch-get-version-file-paths-method branch from 505c4ae to 3ece42b Compare May 16, 2025 16:41
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from 64df4b0 to 6e6f97f Compare May 16, 2025 16:41
@codetheweb codetheweb force-pushed the feat-sysdb-batch-get-version-file-paths-method branch from 3ece42b to 5801be6 Compare May 19, 2025 23:38
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from 6e6f97f to d62f117 Compare May 19, 2025 23:38
@codetheweb codetheweb force-pushed the feat-sysdb-batch-get-version-file-paths-method branch from 5801be6 to 90ac80a Compare May 19, 2025 23:45
@codetheweb codetheweb force-pushed the feat-gc-construct-version-graph branch from d62f117 to cb3e44a Compare May 19, 2025 23:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants