[ENH]: add orchestrator to construct version graph for garbage collection #4463
if tracing::level_enabled!(tracing::Level::DEBUG) {
    let dot_viz = Dot::with_config(&self.graph, &[]);
    let encoded = BASE64_STANDARD.encode(format!("{:?}", dot_viz));
    tracing::debug!(base64_encoded_dot_graph = ?encoded, "Constructed graph.");
logged graph repr can be pasted into any graphviz-compatible viewer for debugging (e.g. https://graph.flyte.org)
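To illustrate the round trip described in this comment, here is a minimal std-only sketch: it renders a tiny edge list as DOT, base64-encodes it the way the debug log does, and decodes it back into text that a Graphviz viewer accepts. The helpers `to_dot`, `base64_encode`, and `base64_decode` are hypothetical stand-ins written for this example; the PR itself uses petgraph's `Dot` and the `base64` crate's `BASE64_STANDARD` engine. Note also that a field logged with `?` is Debug-formatted, so the value may appear wrapped in quotes that need stripping before decoding.

```rust
/// Standard base64 alphabet (RFC 4648), used with `=` padding.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Hypothetical helper: render a list of directed edges as a DOT digraph.
fn to_dot(edges: &[(&str, &str)]) -> String {
    let mut dot = String::from("digraph {\n");
    for (from, to) in edges {
        dot.push_str(&format!("    \"{}\" -> \"{}\"\n", from, to));
    }
    dot.push_str("}\n");
    dot
}

/// Hypothetical helper: encode bytes as standard padded base64.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        out.push(if chunk.len() > 1 { ALPHABET[(n >> 6) as usize & 63] as char } else { '=' });
        out.push(if chunk.len() > 2 { ALPHABET[n as usize & 63] as char } else { '=' });
    }
    out
}

/// Hypothetical helper: decode standard base64, stopping at the first `=`.
/// Returns `None` if a character outside the alphabet is found.
fn base64_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buf: u32 = 0;
    let mut bits: u32 = 0;
    for c in input.bytes() {
        if c == b'=' {
            break;
        }
        let v = ALPHABET.iter().position(|&a| a == c)? as u32;
        buf = (buf << 6) | v;
        bits += 6;
        if bits >= 8 {
            // A full byte is available: emit it and keep the leftover bits.
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1;
        }
    }
    Some(out)
}
```

Calling `base64_decode` on the logged value and converting the bytes with `String::from_utf8` should give back the DOT text, which can then be pasted into the viewer.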
Add Orchestrator for Building Collection Version Graphs for GC

This PR introduces a new orchestrator (ConstructVersionGraphOrchestrator) for constructing a version dependency graph across all collections in a fork tree, to support garbage collection workflows. Supporting Rust operators to fetch lineage files, fetch version files, and batch-fetch version file paths are added and integrated, alongside updates to relevant orchestrator logic and supporting changes to the garbage collector pipeline, storage API, and dependencies. Extensive test coverage for the graph construction is included, validating both simple and complex collection/version lineage cases.

Potential Impact:
Functionality: Enables garbage collection logic to operate on entire version/fork trees rather than a single collection. Improves ability to trace dependencies and perform accurate collection/variant cleanup.
Performance: Slight increase in orchestrator complexity; batched fetching of version files may help performance. Additional in-memory graph processing is limited by number of collections in a fork.
Security: No new security risks introduced; new code inherits data access/authorization from existing storage and sysdb layers.
Scalability: Graph-based approach scales to arbitrary fork trees; performance may need tuning for very large trees but core approach is scalable.

Testing Needed:
• Run all

Code Quality Assessment:
rust/garbage_collector/src/operators/get_version_file_paths.rs: Simple, direct batch fetching and error propagation.
rust/garbage_collector/src/garbage_collector_orchestrator.rs: Updated to match new FetchVersionFileOutput interface; residual commented-out code was removed.
rust/storage/src/lib.rs: Added Debug implementation for Storage; otherwise minimal change.
Cargo files: Dependency updates are precise and necessary for new features.
rust/garbage_collector/src/construct_version_graph_orchestrator.rs: Well-structured; uses async patterns, clear error enums, and trait-based orchestrator integration. Large, so future decomposition may help maintainability.
rust/garbage_collector/src/operators/fetch_version_file.rs: Refactored for new output types; improved error reporting. Debug implementations and API patterns follow Rust conventions.
rust/garbage_collector/src/operators/fetch_lineage_file.rs: Clean, idiomatic, covers code and decode paths. Good use of error enums.

Potential Issues:
• If collection or version lineage is partially missing, logic may terminate with error or skip nodes; correctness under partial data should be monitored.

This summary was automatically generated by @propel-code-bot
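As a rough illustration of what a version dependency graph over a fork tree could look like, the sketch below models each (collection, version) pair as a node, links consecutive versions within a collection, and adds one edge from the parent version to the first version of each forked collection. Everything here (`VersionNode`, `VersionGraph`, `add_collection`, `add_fork`, string collection IDs) is an assumption made for the example and not the PR's actual types, which build on petgraph and the real lineage and version files. The `add_fork` return value shows one way to surface the partially missing lineage case flagged under Potential Issues.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical node type: one version of one collection.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct VersionNode {
    collection_id: String,
    version: u64,
}

/// Hypothetical graph: adjacency sets keyed by node, ordered for stable output.
#[derive(Debug, Default)]
struct VersionGraph {
    edges: BTreeMap<VersionNode, BTreeSet<VersionNode>>,
}

impl VersionGraph {
    fn node(collection_id: &str, version: u64) -> VersionNode {
        VersionNode { collection_id: collection_id.to_string(), version }
    }

    /// Add every version of a collection and link each version to its successor.
    fn add_collection(&mut self, collection_id: &str, versions: &[u64]) {
        let mut sorted = versions.to_vec();
        sorted.sort_unstable();
        sorted.dedup();
        for v in &sorted {
            self.edges.entry(Self::node(collection_id, *v)).or_default();
        }
        for pair in sorted.windows(2) {
            self.edges
                .entry(Self::node(collection_id, pair[0]))
                .or_default()
                .insert(Self::node(collection_id, pair[1]));
        }
    }

    /// Link `parent` at `parent_version` to the earliest known version of `child`.
    /// Returns false (and adds nothing) if either endpoint is missing.
    fn add_fork(&mut self, parent: &str, parent_version: u64, child: &str) -> bool {
        let source = Self::node(parent, parent_version);
        // Keys are ordered by (collection_id, version), so the first match
        // is the child's lowest version.
        let target = self.edges.keys().find(|n| n.collection_id == child).cloned();
        match target {
            Some(target) if self.edges.contains_key(&source) => {
                self.edges.get_mut(&source).unwrap().insert(target);
                true
            }
            _ => false,
        }
    }

    fn node_count(&self) -> usize {
        self.edges.len()
    }

    fn edge_count(&self) -> usize {
        self.edges.values().map(|targets| targets.len()).sum()
    }
}
```

With a graph of this shape, a garbage collector can ask whether any node reachable from a version still needs its files before deleting them, which is the kind of cross-collection reasoning a single-collection view cannot support.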
let output = match self.ok_or_terminate(message.into_inner(), ctx).await {
    Some(output) => output,
    None => {
        return;
nit: should also tracing::error?
ok_or_terminate() will log any error
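The pattern behind this reply can be sketched as follows: the helper itself records the error and marks the orchestrator as finished, so the caller only needs the early return. This is a simplified, synchronous model under stated assumptions; `SketchOrchestrator`, its `logged` and `terminated` fields, and `handle` are hypothetical, and the real `ok_or_terminate` is async, takes a component context, and logs through `tracing` rather than into a vector.

```rust
use std::fmt::Display;

/// Hypothetical stand-in for an orchestrator; `logged` replaces real log output.
#[derive(Debug, Default)]
struct SketchOrchestrator {
    logged: Vec<String>,
    terminated: bool,
}

impl SketchOrchestrator {
    /// Unwrap a result, or log the error and mark the orchestrator terminated.
    fn ok_or_terminate<T, E: Display>(&mut self, result: Result<T, E>) -> Option<T> {
        match result {
            Ok(value) => Some(value),
            Err(err) => {
                // Logging happens here, so callers do not need to repeat it.
                self.logged.push(format!("error: {err}"));
                self.terminated = true;
                None
            }
        }
    }

    /// A handler shaped like the snippet under review: return early on `None`.
    fn handle(&mut self, message: Result<u32, String>) -> Option<u32> {
        let output = match self.ok_or_terminate(message) {
            Some(output) => output,
            None => {
                return None;
            }
        };
        Some(output * 2)
    }
}
```

Keeping the logging inside the helper means every handler reports failures the same way, and an extra `tracing::error!` at the call site would duplicate the message.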
Description of changes
Adds an orchestrator to construct the version graph for all collections in a fork tree to be used by garbage collection.
Test plan
How are these changes tested?
pytest for python, yarn test for js, cargo test for rust
Added tests for new orchestrator.
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?
n/a