@lifflander
Contributor

Fixes #5

@lifflander lifflander linked an issue Dec 1, 2025 that may be closed by this pull request
2 tasks
@lifflander lifflander force-pushed the 5-implement-work-algorithm-in-lb branch from 558ee7d to 48ce3bb Compare December 2, 2025 19:58
@lifflander lifflander requested a review from Copilot December 3, 2025 21:14
@lifflander lifflander force-pushed the 5-implement-work-algorithm-in-lb branch from 3b22d83 to adb691c Compare December 3, 2025 21:15
Copilot finished reviewing on behalf of lifflander December 3, 2025 21:16
Copilot AI left a comment

Pull request overview

This PR adds work formula computation and task-cluster summary infrastructure to the load balancer, enabling more sophisticated load balancing decisions that account for computation, communication, and memory costs.

Key Changes

  • Introduces a configurable work model with coefficients for compute load, inter-node communication, intra-node communication, and shared-memory communication
  • Adds cluster summarization functionality that aggregates task and communication information at the cluster level
  • Implements memory usage tracking and validation with breakdown by footprint, working memory, and serialized memory
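As a rough illustration of the first and third bullets, the sketch below combines the four cost terms as a weighted sum and checks a memory limit. The linear form, the coefficient names, the struct names, and the rule that the three memory components are simply added together are all assumptions made for illustration; this overview does not show the actual `WorkModel` or `MemoryBreakdown` definitions from the PR.

```cpp
// Hypothetical sketch: every name here is illustrative, not the PR's API.

// Coefficients of an assumed linear work formula.
struct WorkModelSketch {
  double alpha;  // weight on compute load
  double beta;   // weight on inter-node communication
  double gamma;  // weight on intra-node communication
  double delta;  // weight on shared-memory communication
};

// Per-rank cost inputs (units are whatever the caller uses consistently).
struct RankCostsSketch {
  double compute_load;
  double inter_node_comm;
  double intra_node_comm;
  double shared_mem_comm;
};

// Assumed memory components, mirroring the breakdown named in the summary.
struct MemoryBreakdownSketch {
  double footprint_bytes;
  double working_bytes;
  double serialized_bytes;
};

// Weighted sum of the four cost terms (assumed form of the work formula).
double computeWork(const WorkModelSketch& m, const RankCostsSketch& c) {
  return m.alpha * c.compute_load
       + m.beta * c.inter_node_comm
       + m.gamma * c.intra_node_comm
       + m.delta * c.shared_mem_comm;
}

// Assumed validation rule: the summed components must not exceed the limit.
bool fitsInMemory(const MemoryBreakdownSketch& mem, double max_available_bytes) {
  double total = mem.footprint_bytes + mem.working_bytes + mem.serialized_bytes;
  return total <= max_available_bytes;
}
```

Setting a coefficient to zero drops that term, so under this assumed form a load-only model is just `alpha = 1` with the other three weights at zero.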

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/vt-lb/algo/temperedlb/work_model.h | Defines WorkModel, WorkBreakdown, MemoryBreakdown structures and WorkModelCalculator for computing work from task/communication data |
| src/vt-lb/algo/temperedlb/work_model.cc | Implements work calculation logic including incremental updates and memory constraint checking |
| src/vt-lb/algo/temperedlb/configuration.h | Extracts Configuration struct from temperedlb.h with memory info accessors |
| src/vt-lb/algo/temperedlb/cluster_summarizer.h | Defines TaskClusterSummaryInfo and ClusterSummarizer for aggregating cluster-level statistics |
| src/vt-lb/algo/temperedlb/cluster_summarizer.cc | Implements cluster summarization including load, communication, and memory metrics |
| src/vt-lb/algo/temperedlb/temperedlb.h | Refactors TemperedLB to use new work model, adds trial-based execution and statistics computation |
| src/vt-lb/algo/temperedlb/visualize.h | Extends visualization to track and display intra-cluster communication bytes |
| src/vt-lb/algo/temperedlb/clustering.h | Adds utility function to verify all tasks are clustered and wraps code in namespace |
| src/vt-lb/model/PhaseData.h | Adds rank-level memory tracking fields (footprint bytes and max memory available) |
| src/vt-lb/model/Task.h | Reformats indentation for consistency |
| src/vt-lb/algo/baselb/baselb.h | Adds save/restore phase data methods for trial-based execution |
| src/vt-lb/comm/vt/proxy_wrapper.impl.h | Comments out debug printf statements |
| src/vt-lb/algo/temperedlb/symmetrize_comm.h | Adds blank line after header comment block |
| examples/test_example.cc | Removes polling loop from example code |
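
The table mentions trial-based execution in temperedlb.h and save/restore phase data methods in baselb.h. One plausible way these fit together is sketched below: each trial starts from a restored copy of the saved state, and the best-scoring result is kept. The function name `runTrials`, the callback shape, and the "lower score is better" convention are assumptions for illustration only, not the PR's actual interface.

```cpp
#include <limits>

// Hypothetical sketch of trial-based execution with save/restore semantics.
// State stands in for the phase data; Trial is any callable of the form
//   double run_trial(State& working, int trial_index)
// that mutates the working state and returns a score (lower is better).
template <typename State, typename Trial>
State runTrials(const State& initial, int num_trials, Trial run_trial) {
  State best = initial;  // returned unchanged if no trial runs
  double best_score = std::numeric_limits<double>::infinity();

  for (int trial = 0; trial < num_trials; ++trial) {
    // "Restore": every trial begins from a fresh copy of the saved state.
    State working = initial;
    double score = run_trial(working, trial);

    // Keep the result of the best trial seen so far (first one wins ties).
    if (score < best_score) {
      best_score = score;
      best = working;
    }
  }
  return best;
}
```

Copying the saved state at the start of each trial keeps trials independent of one another, at the cost of one copy per trial.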

@lifflander lifflander marked this pull request as ready for review December 3, 2025 22:02
@lifflander lifflander requested a review from nlslatt December 3, 2025 22:10
@lifflander lifflander force-pushed the 5-implement-work-algorithm-in-lb branch from 74b73da to fe84f36 Compare December 3, 2025 22:13
@lifflander lifflander requested a review from Copilot December 3, 2025 22:18
Copilot finished reviewing on behalf of lifflander December 3, 2025 22:22
Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 15 changed files in this pull request and generated 7 comments.


@lifflander lifflander merged commit a512ffb into master Dec 3, 2025
5 checks passed
Development

Successfully merging this pull request may close these issues.

Implement work algorithm in LB

2 participants