fix(pm): bound install pipeline task concurrency#2849

Closed
killagu wants to merge 1 commit into next from agent/egg-dev/8ceb9dc7

@killagu
Contributor

@killagu killagu commented Apr 27, 2026

Summary

  • Bound filesystem-heavy package clone/link work with a CPU-based parallel I/O limit.
  • Track and drain install/pipeline subtasks with JoinSet so pipeline completion reflects actual spawned work.
  • Add a small bounds check for the new concurrency helper.
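A minimal sketch of what a CPU-based parallel I/O limit with a bounds check could look like. The constants `MIN_PARALLEL_IO` and `MAX_PARALLEL_IO`, the 2x scaling factor, and the `parallel_io_limit_for` helper are assumptions made for illustration; the actual values and shape of the helper in this PR are not shown in the conversation.

```rust
use std::thread;

// Hypothetical bounds; the PR does not state the real values.
const MIN_PARALLEL_IO: usize = 2;
const MAX_PARALLEL_IO: usize = 64;

/// Derive an I/O concurrency limit from a CPU count, clamped to fixed bounds.
/// The 2x scaling factor is an assumption for this sketch.
fn parallel_io_limit_for(cpus: usize) -> usize {
    cpus.saturating_mul(2).clamp(MIN_PARALLEL_IO, MAX_PARALLEL_IO)
}

/// Limit for the current machine; falls back to 1 CPU if detection fails.
fn parallel_io_limit() -> usize {
    let cpus = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    parallel_io_limit_for(cpus)
}
```

Splitting the pure clamping function from the CPU probe keeps the bounds check testable on any machine, regardless of how many cores the runner has.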

Validation

  • cargo build --profile release-local -p utoo-pm
  • cargo test -p utoo-pm
  • cargo clippy -p utoo-pm --all-targets -- -D warnings --no-deps
  • Warm-link validation on ant-design temp checkout with populated cache: 4643 added, 3281 reused, 0 downloaded, wall 0.99s

Notes

  • A full bench/pm-bench-phases.sh run with ant-design Phase 0 was started with BENCH_RUNS=1 PM_LIST=utoo, but the cold install did not complete in a reasonable window on this runner. It was interrupted after it had already materialized thousands of node_modules/cache entries.
  • Full-workspace cargo clippy --all-targets -- -D warnings --no-deps is blocked in this environment by unrelated openssl-sys requiring pkg-config/OpenSSL for workspace crates outside utoo-pm.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a mechanism to limit parallel I/O operations during package installation and pipeline processing by utilizing tokio::task::JoinSet and a new parallel_io_limit utility. This change aims to reduce scheduler churn and manage resource usage more effectively. A review comment identifies that the join_one helper in the pipeline worker ignores task results, potentially swallowing panics, and suggests propagating them to ensure unrecoverable bugs are properly surfaced.
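The bounding-and-draining pattern described above can be illustrated with a standard-library analogue. This sketch uses OS threads and a queue of `JoinHandle`s instead of `tokio::task::JoinSet`, so it is not the PR's code. The function name `run_bounded` is hypothetical. One difference to note: it waits on the oldest worker, whereas `JoinSet::join_next` returns whichever task finishes first.

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Run `jobs` units of work with at most `limit` threads in flight.
/// Returns the peak number of workers observed running at once.
fn run_bounded(jobs: usize, limit: usize) -> usize {
    let limit = limit.max(1);
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));
    let mut in_flight: VecDeque<thread::JoinHandle<()>> = VecDeque::new();

    for _ in 0..jobs {
        // At the limit: wait for the oldest worker before spawning another.
        if in_flight.len() >= limit {
            if let Some(handle) = in_flight.pop_front() {
                handle.join().expect("worker panicked");
            }
        }
        let (running, peak) = (Arc::clone(&running), Arc::clone(&peak));
        in_flight.push_back(thread::spawn(move || {
            let now = running.fetch_add(1, Ordering::SeqCst) + 1;
            peak.fetch_max(now, Ordering::SeqCst);
            thread::yield_now(); // stand-in for clone/link I/O
            running.fetch_sub(1, Ordering::SeqCst);
        }));
    }

    // Drain: completion is reported only after every spawned worker finishes.
    for handle in in_flight {
        handle.join().expect("worker panicked");
    }
    peak.load(Ordering::SeqCst)
}
```

Because a new worker is spawned only while fewer than `limit` handles are outstanding, the observed peak never exceeds the limit, and the final drain loop is what makes "pipeline finished" mean "all spawned work finished".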

Comment on lines +8 to +10
async fn join_one(tasks: &mut tokio::task::JoinSet<()>) {
    let _ = tasks.join_next().await;
}

medium

The join_one helper currently ignores the result of join_next(), which means any panics occurring in the pipeline subtasks will be silently swallowed. According to the general rules, panics should be treated as unrecoverable bugs and not be ignored. It is recommended to check if the task panicked and propagate it using resume_unwind to ensure that bugs in the pipeline workers are surfaced.

async fn join_one(tasks: &mut tokio::task::JoinSet<()>) {
    if let Some(Err(e)) = tasks.join_next().await {
        if e.is_panic() {
            std::panic::resume_unwind(e.into_panic());
        }
    }
}
References
  1. Do not implement recovery logic for panics. Panics should be treated as unrecoverable bugs that need to be fixed, not as transient, recoverable errors.
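To show why re-raising matters, here is a standard-library analogue of the suggestion, using `std::thread::JoinHandle` rather than tokio's `JoinSet`. The helper names `drain_all` and `panic_message` are hypothetical. `JoinHandle::join` returns the panic payload as an `Err`, and `std::panic::resume_unwind` re-raises it on the caller so the failure is not silently dropped.

```rust
use std::any::Any;
use std::panic;
use std::thread;

/// Join every handle; if a worker panicked, re-raise its payload on the caller
/// instead of discarding the join result.
fn drain_all(handles: Vec<thread::JoinHandle<()>>) {
    for handle in handles {
        if let Err(payload) = handle.join() {
            panic::resume_unwind(payload);
        }
    }
}

/// Extract a readable message from a panic payload, if it is a string type.
fn panic_message(payload: &(dyn Any + Send)) -> Option<String> {
    if let Some(s) = payload.downcast_ref::<&str>() {
        Some(s.to_string())
    } else {
        payload.downcast_ref::<String>().cloned()
    }
}
```

With `let _ = handle.join();` in place of the `if let Err(..)` branch, a panicking worker would leave no trace beyond the default panic hook's stderr output, and the caller would proceed as if the work had succeeded.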

@elrrrrrrr
Contributor

Closing as stale: this draft is a one-off agent experiment from 2026-04-27 with no follow-up, and overlaps with sibling PRs exploring the same optimization. Reopen if revisited.

@elrrrrrrr elrrrrrrr closed this May 25, 2026