Skip to content

Commit 82e47f0

Browse files
Refactor: Replace unbounded thread spawning with rayon parallel iterators (#300)
Several locations spawn unbounded threads during environment discovery, which can cause thread exhaustion on systems with many conda environments or symlinks. ## Changes - **pet-conda/lib.rs**: `get_conda_environments()` now uses `par_iter()` instead of spawning one thread per path - **pet-conda/environment_locations.rs**: `get_conda_environment_paths()` uses `par_iter()` for environment enumeration - **pet-homebrew/sym_links.rs**: `get_known_symlinks()` uses `par_iter()` for symlink resolution Added `rayon = "1.11.0"` to both crates. ## Before/After ```rust // Before: unbounded thread spawning let mut threads = vec![]; for path in paths { threads.push(thread::spawn(move || get_conda_environment_info(&path, &mgr))); } // join all... // After: bounded parallel iteration via rayon paths.par_iter() .filter_map(|path| get_conda_environment_info(path, manager)) .collect() ``` `crates/pet/src/find.rs` left unchanged—it uses `thread::scope` with a fixed number of locators/directories, which is already bounded. <!-- START COPILOT ORIGINAL PROMPT --> <details> <summary>Original prompt</summary> > > ---- > > *This section details on the original issue you should resolve* > > <issue_title>Refactor: Consider using thread pool instead of unbounded thread spawning</issue_title> > <issue_description>## Summary > Several locations spawn an unbounded number of threads during environment discovery. On systems with many Python environments (e.g., many conda envs or large PATH), this could lead to thread exhaustion or excessive context switching. > > ## Affected Locations > > ### 1. `crates/pet-conda/src/lib.rs` (lines 368-378) > ```rust > fn get_conda_environments(paths: &Vec<PathBuf>, manager: &Option<CondaManager>) -> Vec<CondaEnvironment> { > let mut threads = vec![]; > for path in paths { > let path = path.clone(); > let mgr = manager.clone(); > threads.push(thread::spawn(move || { > // ... > })); > } > // ... > } > ``` > > ### 2. `crates/pet-homebrew/src/sym_links.rs` (lines 33-55) > ```rust > let threads = symlinks > .iter() > .map(|symlink| { > std::thread::spawn(move || { > // ... > }) > }) > .collect::<Vec<_>>(); > ``` > > ### 3. `crates/pet/src/find.rs` > Multiple `thread::scope` with spawns for each path/locator. > > ## Proposed Solutions > > ### Option 1: Use `rayon` for parallel iteration > ```rust > use rayon::prelude::*; > > fn get_conda_environments(paths: &[PathBuf], manager: &Option<CondaManager>) -> Vec<CondaEnvironment> { > paths.par_iter() > .filter_map(|path| get_conda_environment_info(path, manager)) > .collect() > } > ``` > > ### Option 2: Use bounded thread pool > ```rust > use std::sync::mpsc; > use threadpool::ThreadPool; > > let pool = ThreadPool::new(num_cpus::get()); > for path in paths { > pool.execute(move || { > // ... > }); > } > pool.join(); > ``` > > ### Option 3: Use `thread::scope` with chunking > ```rust > thread::scope(|s| { > for chunk in paths.chunks(num_cpus::get()) { > s.spawn(|| { > for path in chunk { > // process path > } > }); > } > }); > ``` > > ## Benefits > - Controlled parallelism based on CPU count > - Better resource management > - Avoid thread exhaustion on systems with hundreds of environments > - `rayon` provides work-stealing for better load balancing > > ## Considerations > - `rayon` adds a dependency but is widely used and well-maintained > - `thread::scope` (used in many places already) is good for structured concurrency > - The impact depends on typical environment counts > > ## Priority > Low - Current implementation works but could cause issues at scale.</issue_description> > > <agent_instructions>Fix the issue. See if the recommendation makes sense and create a fix.</agent_instructions> > > ## Comments on the Issue (you are @copilot in this section) > > <comments> > </comments> > </details> <!-- START COPILOT CODING AGENT SUFFIX --> - Fixes #288 <!-- START COPILOT CODING AGENT TIPS --> --- 💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more [Copilot coding agent tips](https://gh.io/copilot-coding-agent-tips) in the docs. --------- Co-authored-by: copilot-swe-agent[bot] <[email protected]> Co-authored-by: karthiknadig <[email protected]>
1 parent 72a6b2e commit 82e47f0

File tree

6 files changed

+85
-57
lines changed

6 files changed

+85
-57
lines changed

Cargo.lock

Lines changed: 53 additions & 0 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

crates/pet-conda/Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,7 @@ regex = "1.10.4"
1919
pet-reporter = { path = "../pet-reporter" }
2020
env_logger = "0.10.2"
2121
yaml-rust2 = "0.8.1"
22+
rayon = "1.11.0"
2223

2324
[features]
2425
ci = []

crates/pet-conda/src/environment_locations.rs

Lines changed: 6 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@ use crate::{
99
use log::trace;
1010
use pet_fs::path::{expand_path, norm_case};
1111
use pet_python_utils::platform_dirs::Platformdirs;
12+
use rayon::prelude::*;
1213
use std::{
1314
env, fs,
1415
path::{Path, PathBuf},
@@ -43,19 +44,11 @@ pub fn get_conda_environment_paths(
4344
env_paths.dedup();
4445
// For each env, check if we have a conda install directory in them and
4546
// & then iterate through the list of envs in the envs directory.
46-
// let env_paths = vec![];
47-
let mut threads = vec![];
48-
for path in env_paths.iter().filter(|f| f.exists()) {
49-
let path = path.clone();
50-
threads.push(thread::spawn(move || get_environments(&path)));
51-
}
52-
53-
let mut result = vec![];
54-
for thread in threads {
55-
if let Ok(envs) = thread.join() {
56-
result.extend(envs);
57-
}
58-
}
47+
let mut result: Vec<PathBuf> = env_paths
48+
.par_iter()
49+
.filter(|f| f.exists())
50+
.flat_map(|path| get_environments(path))
51+
.collect();
5952

6053
result.sort();
6154
result.dedup();

crates/pet-conda/src/lib.rs

Lines changed: 5 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,7 @@ use pet_core::{
1818
Locator, LocatorKind,
1919
};
2020
use pet_fs::path::norm_case;
21+
use rayon::prelude::*;
2122
use serde::{Deserialize, Serialize};
2223
use std::{
2324
collections::HashMap,
@@ -380,24 +381,8 @@ fn get_conda_environments(
380381
paths: &Vec<PathBuf>,
381382
manager: &Option<CondaManager>,
382383
) -> Vec<CondaEnvironment> {
383-
let mut threads = vec![];
384-
for path in paths {
385-
let path = path.clone();
386-
let mgr = manager.clone();
387-
threads.push(thread::spawn(move || {
388-
if let Some(env) = get_conda_environment_info(&path, &mgr) {
389-
vec![env]
390-
} else {
391-
vec![]
392-
}
393-
}));
394-
}
395-
396-
let mut envs: Vec<CondaEnvironment> = vec![];
397-
for thread in threads {
398-
if let Ok(mut result) = thread.join() {
399-
envs.append(&mut result);
400-
}
401-
}
402-
envs
384+
paths
385+
.par_iter()
386+
.filter_map(|path| get_conda_environment_info(path, manager))
387+
.collect()
403388
}

crates/pet-homebrew/Cargo.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,3 +18,4 @@ lazy_static = "1.4.0"
1818
pet-core = { path = "../pet-core" }
1919
log = "0.4.21"
2020
regex = "1.10.4"
21+
rayon = "1.11.0"

crates/pet-homebrew/src/sym_links.rs

Lines changed: 19 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44
use lazy_static::lazy_static;
55
use pet_fs::path::resolve_symlink;
66
use pet_python_utils::executable::find_executables;
7+
use rayon::prelude::*;
78
use regex::Regex;
89
use std::{
910
fs,
@@ -30,32 +31,26 @@ pub fn get_known_symlinks(
3031
// Go through all the exes in all of the above bin directories and verify we have a list of all of them.
3132
// They too could be symlinks, e.g. we could have `/opt/homebrew/bin/python3` & also `/opt/homebrew/bin/python`
3233
// And possible they are all symlnks to the same exe.
33-
let threads = symlinks
34-
.iter()
35-
.map(|symlink| {
36-
let symlink = symlink.clone();
37-
let known_symlinks = symlinks.clone();
38-
std::thread::spawn(move || {
39-
if let Some(bin) = symlink.parent() {
40-
let mut symlinks = vec![];
41-
for possible_symlink in find_executables(bin) {
42-
if let Some(symlink) = resolve_symlink(&possible_symlink) {
43-
if known_symlinks.contains(&symlink) {
44-
symlinks.push(possible_symlink);
45-
}
34+
let known_symlinks = symlinks.clone();
35+
let other_symlinks: Vec<PathBuf> = symlinks
36+
.par_iter()
37+
.flat_map(|symlink| {
38+
if let Some(bin) = symlink.parent() {
39+
find_executables(bin)
40+
.into_iter()
41+
.filter(|possible_symlink| {
42+
if let Some(resolved) = resolve_symlink(possible_symlink) {
43+
known_symlinks.contains(&resolved)
44+
} else {
45+
false
4646
}
47-
}
48-
symlinks
49-
} else {
50-
vec![]
51-
}
52-
})
47+
})
48+
.collect::<Vec<_>>()
49+
} else {
50+
vec![]
51+
}
5352
})
54-
.collect::<Vec<_>>();
55-
let other_symlinks = threads
56-
.into_iter()
57-
.flat_map(|t| t.join().unwrap())
58-
.collect::<Vec<_>>();
53+
.collect();
5954
symlinks.extend(other_symlinks);
6055

6156
symlinks.sort();

0 commit comments

Comments
 (0)