Description
Problem
When the parent RLM fans out multiple independent queries via rlm_query_batched (e.g., "answer these 3 questions"), the child RLMs run sequentially: the second child waits for the first to fully complete before starting. For N subcalls each taking T seconds, total wall time is N*T.
This is wasteful because the children are independent: they have separate prompts, separate REPL environments, and make separate LLM API calls. The bottleneck is I/O-bound (waiting for API responses), making this an ideal candidate for thread-based parallelism.
Proposed solution
When there is more than one subcall, run the children in a ThreadPoolExecutor and collect the results in the original submission order. Use a semaphore for concurrency control, capping the maximum number of parallel threads at each level and per recursion depth.
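A minimal sketch of the idea. The names `rlm_query`, `_run_subcall`, and `MAX_PARALLEL_SUBCALLS` are placeholders, not the project's actual API; the real child call would be the I/O-bound LLM request.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

MAX_PARALLEL_SUBCALLS = 4  # assumed cap on parallel children per level

# Semaphore limits how many child RLMs run concurrently at this level.
_subcall_semaphore = threading.Semaphore(MAX_PARALLEL_SUBCALLS)

def rlm_query(prompt):
    # Placeholder for the real child-RLM call (blocking LLM API request).
    return f"answer to: {prompt}"

def _run_subcall(prompt):
    # Acquire a slot before starting the child; released when it finishes.
    with _subcall_semaphore:
        return rlm_query(prompt)

def rlm_query_batched(prompts):
    if len(prompts) <= 1:
        # Single subcall: skip threading overhead entirely.
        return [rlm_query(p) for p in prompts]
    # executor.map yields results in input order, regardless of which
    # child finishes first, so ordering is preserved for free.
    workers = min(len(prompts), MAX_PARALLEL_SUBCALLS)
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(_run_subcall, prompts))
```

Since the bottleneck is waiting on API responses rather than CPU work, threads (not processes) are sufficient, and wall time for N independent subcalls drops from roughly N*T toward T, bounded below by the concurrency cap.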