### Summary
We need a robust way to handle cancellations (or updates) in the parallel builder, where `drain_new_orders()` returns `None`. Because sub-components (`ConflictFinder`, `ResultsAggregator`, etc.) store partial merges and simulation states, a single order removal is insufficient. Instead, we must fully reset them all, wiping stale references. However, the builder also spawns threads (worker pool, aggregator, block-building), which complicates concurrency.
### Goals
- Full Reset: When cancellations occur, we must clear:
  - `ConflictFinder`
  - `ConflictTaskGenerator`'s `existing_groups`/`task_queue`
  - `ConflictResolvingPool`'s `task_queue`
  - `SimulationCache`
  - `ResultsAggregator`'s `best_results`

  Then re-ingest only the remaining orders and let the system proceed as if they were new (e.g. find conflicts, resolve them, build blocks).
- Thread Safety: Ensure no data races or partial updates if aggregator/block-building threads hold references.
- Maintain High Performance: Avoid large overhead from locking or repeated resets.
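
To make the Full Reset goal concrete, here is a minimal single-threaded sketch, not the actual builder code. The `Resettable` trait, the `handle_drain` and `reset_and_reingest` methods, the toy `Order` type, and every field other than `existing_groups`, `task_queue`, and `best_results` are assumptions made up for illustration. It also assumes the caller can supply the set of orders that are still live when `drain_new_orders()` returns `None`.

```rust
use std::collections::{HashMap, VecDeque};

/// Toy order type (hypothetical; the real order struct will differ).
#[derive(Clone, Debug, PartialEq)]
struct Order {
    id: u64,
}

/// Hypothetical trait: one uniform reset hook for every stateful sub-component.
trait Resettable {
    fn reset(&mut self);
}

#[derive(Default)]
struct ConflictFinder {
    groups: HashMap<u64, Vec<u64>>, // stand-in for partial merges
}

impl ConflictFinder {
    fn add_order(&mut self, order: &Order) {
        // Toy grouping rule (bucket by id parity), only to create some state.
        self.groups.entry(order.id % 2).or_default().push(order.id);
    }
}

#[derive(Default)]
struct ConflictTaskGenerator {
    existing_groups: HashMap<u64, Vec<u64>>,
    task_queue: VecDeque<u64>,
}

#[derive(Default)]
struct ConflictResolvingPool {
    task_queue: VecDeque<u64>,
}

#[derive(Default)]
struct SimulationCache {
    entries: HashMap<u64, u64>,
}

#[derive(Default)]
struct ResultsAggregator {
    best_results: HashMap<u64, u64>,
}

impl Resettable for ConflictFinder {
    fn reset(&mut self) { self.groups.clear(); }
}
impl Resettable for ConflictTaskGenerator {
    fn reset(&mut self) { self.existing_groups.clear(); self.task_queue.clear(); }
}
impl Resettable for ConflictResolvingPool {
    fn reset(&mut self) { self.task_queue.clear(); }
}
impl Resettable for SimulationCache {
    fn reset(&mut self) { self.entries.clear(); }
}
impl Resettable for ResultsAggregator {
    fn reset(&mut self) { self.best_results.clear(); }
}

#[derive(Default)]
struct ParallelBuilder {
    orders: Vec<Order>,
    conflict_finder: ConflictFinder,
    task_generator: ConflictTaskGenerator,
    resolving_pool: ConflictResolvingPool,
    simulation_cache: SimulationCache,
    results_aggregator: ResultsAggregator,
}

impl ParallelBuilder {
    /// Incremental path: feed new orders into the existing state.
    fn ingest(&mut self, orders: Vec<Order>) {
        for order in &orders {
            self.conflict_finder.add_order(order);
        }
        self.orders.extend(orders);
    }

    /// Full reset: clear every component, then rebuild from the live orders.
    fn reset_and_reingest(&mut self, live_orders: Vec<Order>) {
        self.conflict_finder.reset();
        self.task_generator.reset();
        self.resolving_pool.reset();
        self.simulation_cache.reset();
        self.results_aggregator.reset();
        self.orders.clear();
        self.ingest(live_orders);
    }

    /// Assumed semantics: `Some` carries append-only orders, while `None`
    /// signals a cancellation or update that makes incremental state stale.
    fn handle_drain(&mut self, drained: Option<Vec<Order>>, live_orders: &[Order]) {
        match drained {
            Some(new_orders) => self.ingest(new_orders),
            None => self.reset_and_reingest(live_orders.to_vec()),
        }
    }
}
```

Keeping all clears in one method means a newly added sub-component only has to implement the trait and be listed once. This sketch ignores threads entirely; it only shows the reset ordering.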
### Core Considerations
- Reset Mechanism: All submodules (conflict finder, aggregator, simulation cache, etc.) need a consistent way to clear and re-ingest orders, leaving no partial or stale references.
- Concurrency Approach:
  - Single-Thread Manager: A central struct (`ParallelBuilder`) mutates data, with worker threads taking only short-lived snapshots. Simple to reset, but limits fully parallel reads.
  - Multi-Thread Shared State: Modules live behind locks (`Arc<Mutex<...>>`) or receive message-based commands. More flexible for continuous concurrency, but demands locking or event loops.
- Data Usage Pattern: Decide whether the aggregator/block-building threads hold references 24/7 (requiring concurrency control) or just use ephemeral tasks (easier single-thread reset).
- Performance vs. Simplicity: Locking or message-passing ensures safety for truly parallel modules, but increases complexity and overhead. A single-thread manager is simpler but less concurrent.
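
As one possible illustration of the single-thread manager option, the sketch below has workers hold only an immutable `Arc` snapshot and report back over a channel. The generation counter used to discard results computed before a reset is a technique added for this sketch, not something the issue prescribes. `Manager`, `SimResult`, `spawn_worker`, `accept`, and `run_example` are all hypothetical names, and orders are reduced to plain `u64` ids.

```rust
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// Hypothetical worker result, tagged with the generation it was computed in.
struct SimResult {
    generation: u64,
    order_id: u64,
    profit: u64,
}

/// Single owner of all mutable state; workers never touch it directly.
struct Manager {
    generation: u64,
    orders: Arc<Vec<u64>>, // immutable snapshot handed to workers
    best_results: Vec<(u64, u64)>,
}

impl Manager {
    fn new(orders: Vec<u64>) -> Self {
        Manager { generation: 0, orders: Arc::new(orders), best_results: Vec::new() }
    }

    /// Full reset: swap in a new snapshot, bump the generation, drop old results.
    /// Workers still holding the old `Arc` keep reading it safely.
    fn reset(&mut self, live_orders: Vec<u64>) {
        self.generation += 1;
        self.orders = Arc::new(live_orders);
        self.best_results.clear();
    }

    /// Spawn a worker that captures the current snapshot and generation.
    fn spawn_worker(&self, tx: mpsc::Sender<SimResult>) -> thread::JoinHandle<()> {
        let snapshot = Arc::clone(&self.orders);
        let generation = self.generation;
        thread::spawn(move || {
            for &order_id in snapshot.iter() {
                // Toy "simulation": profit is just a function of the id.
                let _ = tx.send(SimResult { generation, order_id, profit: order_id * 10 });
            }
        })
    }

    /// Accept a result only if it belongs to the current generation.
    fn accept(&mut self, result: SimResult) -> bool {
        if result.generation != self.generation {
            return false; // computed before the last reset: stale
        }
        self.best_results.push((result.order_id, result.profit));
        true
    }
}

fn run_example() -> Vec<(u64, u64)> {
    let (tx, rx) = mpsc::channel();
    let mut manager = Manager::new(vec![1, 2, 3]);

    // This worker finishes before the reset, so its results become stale.
    manager.spawn_worker(tx.clone()).join().unwrap();

    manager.reset(vec![1, 3]); // order 2 was cancelled
    manager.spawn_worker(tx.clone()).join().unwrap();

    drop(tx); // close the channel so the receive loop terminates
    for result in rx {
        manager.accept(result);
    }
    manager.best_results.sort();
    manager.best_results
}
```

In this toy run only the results for orders 1 and 3 from the post-reset generation survive. The trade-off matches the one described above: a reset is a cheap pointer swap with no locks, but workers cannot observe updates until they take a new snapshot.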
Please weigh in on the core considerations.