Add Concurrent Workflow Execution & Enhanced Scheduler #3
Conversation
Pull Request Overview
This PR introduces concurrent workflow execution capabilities to Cano, enabling parallel processing of multiple workflow instances with configurable timeout strategies. The changes provide significant performance improvements for I/O-bound operations while maintaining full backward compatibility.
- New `ConcurrentWorkflow` API with four timeout strategies (`WaitForever`, `WaitForQuota`, `WaitDuration`, `WaitQuotaOrDuration`)
- Enhanced scheduler with concurrent workflow scheduling methods and status tracking
- Comprehensive examples and benchmarks demonstrating concurrent processing benefits
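To make the four strategies concrete, here is a minimal sketch of their semantics using only `std` threads and channels. This is an illustration of the described behavior, not Cano's actual API: the enum name matches the PR, but `run_with_strategy` and its signature are hypothetical stand-ins.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// Illustrative strategy enum mirroring the semantics described in the PR.
enum ConcurrentStrategy {
    WaitForever,
    WaitForQuota(usize),
    WaitDuration(Duration),
    WaitQuotaOrDuration(usize, Duration),
}

/// Run all jobs in parallel and collect results according to the strategy.
fn run_with_strategy(
    jobs: Vec<Box<dyn FnOnce() -> u64 + Send>>,
    strategy: ConcurrentStrategy,
) -> Vec<u64> {
    let (tx, rx) = mpsc::channel();
    for job in jobs {
        let tx = tx.clone();
        thread::spawn(move || {
            let _ = tx.send(job());
        });
    }
    drop(tx); // so the receiver disconnects once all jobs finish

    let mut results = Vec::new();
    match strategy {
        // Wait for every job to finish.
        ConcurrentStrategy::WaitForever => results.extend(rx.iter()),
        // Stop after `quota` completions; stragglers are abandoned.
        ConcurrentStrategy::WaitForQuota(quota) => {
            results.extend(rx.iter().take(quota));
        }
        // Collect whatever completes before the deadline.
        ConcurrentStrategy::WaitDuration(limit) => {
            let deadline = Instant::now() + limit;
            while let Some(left) = deadline.checked_duration_since(Instant::now()) {
                match rx.recv_timeout(left) {
                    Ok(v) => results.push(v),
                    Err(_) => break, // timed out or all jobs done
                }
            }
        }
        // Whichever comes first: quota reached or deadline passed.
        ConcurrentStrategy::WaitQuotaOrDuration(quota, limit) => {
            let deadline = Instant::now() + limit;
            while results.len() < quota {
                let Some(left) = deadline.checked_duration_since(Instant::now()) else {
                    break;
                };
                match rx.recv_timeout(left) {
                    Ok(v) => results.push(v),
                    Err(_) => break,
                }
            }
        }
    }
    results
}

fn main() {
    let mut jobs: Vec<Box<dyn FnOnce() -> u64 + Send>> = Vec::new();
    for i in 0..4u64 {
        jobs.push(Box::new(move || i * 10));
    }
    let done = run_with_strategy(jobs, ConcurrentStrategy::WaitForQuota(2));
    assert_eq!(done.len(), 2); // exactly two completions collected
}
```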
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/workflow.rs | Core concurrent workflow implementation with strategy patterns and builder API |
| src/scheduler.rs | Extended scheduler with concurrent workflow support and enhanced status reporting |
| src/lib.rs | Updated exports to include new concurrent workflow types |
| examples/workflow_concurrent_book_prepositions.rs | Real-world concurrent book analysis example demonstrating parallel text processing |
| examples/concurrent_workflow_example.rs | Basic concurrent workflow usage examples with different strategies |
| benches/workflow_performance.rs | Comprehensive concurrent vs sequential performance benchmarks |
| README.md | Updated documentation with concurrent workflow examples and usage patterns |
| Cargo.toml | Added futures dependency and new example configurations |
Comments suppressed due to low confidence (1)
`benches/workflow_performance.rs:122`

[nitpick] The benchmark includes a special case for node_count == 0, but this edge case might not provide meaningful performance insights. Consider whether benchmarking empty workflows adds value or if it should be excluded from performance tests.

```rust
if node_count == 0 {
```
```rust
let handle = tokio::spawn(async move {
    let task_start = std::time::Instant::now();
    let result = instance.workflow.orchestrate(&store_clone).await;
```
The workflow orchestrate method takes a reference to the store, but each concurrent instance gets a clone of the store. This means concurrent workflows don't share state between instances. Consider documenting this behavior clearly or providing an option for shared state if that's a desired use case.
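The clone-versus-shared distinction the reviewer raises can be shown with plain `std` types. This sketch uses a `HashMap` as a stand-in for Cano's store type (an assumption; the real store lives in the crate): cloning gives each task an independent copy, while wrapping in `Arc<Mutex<..>>` makes writes visible across tasks.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Clone-per-task: each thread mutates its own copy; the original is untouched.
    let store: HashMap<String, i32> = HashMap::new();
    let mut clone_handles = Vec::new();
    for i in 0..3 {
        let mut local = store.clone(); // independent copy, like store_clone in the PR
        clone_handles.push(thread::spawn(move || {
            local.insert(format!("key{i}"), i);
            local.len()
        }));
    }
    for h in clone_handles {
        assert_eq!(h.join().unwrap(), 1); // each copy saw only its own write
    }
    assert_eq!(store.len(), 0); // original store never changed

    // Shared state: Arc<Mutex<..>> lets all tasks see each other's writes.
    let shared = Arc::new(Mutex::new(HashMap::new()));
    let mut shared_handles = Vec::new();
    for i in 0..3 {
        let shared = Arc::clone(&shared);
        shared_handles.push(thread::spawn(move || {
            shared.lock().unwrap().insert(format!("key{i}"), i);
        }));
    }
    for h in shared_handles {
        h.join().unwrap();
    }
    assert_eq!(shared.lock().unwrap().len(), 3); // all writes visible
}
```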
```rust
            results
        }
        ConcurrentStrategy::WaitForQuota(quota) => {
            let quota = quota.min(handles.len()).max(1);
```
The quota clamping logic (.min(handles.len()).max(1)) is duplicated in the WaitQuotaOrDuration branch. Consider extracting this into a helper function to avoid code duplication.
Suggested change:

```diff
- let quota = quota.min(handles.len()).max(1);
+ let quota = Self::clamp_quota(quota, handles.len());
```
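The suggested helper is a one-liner; `clamp_quota` is the name proposed by the review, and the body below simply extracts the duplicated expression. Note the ordering matters: `min` before `max` means a zero quota is raised to 1, matching the original expression.

```rust
/// Clamp a requested quota to the number of spawned handles,
/// never letting it fall below 1 (helper name suggested by the review).
fn clamp_quota(quota: usize, handle_count: usize) -> usize {
    quota.min(handle_count).max(1)
}

fn main() {
    assert_eq!(clamp_quota(10, 3), 3); // can't wait for more results than tasks
    assert_eq!(clamp_quota(0, 3), 1);  // always wait for at least one
    assert_eq!(clamp_quota(2, 5), 2);  // in-range quota passes through
}
```

(`quota.clamp(1, handle_count)` would panic when `handle_count` is 0, which is why the `min`/`max` chain is kept as-is.)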
```rust
// Since ConcurrentWorkflow.orchestrate consumes self, we need to extract it from Arc
match Arc::try_unwrap(concurrent_workflow) {
    Ok(workflow) => {
        let cloned_store = store.clone();
        workflow.orchestrate(cloned_store).await.map(Some)
    }
    Err(_) => Err(CanoError::workflow(
        "Failed to execute concurrent workflow: still shared",
    )),
}
```
Using Arc::try_unwrap in the scheduler context is problematic because the Arc is likely to be shared. This will consistently fail and prevent concurrent workflows from executing in the scheduler. Consider redesigning the API to avoid consuming the ConcurrentWorkflow or find an alternative approach.
Suggested change:

```diff
-// Since ConcurrentWorkflow.orchestrate consumes self, we need to extract it from Arc
-match Arc::try_unwrap(concurrent_workflow) {
-    Ok(workflow) => {
-        let cloned_store = store.clone();
-        workflow.orchestrate(cloned_store).await.map(Some)
-    }
-    Err(_) => Err(CanoError::workflow(
-        "Failed to execute concurrent workflow: still shared",
-    )),
-}
+// Clone the ConcurrentWorkflow to avoid consuming the Arc
+let workflow = concurrent_workflow.clone();
+let cloned_store = store.clone();
+workflow.orchestrate(cloned_store).await.map(Some)
```
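The failure mode the reviewer describes is easy to reproduce in isolation: `Arc::try_unwrap` only succeeds when the strong count is exactly 1, so a scheduler that retains its own handle to the `Arc` will make it fail every time. A minimal demonstration with `std` only:

```rust
use std::sync::Arc;

fn main() {
    // With a single strong reference, try_unwrap succeeds and yields the value.
    let sole = Arc::new(String::from("workflow"));
    assert!(Arc::try_unwrap(sole).is_ok());

    // With any other live reference (as when a scheduler keeps its own handle),
    // try_unwrap fails and hands the Arc back unchanged.
    let shared = Arc::new(String::from("workflow"));
    let _second = Arc::clone(&shared);
    assert!(Arc::try_unwrap(shared).is_err());
}
```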
```rust
/// Type alias for the complex workflow data stored in the scheduler
type FlowData<TState, TStore, TParams> = (
    Arc<Workflow<TState, TStore, TParams>>,
    WorkflowType<TState, TStore, TParams>,
```
[nitpick] The FlowData type alias becomes less readable with the WorkflowType enum. Consider using a struct instead of a tuple to improve code clarity and make the fields self-documenting.
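A struct form of the suggestion might look like the sketch below. The stand-in `Workflow` and `WorkflowType` definitions and the field names `workflow`/`kind` are assumptions for illustration; the real types carry three generic parameters in `src/workflow.rs` and `src/scheduler.rs`.

```rust
use std::marker::PhantomData;
use std::sync::Arc;

// Stand-in types so the sketch compiles on its own.
struct Workflow<TState>(PhantomData<TState>);
enum WorkflowType<TState> {
    Sequential(PhantomData<TState>),
}

/// Named fields instead of a positional tuple, as the nitpick suggests:
/// `flow.workflow` and `flow.kind` read better than `flow.0` and `flow.1`.
struct FlowData<TState> {
    workflow: Arc<Workflow<TState>>,
    kind: WorkflowType<TState>,
}

fn main() {
    let flow = FlowData {
        workflow: Arc::new(Workflow::<u8>(PhantomData)),
        kind: WorkflowType::Sequential(PhantomData),
    };
    let _w = &flow.workflow; // self-documenting field access
    assert!(matches!(flow.kind, WorkflowType::Sequential(_)));
}
```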
This major feature release introduces concurrent workflow execution capabilities to Cano, enabling parallel processing of multiple workflow instances with configurable timeout strategies. This provides dramatic performance improvements for I/O-bound operations and batch processing scenarios.
✨ Key Features Added
🔄 Concurrent Workflows
- `ConcurrentWorkflow` API for executing multiple workflow instances in parallel
- `WaitForever` - Execute all workflows to completion
- `WaitForQuota(n)` - Complete a specific number, then cancel the rest
- `WaitDuration(duration)` - Execute within a time limit
- `WaitQuotaOrDuration` - Complete quota OR wait for duration, whichever comes first

⏰ Enhanced Scheduler
- `every_seconds_concurrent()`, `every_minutes_concurrent()`, `every_hours_concurrent()`
- `cron_concurrent()` for cron-based concurrent execution
- `manual_concurrent()` for manual concurrent triggers

📋 Changes Summary
New Files
Enhanced Files
🎯 Use Cases
📊 Performance Impact
🔧 Breaking Changes
📚 Documentation
This feature significantly expands Cano's capabilities for high-performance workflow orchestration while maintaining the simple, type-safe API that makes it easy to use.