Fix: Improve Task Cleanup and Immediate Termination in Workflow Stop (#573)#581
Fix: Improve Task Cleanup and Immediate Termination in Workflow Stop (#573)#581saikiranAnnam wants to merge 1 commit intoESIPFed:mainfrom
Conversation
| tasksToRemove.add(thet); // Mark for removal | ||
| } | ||
| } | ||
| waitinglist.removeAll(tasksToRemove); // Remove all tasks at once after iteration |
There was a problem hiding this comment.
If we remove all that at once, what is the difference from the previous one exactly?
There was a problem hiding this comment.
In the previous version of code - Removing During Iteration.
revised code - Removal after Iteration.
Removing During Iteration vs. After Iteration:
Previous Version: tasks are removed from the waitinglist while iterating over it. This can cause problems with the iteration itself, especially if you remove items during the iteration process.
Issue with Direct Removal During Iteration: When you remove a task from the list, the iterator moves forward to the next element, but it skips checking the element that comes immediately after the removed task. This happens because after removal, the next element shifts into the current index position, and the iterator’s next() method skips over it. This can cause missing tasks that should have been processed.
Current Version: The updated code collects tasks to be removed in a separate List tasksToRemove. Once the iteration is complete, the tasks to be removed are all removed in a single operation with waitinglist.removeAll(tasksToRemove).
Advantages: By collecting tasks to be removed first and then performing the removal after the iteration, you avoid modifying the list during iteration, preventing skipping tasks and ensuring that all matching tasks are processed correctly.
| executeATask(newtask); | ||
| waitinglist.remove(newtask); // Remove task from waiting list | ||
| if (newtask.checkShouldPassOrNot()) { | ||
| newtask.endPrematurely(); // End the task immediately if it should not proceed |
There was a problem hiding this comment.
you removed the notifyWaitinglist function, please explain why? how does the queue know that this waiting list is changed for worker to pick up new one?
| * operation | ||
| */ | ||
| private void stopRunningOperations() { | ||
| if (this.curstatus == ExecutionStatus.STOPPED) { |
There was a problem hiding this comment.
It seems this function does nothing. Why is it needed?
| } | ||
|
|
||
| // Ensure the task is not performing any further operations | ||
| stopRunningOperations(); |
There was a problem hiding this comment.
The function doesn't do anything as of now, I just wrote this for an edge case(for the future case) if the processes come under these three cases:
- Long-running Loops
- I/O Operations: reading from a file or waiting for network data
- Timers or Delayed Tasks
But the inside logic can be written for the future case
| this.curstatus = ExecutionStatus.STOPPED; | ||
|
|
||
| // If the task is running in a separate thread, interrupt the thread | ||
| if (Thread.currentThread().isInterrupted()) { |
There was a problem hiding this comment.
This is very weird and risky way to stop current thread. Please explain why it won't kill geoweaver itself.
There was a problem hiding this comment.
Interrupting the Current Thread Only Affects the Task's Thread:
- Thread.currentThread().interrupt() - interrupts the currently executing thread, which means it only affects the task that is running in that thread.
- Calling interrupt() does not kill or forcefully stop the thread; it merely sets the interrupt flag of the thread.
- Geoweaver manages its worker threads in a thread pool. Interrupting one worker thread (through Thread.currentThread().interrupt()) will not bring down the entire system.
It only affects the thread executing the task, not the remaining threads.
ZihengSun
left a comment
There was a problem hiding this comment.
Have a lot of questions. You need to throughly test this, please design a comprehensive test case, as you are changing the core module of Geoweaver. There are cases that you might haven't think of. Please make sure fixing one bug raises hundreds more.
|
|
||
| runninglist.remove(thet); // remove from waiting list | ||
| // Stop tasks in running list | ||
| synchronized (runninglist) { |
There was a problem hiding this comment.
The synchronized (runninglist) block is necessary to ensure that no other thread can modify the runninglist while it is being iterated (i.e when stop operation in progress), preventing concurrency issues and ensuring no data loss. This is crucial in multi-threaded environments where multiple threads might interact with shared data.
|
I will be testing this thoroughly with different test cases. I might miss some other cases i have missed off. |
|
I tested locally, the issue wasn't fixed. After I click the Stop button in the Weaver, the ongoing tasks are still ongoing. The issue is still there. |
you might only test the short lived process. Try some long running tasks and try to stop them from the Weaver stop button while the workflow is running. I don't see the issue is gone. The current fix doesn't work. |
|
You should learn more about how the |
|
I will be testing with long processes and also learn more about the stop process in the history table. And will get back on this with a new PR |
This PR addresses the issue where the "Stop Workflow" button didn't properly clean up the task queue or stop tasks immediately. Key improvements include:
Immediate Task Termination: The end prematurely () method now interrupts the task’s thread (if applicable) and ensures that long-running operations are stopped promptly by checking the task’s status.
Task Queue Cleanup: When a task is stopped, the waiting and running lists are properly cleaned up. Tasks are marked as STOPPED and removed from the queues immediately.
Enhanced Monitoring: The stopMonitor() method ensures tasks on the waiting list are processed and stopped immediately.
These changes ensure that tasks are correctly cleaned up and stopped at the right moment when the stop button is pressed, improving workflow responsiveness and stability.