Copilot AI commented Nov 24, 2025

Problem

Bulk deletion of 300K+ workflow instances causes SQL Server timeouts (30s default) and application crashes. The current implementation loads all instance summaries into memory and executes a single DELETE statement.

Changes

Configuration

  • Added ManagementOptions.BulkDeleteBatchSize (default: 1000) to control deletion batch size

Batched Deletion Algorithm

  • Refactored WorkflowInstanceManager.BulkDeleteAsync to process deletions in configurable batches
  • Each iteration fetches only the IDs (not full summaries) of the next batch, always with Offset=0, because deleting a batch shifts the remaining matching records to the front of the result set
  • Deletes batch via DeleteAsync with filtered ID list
  • Maintains existing notification events per batch
  • Continues until no matching records remain
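
The loop described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: InMemoryInstanceStore, FindIds, Delete, and BatchedDelete.Run are hypothetical stand-ins for the real store and WorkflowInstanceManager.BulkDeleteAsync, and the sketch is synchronous for brevity. The callback parameter stands in for the per-batch notification events.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the workflow instance store (not Elsa's real API).
public sealed class InMemoryInstanceStore
{
    private readonly List<(string Id, bool Finished)> _rows = new();

    public int Count => _rows.Count;

    public void Add(string id, bool finished) => _rows.Add((id, finished));

    // Returns up to `take` IDs of rows matching the filter, skipping `offset` matches.
    public List<string> FindIds(Func<bool, bool> filter, int offset, int take) =>
        _rows.Where(r => filter(r.Finished))
             .Skip(offset)
             .Take(take)
             .Select(r => r.Id)
             .ToList();

    // Deletes the rows with the given IDs and returns how many were removed.
    public int Delete(ICollection<string> ids)
    {
        var set = new HashSet<string>(ids);
        return _rows.RemoveAll(r => set.Contains(r.Id));
    }
}

public static class BatchedDelete
{
    // Deletes all matching rows in batches of `batchSize` and returns the total deleted.
    public static long Run(
        InMemoryInstanceStore store,
        Func<bool, bool> filter,
        int batchSize,
        Action<IReadOnlyList<string>> onBatchDeleted)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        long total = 0;
        while (true)
        {
            // Always read from offset 0: the previous batch is gone,
            // so the next matches are now at the front.
            var ids = store.FindIds(filter, offset: 0, take: batchSize);
            if (ids.Count == 0) break; // no matching records remain

            var deleted = store.Delete(ids);
            if (deleted == 0) break;   // safeguard against looping forever if nothing was removed

            total += deleted;
            onBatchDeleted(ids);       // stand-in for the per-batch notification
        }
        return total;
    }
}
```

Paging with an increasing offset would skip records here, since each deletion shrinks the result set; reading from offset 0 on every pass avoids that.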

Testing

  • Added unit tests covering multi-batch, single-batch, and empty scenarios

Configuration Example

services.Configure<ManagementOptions>(options =>
{
    options.BulkDeleteBatchSize = 500; // Adjust for DB performance
});

Impact

  • Prevents SQL timeouts by limiting DELETE statement scope
  • Reduces memory footprint: only one batch of IDs is held in memory at a time (an estimated ~1KB per 1000 IDs) instead of megabytes of full instance summaries
  • Backward compatible: existing code works without changes
  • Observable: added logging at Information and Debug levels

Original prompt

This section details the original issue you should resolve.

<issue_title>[Retention] Timeout causes the application to crash.</issue_title>
<issue_description>I use SQL Server to store workflow instances and currently have a timer that starts 5 child workflows every second, producing over 300,000 workflow instances in about 3 days. The background cleanup is set up to run every hour and remove workflows that completed seven days ago, but on the third day a timeout exception is raised, which eventually causes the application to crash.

The current version is 3.5.1.

Image

Below is a screenshot of the system event log:

Image Image

Image</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI changed the title from "[WIP] Fix timeout exception causing application crash" to "Fix: Implement batched deletion to prevent SQL timeouts with large workflow instance datasets" Nov 24, 2025
Copilot AI requested a review from sfmskywalker November 24, 2025 06:14
Copilot finished work on behalf of sfmskywalker November 24, 2025 06:14