Description
Expected Behavior
Starting a workflow with the following parameters should succeed. If a workflow with the same ID is already running, it should be terminated, as specified by the conflict policy, and the new one started:
- id_reuse_policy: ALLOW_DUPLICATE
- execution_timeout: 12h
- id_conflict_policy: TERMINATE_EXISTING
Actual Behavior
The gRPC call start_workflow_execution fails after multiple retries with the following error:
WARN temporal_client::retry: gRPC call start_workflow_execution retried 7 times error=Status { code: Unavailable, message: "createOrUpdateCurrentExecution failed. Failed to insert into current_executions table. Error: pq: duplicate key value violates unique constraint \"current_executions_pkey\"", metadata: MetadataMap { headers: {"content-type": "application/grpc"} }, source: None }
This suggests that the current_executions row of a previous workflow with the same ID was not properly cleaned up, possibly because a transaction in Temporal failed to roll back, leaving the current_executions table in an inconsistent state.
Manual deletion of the row in the database resolves the issue:
DELETE FROM current_executions WHERE workflow_id = 'problem-id';
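The symptom and the workaround can be modelled in isolation. The sketch below uses an in-memory SQLite table as a stand-in for PostgreSQL; the column set is heavily simplified, and the primary key (shard_id, namespace_id, workflow_id) as well as the values 1 and "ns" are assumptions made for illustration. It only shows why a leftover row blocks an insert, not how Temporal actually writes to the table:

```python
import sqlite3

# Simplified stand-in for current_executions; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE current_executions (
        shard_id     INTEGER NOT NULL,
        namespace_id TEXT    NOT NULL,
        workflow_id  TEXT    NOT NULL,
        run_id       TEXT    NOT NULL,
        PRIMARY KEY (shard_id, namespace_id, workflow_id)
    )
    """
)


def insert_current(run_id: str) -> bool:
    """Try to record run_id as the current execution; return False on a key conflict."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO current_executions VALUES (?, ?, ?, ?)",
                (1, "ns", "problem-id", run_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


first = insert_current("run-1")    # row that is later left behind
blocked = insert_current("run-2")  # new start hits the unique constraint

# Workaround from this report: remove the leftover row by workflow ID.
with conn:
    conn.execute(
        "DELETE FROM current_executions WHERE workflow_id = ?", ("problem-id",)
    )

after_delete = insert_current("run-2")
print(first, blocked, after_delete)  # True False True
```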
Steps to Reproduce the Problem
I haven't found a reliable way to reproduce the problem, but this is the sequence I observed:
- Attempt to start a workflow with the given parameters (ALLOW_DUPLICATE, TERMINATE_EXISTING, long timeout).
- Observe retries and eventual failure with the "duplicate key value violates unique constraint" error.
- Verify that the conflicting workflow ID still exists in the current_executions table.
- Manually delete the row to allow the new workflow to start successfully.
Specifications
- Version: temporalio (Python SDK) 1.10.0
- Platform: Temporal version 1.27.1 (PostgreSQL)