{
"createTime": 0,
"updateTime": 1778782683427,
"name": "postAdhocTest_On_Input_OR",
"description": "workflow to fire adhoc test on single Optical Route.",
"version": 2,
"tasks": [
{
"name": "check_adhoc_test_type",
"taskReferenceName": "check_adhoc_test_type_ref",
"inputParameters": {
"adhoc_test_type": "${workflow.input.AdhocTestType}"
},
"type": "SWITCH",
"decisionCases": {
"Case2": [
{
"name": "httpApiCall",
"taskReferenceName": "postAdhoc_case2_ref",
"inputParameters": {
"http_request": {
"connectionTimeOut": 90000,
"readTimeOut": 90000,
"contentType": "application/json",
"uri": "so internal uri",
"headers": {
"x-Token-Roles": "${workflow.input.UserRoles}",
"x-Token-Username": "${workflow.input.UserName}",
"x-Token-Fms-Scope": "${workflow.input.UserScope}"
},
"body": "${workflow.input.Payload.payLoad}",
"method": "POST"
}
},
"type": "HTTP",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": true,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
},
{
"name": "set_PostAdhoc_case2_output",
"taskReferenceName": "set_PostAdhoc_case2_output_ref",
"inputParameters": {
"postAdhoc_output_status": "${postAdhoc_case2_ref.status}",
"postAdhoc_output_promiseId": "${postAdhoc_case2_ref.output.response.body}"
},
"type": "SET_VARIABLE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
}
],
"Case1": [
{
"name": "httpApiCall",
"taskReferenceName": "get_testConfig_ref",
"inputParameters": {
"http_request": {
"connectionTimeOut": 90000,
"readTimeOut": 90000,
"contentType": "application/json",
"uri": "some internal uri",
"headers": {
"x-Token-Roles": "${workflow.input.UserRoles}",
"x-Token-Username": "${workflow.input.UserName}",
"x-Token-Fms-Scope": "${workflow.input.UserScope}"
},
"method": "GET"
}
},
"type": "HTTP",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": true,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
},
{
"name": "changetestconfig_response",
"taskReferenceName": "changetestconfig_response_ref",
"inputParameters": {
"evaluatorType": "javascript",
"expression": "function e() { var testConfigPayload = ${get_testConfig_ref.output.response.body.payLoad}; testConfigPayload.WavelengthsUsed = ${workflow.input.WavelengthsUsed}; testConfigPayload.MeasurementType = \"${workflow.input.MeasurementType}\"; return JSON.stringify(testConfigPayload); } e();"
},
"type": "INLINE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
},
{
"name": "httpApiCall",
"taskReferenceName": "postAdhoc_case1_ref",
"inputParameters": {
"http_request": {
"connectionTimeOut": 90000,
"readTimeOut": 90000,
"contentType": "application/json",
"uri": "some internal uri",
"headers": {
"x-Token-Roles": "${workflow.input.UserRoles}",
"x-Token-Username": "${workflow.input.UserName}",
"x-Token-Fms-Scope": "${workflow.input.UserScope}"
},
"body": {
"name": "${workflow.input.TestConfigName}",
"payload": "${changetestconfig_response_ref.output.result}"
},
"method": "POST"
}
},
"type": "HTTP",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": true,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
},
{
"name": "set_PostAdhoc_case1_output",
"taskReferenceName": "set_PostAdhoc_case1_output_ref",
"inputParameters": {
"postAdhoc_output_status": "${postAdhoc_case1_ref.status}",
"postAdhoc_output_promiseId": "${postAdhoc_case1_ref.output.response.body}"
},
"type": "SET_VARIABLE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
}
]
},
"defaultCase": [
{
"name": "adhoc_invalid_test_type_response",
"taskReferenceName": "adhoc_invalid_test_type_response_ref",
"inputParameters": {
"evaluatorType": "javascript",
"expression": "function e() { return {\"resultId\":\"\",\"opticalRouteName\":\"${workflow.input.OpticalRouteName}\",\"opticalRouteId\":${workflow.input.OpticalRouteId},\"portId\":\"\",\"portNumber\":\"\",\"testTime\":\"\",\"status\":\"FAILED\",\"linkLength\":\"\",\"linkLoss\":\"\",\"wavelength\":\"\",\"globalStarRating\":\"\",\"orlStarRating\":\"\",\"lossStarRating\":\"\",\"globalVerdict\":\"\",\"globalDeviationVerdict\":\"\",\"completionStatus\":\"\",\"message\":\"ADHOC_CALL_FAILED\" }} e();"
},
"type": "INLINE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
},
{
"name": "error_check_postadhoc_type",
"taskReferenceName": "error_check_postadhoc_type_ref",
"inputParameters": {
"terminationStatus": "COMPLETED",
"workflowOutput": "${adhoc_invalid_test_type_response_ref.output}"
},
"type": "TERMINATE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
}
],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"evaluatorType": "value-param",
"expression": "adhoc_test_type",
"onStateChange": {},
"permissive": false
},
{
"name": "readPostAdhoc_output",
"taskReferenceName": "readPostAdhoc_output_ref",
"inputParameters": {
"value": "${workflow.variables.postAdhoc_output_status}",
"evaluatorType": "javascript",
"expression": "function e() { return $.value } e();"
},
"type": "INLINE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
},
{
"name": "check_error_post_adhoc",
"taskReferenceName": "check_error_post_adhoc_ref",
"inputParameters": {
"case_value_param": "${workflow.variables.postAdhoc_output_status}"
},
"type": "SWITCH",
"decisionCases": {
"error": [
{
"name": "adhoc_error_async_response",
"taskReferenceName": "adhoc_error_async_response_ref",
"inputParameters": {
"evaluatorType": "javascript",
"expression": "function e() { return {\"resultId\":\"\",\"opticalRouteName\":\"${workflow.input.OpticalRouteName}\",\"opticalRouteId\":${workflow.input.OpticalRouteId},\"portId\":\"\",\"portNumber\":\"\",\"testTime\":\"\",\"status\":\"FAILED\",\"linkLength\":\"\",\"linkLoss\":\"\",\"wavelength\":\"\",\"globalStarRating\":\"\",\"orlStarRating\":\"\",\"lossStarRating\":\"\",\"globalVerdict\":\"\",\"globalDeviationVerdict\":\"\",\"completionStatus\":\"\",\"message\":\"ADHOC_CALL_FAILED\" }} e();"
},
"type": "INLINE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
},
{
"name": "error_sync_response",
"taskReferenceName": "error_sync_response",
"inputParameters": {
"terminationStatus": "COMPLETED",
"workflowOutput": "${adhoc_error_async_response_ref.output}"
},
"type": "TERMINATE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
}
]
},
"defaultCase": [
{
"name": "listener",
"taskReferenceName": "listener",
"inputParameters": {
"http_request": {
"connectionTimeOut": 90000,
"readTimeOut": 90000,
"uri": "some internal api",
"method": "PUT",
"body": {
"timeout": 240000000,
"url": "some internal url",
"method": "POST",
"callbackBodyTemplateOnReceived": "{\"workflowInstanceId\": \"${workflow.workflowId}\", \"taskId\": \"${CPEWF_TASK_ID}\", \"status\": \"COMPLETED\", \"outputData\": {\"rtuResponse\": {\"headers\": {headers}, \"body\": {body}}}}",
"callbackBodyTemplateOnTimeout": "{\"workflowInstanceId\": \"${workflow.workflowId}\", \"taskId\": \"${CPEWF_TASK_ID}\", \"status\": \"FAILED\", \"outputData\": {\"resultId\":\"\",\"opticalRouteName\":\"${workflow.input.OpticalRouteName}\",\"opticalRouteId\":${workflow.input.OpticalRouteId},\"portId\":\"\",\"portNumber\":\"\",\"testTime\":\"\",\"status\":\"FAILED\",\"linkLength\":\"\",\"linkLoss\":\"\",\"wavelength\":\"\",\"globalStarRating\":\"\",\"orlStarRating\":\"\",\"lossStarRating\":\"\",\"globalVerdict\":\"\",\"globalDeviationVerdict\":\"\",\"completionStatus\":\"\",\"message\":\"ADHOC_CALL_FAILED\"}}"
}
}
},
"type": "HTTP",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": true,
"defaultExclusiveJoinTask": [],
"asyncComplete": true,
"loopOver": [],
"onStateChange": {},
"permissive": false
}
],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"evaluatorType": "javascript",
"expression": "(function() { if ($.case_value_param != 'COMPLETED') return 'error' })() ",
"onStateChange": {},
"permissive": false
},
{
"name": "check_error_async_response",
"taskReferenceName": "check_error_async_response",
"inputParameters": {
"case_value_param": "${listener.status}",
"result_type_param": "${listener.output.rtuResponse.headers.FgResultType}"
},
"type": "SWITCH",
"decisionCases": {
"error": [
{
"name": "adhoc_listener_error_async_response",
"taskReferenceName": "adhoc_listener_error_async_response_ref",
"inputParameters": {
"evaluatorType": "javascript",
"expression": "function e() { return {\"resultId\":\"\",\"opticalRouteName\":\"${workflow.input.OpticalRouteName}\",\"opticalRouteId\":${workflow.input.OpticalRouteId},\"portId\":\"\",\"portNumber\":\"\",\"testTime\":\"\",\"status\":\"FAILED\",\"linkLength\":\"\",\"linkLoss\":\"\",\"wavelength\":\"\",\"globalStarRating\":\"\",\"orlStarRating\":\"\",\"lossStarRating\":\"\",\"globalVerdict\":\"\",\"globalDeviationVerdict\":\"\",\"completionStatus\":\"\",\"message\":\"ADHOC_CALL_FAILED\" }} e();"
},
"type": "INLINE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
},
{
"name": "error_async_response",
"taskReferenceName": "error_async_response",
"inputParameters": {
"terminationStatus": "COMPLETED",
"workflowOutput": "${adhoc_listener_error_async_response_ref.output}"
},
"type": "TERMINATE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
}
]
},
"defaultCase": [
{
"name": "extractResultId",
"taskReferenceName": "extractResultId_ref",
"inputParameters": {
"value": "${listener.output.rtuResponse.body}",
"evaluatorType": "javascript",
"expression": "(function(){var parsed=JSON.parse($.value);var linkResults=parsed.brief.LinkResults.Results;return{resultId:parsed.resultid,opticalRouteName:parsed.metadata.AssetName,opticalRouteId:parsed.metadata.AssetId,portId:parsed.metadata.PortId,portNumber:parsed.metadata.PortId,testTime:parsed.metadata.TestTime,status:\"COMPLETED\",linkLength:parsed.brief.LinkResults.Length,linkLoss:linkResults[0].Loss,wavelength:linkResults[0].Wavelength,globalStarRating:parsed.brief.GlobalStarRating,orlStarRating:parsed.brief.LinkResults.OrlStarRating,lossStarRating:parsed.brief.LinkResults.LossStarRating,linkResults:linkResults.map(function(result){return{linkLoss:result.Loss,wavelength:result.Wavelength}}),globalVerdict:parsed.brief.GlobalVerdict,globalDeviationVerdict:parsed.brief.Measurement.GlobalDeviationVerdict,completionStatus:parsed.brief.LinkResults.CompletionStatus,message:\"\"}})();"
},
"type": "INLINE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
},
{
"name": "terminate_with_success",
"taskReferenceName": "terminate_with_success",
"inputParameters": {
"terminationStatus": "COMPLETED",
"workflowOutput": "${extractResultId_ref.output}"
},
"type": "TERMINATE",
"decisionCases": {},
"defaultCase": [],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"onStateChange": {},
"permissive": false
}
],
"forkTasks": [],
"startDelay": 0,
"joinOn": [],
"optional": false,
"defaultExclusiveJoinTask": [],
"asyncComplete": false,
"loopOver": [],
"evaluatorType": "javascript",
"expression": "(function() { if ($.case_value_param != 'COMPLETED' || $.result_type_param.toUpperCase() == \"ERRORJSON\") return 'error' })()",
"onStateChange": {},
"permissive": false
}
],
"inputParameters": [],
"outputParameters": {},
"schemaVersion": 2,
"restartable": true,
"workflowStatusListenerEnabled": false,
"ownerEmail": "exfo@exfo.com",
"timeoutPolicy": "ALERT_ONLY",
"timeoutSeconds": 0,
"variables": {},
"inputTemplate": {},
"enforceSchema": true,
"metadata": {},
"maskedFields": []
}
Conductor Version
3.21.23
Brief Description
We are observing an issue where subworkflows created via FORK_JOIN_DYNAMIC can stall indefinitely after completing an HTTP task. The HTTP task completes successfully (status 200 with the expected payload), but the next task is never scheduled. As a result, the subworkflow remains in the RUNNING state indefinitely, and the parent workflow is blocked as well. Manual intervention (pausing and resuming the subworkflow from the Conductor UI) allows the workflow to continue and complete normally.
Definition of the Subworkflow Stall
Detail Description
The issue started appearing after upgrading Conductor from 3.19.0 to 3.21.23, and is observed after any HTTP task within a subworkflow.
In the HTTP task postAdhoc_case1_ref, an issue in our API causes the response payload to be returned as a UUID in text/plain format. However, Conductor attempts to parse the response as application/json, which produces a large number of JsonParseException entries in the logs. Although this generates significant noise in the logs, we are not certain whether it directly causes the workflow to stall, since most workflows still complete successfully despite these parsing errors.
From the logs and Conductor UI:
This behavior is intermittent and not consistently reproducible.
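To make the text/plain observation above concrete, here is a minimal Python sketch of why a bare UUID body cannot be parsed as JSON, and how a content-type-aware reader would treat it. This is only an illustration of the mismatch, not Conductor's actual code (Conductor is written in Java); the `parse_body` helper and the UUID value are made up for the example.

```python
import json


def parse_body(raw: str, content_type: str):
    """Parse the body as JSON only when the content type says it is JSON.

    Hypothetical helper for illustration; not part of Conductor.
    """
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(raw)
    # Anything else (for example text/plain) is returned untouched.
    return raw


# Made-up UUID standing in for the promise id our API returns as text/plain.
uuid_body = "3f2b8c1e-9d4a-4e6b-8f1a-2c7d5e9b0a11"

# A bare UUID is not a valid JSON document, so a strict JSON parse fails.
try:
    json.loads(uuid_body)
    parsed_as_json = True
except json.JSONDecodeError:
    parsed_as_json = False

print("UUID parsed as JSON:", parsed_as_json)
print("Content-type-aware result:", parse_body(uuid_body, "text/plain"))
```

If the API returned the UUID as a quoted JSON string with an application/json content type, the strict parse would succeed, which is one way we could remove the log noise on our side while the stall itself is investigated.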
Workaround Found
When the stalled subworkflow is manually paused and resumed via the Conductor UI, it schedules the next task normally, and the workflow then completes successfully.
Although we currently have a workaround, we would still like to better understand the root cause and identify a proper solution. Ideally, tasks should complete naturally without requiring manual intervention (such as pausing and resuming them through the Conductor UI).
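Until the root cause is known, the manual workaround could be scripted. The following Python sketch shows one possible shape for it, under these assumptions: the workflow execution JSON has a top-level `status` and a `tasks` list whose entries carry `status` and `updateTime` (epoch milliseconds); the pause and resume endpoints are `PUT {base}/workflow/{workflowId}/pause` and `PUT {base}/workflow/{workflowId}/resume`, as we understand the Conductor REST API (please verify against the Swagger UI of your deployment); and the stall heuristic and the 60-second idle threshold are our own choices, not anything defined by Conductor.

```python
import urllib.request

# Task statuses treated as terminal (assumed from the Conductor task model).
TERMINAL_STATUSES = {
    "COMPLETED",
    "COMPLETED_WITH_ERRORS",
    "FAILED",
    "FAILED_WITH_TERMINAL_ERROR",
    "CANCELED",
    "TIMED_OUT",
    "SKIPPED",
}


def looks_stalled(workflow: dict, now_ms: int, idle_ms: int = 60_000) -> bool:
    """Heuristic: RUNNING workflow, every task terminal, nothing updated lately."""
    if workflow.get("status") != "RUNNING":
        return False
    tasks = workflow.get("tasks") or []
    if not tasks:
        return False
    if any(t.get("status") not in TERMINAL_STATUSES for t in tasks):
        # Something is still scheduled or in progress, so this is not our stall.
        return False
    last_update = max(t.get("updateTime") or 0 for t in tasks)
    return now_ms - last_update >= idle_ms


def workaround_urls(base_url: str, workflow_id: str) -> list:
    """Build the pause and resume URLs, in the order they should be called."""
    root = base_url.rstrip("/")
    return [
        f"{root}/workflow/{workflow_id}/pause",
        f"{root}/workflow/{workflow_id}/resume",
    ]


def pause_and_resume(base_url: str, workflow_id: str) -> None:
    """Apply the manual workaround over HTTP (performs real network calls)."""
    for url in workaround_urls(base_url, workflow_id):
        request = urllib.request.Request(url, method="PUT")
        with urllib.request.urlopen(request, timeout=30) as response:
            response.read()
```

A caller would fetch the subworkflow execution, pass it to `looks_stalled` together with the current time in milliseconds, and call `pause_and_resume` only when it returns true. Because a task waiting on an async callback (such as `listener` in our definition) is not in a terminal status, the heuristic should leave those workflows alone.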