Environment
- KFP SDK version: `kfp==2.14.3`, `kfp-kubernetes==2.14.3`
- KFP backend: Open-source Kubeflow Manifests on AWS EKS (Argo Workflows executor)
- All dependencies version:

```
kfp                2.14.3
kfp-kubernetes     2.14.3
kfp-pipeline-spec  2.14.0
kfp-server-api     2.16.0
```
Summary
Passing `name=` to `dsl.ParallelFor` causes a runtime failure when downstream tasks consume `dsl.Collected` outputs from the loop. The KFP Argo driver cannot resolve the producer task because the compiled pipeline IR has inconsistent identifiers for the `ParallelFor` group.
Steps to reproduce
- Create a pipeline that uses `dsl.ParallelFor` with a custom `name=` parameter and `dsl.Collected` to fan-in outputs:

```python
from kfp import compiler, dsl


@dsl.component
def echo(item: str) -> str:
    return item


@dsl.component
def collect(items: list) -> str:
    return str(items)


@dsl.pipeline(name="repro-parallelfor-name-bug")
def my_pipeline():
    with dsl.ParallelFor(items=["a", "b", "c"], name="My Custom Loop") as item:
        work = echo(item=item)
    collect(items=dsl.Collected(work.output))


compiler.Compiler().compile(my_pipeline, "pipeline.yaml")
```
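The mismatch can be spotted in the compiled spec before upload. Below is a sketch of a small checker that flags any `dependentTasks`/`producerTask` reference that matches no sibling task's `taskInfo.name` (the identifier the driver registers executions under). The `broken` dict is a hand-written stand-in for the relevant slice of the compiled `pipeline.yaml`, not literal compiler output; loading the real file would additionally need a YAML parser.

```python
# Sketch: flag cross-task references that match no task's taskInfo.name.
# The driver registers each task under taskInfo.name, so a reference that
# only matches a DAG dict key (but no taskInfo.name) fails at runtime.

def find_unresolvable_refs(dag_tasks: dict) -> list:
    """Return (task_key, ref) pairs whose ref matches no taskInfo.name."""
    runtime_names = {t.get("taskInfo", {}).get("name") for t in dag_tasks.values()}
    problems = []
    for key, task in dag_tasks.items():
        refs = list(task.get("dependentTasks", []))
        for art in task.get("inputs", {}).get("artifacts", {}).values():
            producer = art.get("taskOutputArtifact", {}).get("producerTask")
            if producer:
                refs.append(producer)
        problems.extend((key, r) for r in refs if r not in runtime_names)
    return problems


# Hand-written approximation of the broken IR (with name="My Custom Loop"):
broken = {
    "for-loop-2": {"taskInfo": {"name": "My Custom Loop"}},
    "collect": {
        "taskInfo": {"name": "collect"},
        "dependentTasks": ["for-loop-2"],
        "inputs": {"artifacts": {"items": {
            "taskOutputArtifact": {"producerTask": "for-loop-2"}}}},
    },
}
print(find_unresolvable_refs(broken))
# → [('collect', 'for-loop-2'), ('collect', 'for-loop-2')]
```

With `name=` omitted, `taskInfo.name` is `for-loop-2` and the same check returns an empty list.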
- Upload and run the pipeline on a KFP deployment using the Argo Workflows executor.
- The `collect` step fails at the driver level with:
```
F0408 16:10:52.114093 16 main.go:104] KFP driver: failed to resolve inputs:
failed to resolve input artifact input_paths with spec
task_output_artifact:{producer_task:"for-loop-2" output_artifact_key:"..."}:
cannot find producer task "for-loop-2_19426"
```
- Removing `name="My Custom Loop"` from the `dsl.ParallelFor` call makes the pipeline succeed.
Root cause
The compiler (`pipeline_spec_builder.build_task_spec_for_group`) sets `taskInfo.name` from the user-provided display name:

```python
# kfp/compiler/pipeline_spec_builder.py
pipeline_task_spec.task_info.name = group.display_name or group.name
```
But all cross-task references (the DAG dictionary key, `componentRef.name`, `dependentTasks`, and `producerTask`) are derived from `group.name`, the auto-generated `for-loop-N`.
This produces an internally inconsistent pipeline IR. Comparing the compiled YAML with and without `name=`, the only difference is:
```yaml
# Without name= (works):
taskInfo:
  name: for-loop-2

# With name= (breaks):
taskInfo:
  name: My Custom Loop
```

All other references (`dependentTasks: [for-loop-2]`, `producerTask: for-loop-2`, `componentRef.name: comp-for-loop-2`, etc.) remain unchanged. The Argo driver uses `taskInfo.name` as the runtime task identifier when registering DAG executions in MLMD, so when a downstream task looks up `producerTask: "for-loop-2"`, it cannot find it because the loop execution was registered under `"My Custom Loop"`.
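The register/lookup divergence can be shown with a toy model in Python (the actual driver is Go; the function names here are illustrative, not the driver's API): registration happens under `taskInfo.name`, while input resolution looks the producer up by the IR's `producerTask` value.

```python
# Toy model of the driver's registration vs. lookup (illustrative names only).
executions = {}

def register(task_info_name: str, outputs: dict) -> None:
    """The driver registers a DAG execution under taskInfo.name."""
    executions[task_info_name] = outputs

def resolve_producer(producer_task: str) -> dict:
    """Downstream input resolution looks up by the IR's producerTask value."""
    if producer_task not in executions:
        raise KeyError(f'cannot find producer task "{producer_task}"')
    return executions[producer_task]

# Without name=: registered and referenced under the same identifier.
register("for-loop-2", {"output": ["a", "b", "c"]})
resolve_producer("for-loop-2")  # resolves

# With name="My Custom Loop": the two identifiers diverge.
executions.clear()
register("My Custom Loop", {"output": ["a", "b", "c"]})
try:
    resolve_producer("for-loop-2")
except KeyError as err:
    print(err)  # mirrors the driver's "cannot find producer task" failure
```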
Expected result
Passing `name=` to `dsl.ParallelFor` should produce a self-consistent pipeline IR and work correctly at runtime. Either:

- `taskInfo.name` should always equal `group.name` (the task key), with the custom name stored in a separate display-name field, or
- The compiler should propagate the custom name consistently into all cross-references (`dependentTasks`, `producerTask`, `componentRef.name`, and the DAG tasks dictionary key).
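As a plain-Python sketch of the invariant the first option asks for (this is not kfp's real builder code, and the `displayName` field here is hypothetical; whether the IR has such a field is part of the question): the stable key remains the runtime identifier, and the user-supplied string only affects presentation.

```python
# Illustrative only, not kfp's pipeline_spec_builder. Shows the invariant:
# taskInfo.name always equals the task key that cross-references (producerTask,
# dependentTasks) are derived from, so they resolve; the custom name lives in
# a hypothetical, purely presentational field.

def build_task_spec(group_name, display_name=None):
    return {
        "taskInfo": {
            "name": group_name,  # runtime identifier: always the stable key
            "displayName": display_name or group_name,  # hypothetical field
        },
    }

spec = build_task_spec("for-loop-2", "My Custom Loop")
assert spec["taskInfo"]["name"] == "for-loop-2"  # producerTask would resolve
assert spec["taskInfo"]["displayName"] == "My Custom Loop"
```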
Materials and Reference
- The `name=` parameter on `dsl.ParallelFor` is accepted without error at compile time, producing valid YAML that only fails at runtime, which makes this hard to catch.
- `dsl.If` / `dsl.ExitHandler` also accept `name=` via the same `TasksGroup` base class. `dsl.ExitHandler` works correctly because the driver handles its execution registration differently. `dsl.If`/`dsl.Else` may or may not be affected (untested).
- The bug is in the SDK compiler, not the backend driver: the driver is reasonably using `taskInfo.name` from the spec it was given.
Impacted by this bug? Give it a 👍.