Environment
- KFP SDK version: `kfp==2.14.3`, `kfp-kubernetes==2.14.3`
- KFP backend: Open-source Kubeflow Manifests on AWS EKS (Argo Workflows executor)
- All dependencies version:

```
kfp                2.14.3
kfp-kubernetes     2.14.3
kfp-pipeline-spec  2.14.0
kfp-server-api     2.16.0
```
Summary
Passing `name=` to `dsl.ParallelFor` causes a runtime failure when downstream tasks consume `dsl.Collected` outputs from the loop. The KFP Argo driver cannot resolve the producer task because the compiled pipeline IR has inconsistent identifiers for the `ParallelFor` group.
Steps to reproduce
- Create a pipeline that uses `dsl.ParallelFor` with a custom `name=` parameter and `dsl.Collected` to fan-in outputs:

```python
from kfp import compiler, dsl


@dsl.component
def echo(item: str) -> str:
    return item


@dsl.component
def collect(items: list) -> str:
    return str(items)


@dsl.pipeline(name="repro-parallelfor-name-bug")
def my_pipeline():
    with dsl.ParallelFor(items=["a", "b", "c"], name="My Custom Loop") as item:
        work = echo(item=item)
    collect(items=dsl.Collected(work.output))


compiler.Compiler().compile(my_pipeline, "pipeline.yaml")
```
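The mismatch can be spotted in the compiled spec before upload. Below is a sketch of a small checker that flags any `dependentTasks`/`producerTask` reference that matches no sibling task's `taskInfo.name` (the identifier the driver registers executions under). The `broken` dict is a hand-written stand-in for the relevant slice of the compiled `pipeline.yaml`, not literal compiler output; loading the real file would additionally need a YAML parser.

```python
# Sketch: flag cross-task references that match no task's taskInfo.name.
# The driver registers each task under taskInfo.name, so a reference that
# only matches a DAG dict key (but no taskInfo.name) fails at runtime.

def find_unresolvable_refs(dag_tasks: dict) -> list:
    """Return (task_key, ref) pairs whose ref matches no taskInfo.name."""
    runtime_names = {t.get("taskInfo", {}).get("name") for t in dag_tasks.values()}
    problems = []
    for key, task in dag_tasks.items():
        refs = list(task.get("dependentTasks", []))
        for art in task.get("inputs", {}).get("artifacts", {}).values():
            producer = art.get("taskOutputArtifact", {}).get("producerTask")
            if producer:
                refs.append(producer)
        problems.extend((key, r) for r in refs if r not in runtime_names)
    return problems


# Hand-written approximation of the broken IR (with name="My Custom Loop"):
broken = {
    "for-loop-2": {"taskInfo": {"name": "My Custom Loop"}},
    "collect": {
        "taskInfo": {"name": "collect"},
        "dependentTasks": ["for-loop-2"],
        "inputs": {"artifacts": {"items": {
            "taskOutputArtifact": {"producerTask": "for-loop-2"}}}},
    },
}
print(find_unresolvable_refs(broken))
# → [('collect', 'for-loop-2'), ('collect', 'for-loop-2')]
```

With `name=` omitted, `taskInfo.name` is `for-loop-2` and the same check returns an empty list.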
- Upload and run the pipeline on a KFP deployment using the Argo Workflows executor.
- The `collect` step fails at the driver level with:
```
F0408 16:10:52.114093 16 main.go:104] KFP driver: failed to resolve inputs:
failed to resolve input artifact input_paths with spec
task_output_artifact:{producer_task:"for-loop-2" output_artifact_key:"..."}:
cannot find producer task "for-loop-2_19426"
```
- Removing `name="My Custom Loop"` from the `dsl.ParallelFor` call makes the pipeline succeed.
Root cause
The compiler (`pipeline_spec_builder.build_task_spec_for_group`) sets `taskInfo.name` from the user-provided display name:

```python
# kfp/compiler/pipeline_spec_builder.py
pipeline_task_spec.task_info.name = group.display_name or group.name
```
But all cross-task references (the DAG dictionary key, `componentRef.name`, `dependentTasks`, and `producerTask`) are derived from `group.name`, the auto-generated `for-loop-N`.
This produces an internally inconsistent pipeline IR. Comparing the compiled YAML with and without `name=`, the only difference is:
```yaml
# Without name= (works):
taskInfo:
  name: for-loop-2

# With name= (breaks):
taskInfo:
  name: My Custom Loop
```

All other references (`dependentTasks: [for-loop-2]`, `producerTask: for-loop-2`, `componentRef.name: comp-for-loop-2`, etc.) remain unchanged. The Argo driver uses `taskInfo.name` as the runtime task identifier when registering DAG executions in MLMD, so when a downstream task looks up `producerTask: "for-loop-2"`, it cannot find it because the loop execution was registered under `"My Custom Loop"`.
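The register/lookup divergence can be shown with a toy model in Python (the actual driver is Go; the function names here are illustrative, not the driver's API): registration happens under `taskInfo.name`, while input resolution looks the producer up by the IR's `producerTask` value.

```python
# Toy model of the driver's registration vs. lookup (illustrative names only).
executions = {}

def register(task_info_name: str, outputs: dict) -> None:
    """The driver registers a DAG execution under taskInfo.name."""
    executions[task_info_name] = outputs

def resolve_producer(producer_task: str) -> dict:
    """Downstream input resolution looks up by the IR's producerTask value."""
    if producer_task not in executions:
        raise KeyError(f'cannot find producer task "{producer_task}"')
    return executions[producer_task]

# Without name=: registered and referenced under the same identifier.
register("for-loop-2", {"output": ["a", "b", "c"]})
resolve_producer("for-loop-2")  # resolves

# With name="My Custom Loop": the two identifiers diverge.
executions.clear()
register("My Custom Loop", {"output": ["a", "b", "c"]})
try:
    resolve_producer("for-loop-2")
except KeyError as err:
    print(err)  # mirrors the driver's "cannot find producer task" failure
```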
Expected result
Passing `name=` to `dsl.ParallelFor` should produce a self-consistent pipeline IR and work correctly at runtime. Either:

- `taskInfo.name` should always equal `group.name` (the task key), with the custom name stored in a separate display-name field, or
- The compiler should propagate the custom name consistently into all cross-references (`dependentTasks`, `producerTask`, `componentRef.name`, and the DAG tasks dictionary key).
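As a plain-Python sketch of the invariant the first option asks for (this is not kfp's real builder code, and the `displayName` field here is hypothetical; whether the IR has such a field is part of the question): the stable key remains the runtime identifier, and the user-supplied string only affects presentation.

```python
# Illustrative only, not kfp's pipeline_spec_builder. Shows the invariant:
# taskInfo.name always equals the task key that cross-references (producerTask,
# dependentTasks) are derived from, so they resolve; the custom name lives in
# a hypothetical, purely presentational field.

def build_task_spec(group_name, display_name=None):
    return {
        "taskInfo": {
            "name": group_name,  # runtime identifier: always the stable key
            "displayName": display_name or group_name,  # hypothetical field
        },
    }

spec = build_task_spec("for-loop-2", "My Custom Loop")
assert spec["taskInfo"]["name"] == "for-loop-2"  # producerTask would resolve
assert spec["taskInfo"]["displayName"] == "My Custom Loop"
```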
Materials and Reference
- The `name=` parameter on `dsl.ParallelFor` is accepted without error at compile time, producing valid YAML that only fails at runtime, which makes this hard to catch.
- `dsl.If` / `dsl.ExitHandler` also accept `name=` via the same `TasksGroup` base class. `dsl.ExitHandler` works correctly because the driver handles its execution registration differently. `dsl.If`/`dsl.Else` may or may not be affected (untested).
- The bug is in the SDK compiler, not the backend driver: the driver is reasonably using `taskInfo.name` from the spec it was given.
Impacted by this bug? Give it a 👍.