O(N²) compile-time in dynamic-dispatch specialization (specializeModule)

## Issue Description

Compiling code that dispatches an interface with `N` implementations through a runtime-typed
existential takes time **quadratic in `N`** during specialization. The `specializeModule` phase
(visible via `-report-perf-benchmark`) grows roughly 4× each time `N` doubles. This regressed
around release v2026.7 and has not recovered to the v2026.5 baseline.

Tracking benchmark graph: https://shader-slang.org/slang-compile-perf/workloads/dynamic_dispatch.html

## Reproducer Code

An interface with several implementations selected at runtime, so static specialization cannot
collapse the dispatch. The shape below compiles as-is; the quadratic growth shows up as the
number of implementations grows. The `tools/compile-perf` suite generates exactly this as its
`dynamic_dispatch` workload, scaled by `N`.

```hlsl
RWStructuredBuffer<float> outBuf;

[anyValueSize(16)]
interface IShape { float eval(float x); }

struct S0 : IShape { float eval(float x) { return x * 1.0 + sin(x); } }
struct S1 : IShape { float eval(float x) { return x * 2.0 + sin(x * 2.0); } }
struct S2 : IShape { float eval(float x) { return x * 3.0 + sin(x * 3.0); } }
struct S3 : IShape { float eval(float x) { return x * 4.0 + sin(x * 4.0); } }
// ... scale this up to N implementations ...

float dispatch(int id, float x)
{
    IShape s;
    switch (id)
    {
    case 0:  s = S0(); break;
    case 1:  s = S1(); break;
    case 2:  s = S2(); break;
    case 3:  s = S3(); break;
    default: s = S0(); break;
    }
    return s.eval(x);
}

[shader("compute")]
[numthreads(1, 1, 1)]
void computeMain(uint3 tid : SV_DispatchThreadID)
{
    float acc = 0.0;
    for (int i = 0; i < 4; ++i)
        acc += dispatch((int(tid.x) + i) % 4, outBuf[0] + float(i));
    outBuf[0] = acc;
}
```

**Command:**

```bash
slangc dynamic_dispatch.slang -target spirv -emit-spirv-directly -report-perf-benchmark
```
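
To scale the reproducer to `N` implementations without hand-editing, something like the following Python sketch could emit the source. This is not the `tools/compile-perf` generator, whose exact output may differ; the function name `generate_dynamic_dispatch`, the choice to keep the loop bound at 4, and the `% N` dispatch index are assumptions made for illustration.

```python
import sys


def generate_dynamic_dispatch(n: int) -> str:
    """Return Slang source with n IShape implementations selected at runtime.

    Hypothetical helper that mirrors the hand-written reproducer; it is not
    the generator used by tools/compile-perf.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    lines = [
        "RWStructuredBuffer<float> outBuf;",
        "",
        "[anyValueSize(16)]",
        "interface IShape { float eval(float x); }",
        "",
    ]

    # One struct per implementation: S0, S1, ..., S(n-1).
    for i in range(n):
        k = float(i + 1)  # formats as "1.0", "2.0", ...
        lines.append(
            f"struct S{i} : IShape "
            f"{{ float eval(float x) {{ return x * {k} + sin(x * {k}); }} }}"
        )

    # Runtime switch over all implementations, so the dispatch stays dynamic.
    lines += [
        "",
        "float dispatch(int id, float x)",
        "{",
        "    IShape s;",
        "    switch (id)",
        "    {",
    ]
    for i in range(n):
        lines.append(f"    case {i}: s = S{i}(); break;")
    lines += [
        "    default: s = S0(); break;",
        "    }",
        "    return s.eval(x);",
        "}",
        "",
        '[shader("compute")]',
        "[numthreads(1, 1, 1)]",
        "void computeMain(uint3 tid : SV_DispatchThreadID)",
        "{",
        "    float acc = 0.0;",
        "    for (int i = 0; i < 4; ++i)",
        f"        acc += dispatch((int(tid.x) + i) % {n}, outBuf[0] + float(i));",
        "    outBuf[0] = acc;",
        "}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Usage: python gen.py 200   (writes dynamic_dispatch.slang)
    count = int(sys.argv[1]) if len(sys.argv) > 1 else 4
    with open("dynamic_dispatch.slang", "w") as f:
        f.write(generate_dynamic_dispatch(count))
```

Running the script with `N` set to 50, 100, 200 and 400 and then invoking the `slangc` command above on each output should give a series comparable to the measurements reported under Actual Behavior.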

## Expected Behavior

`specializeModule` scales roughly linearly in `N` (the number of implementations).

## Actual Behavior

`specializeModule` scales quadratically, approaching 4× per doubling of `N`. Measured (Apple Silicon,
RelWithDebInfo, median of 5 runs):

| N | specializeModule |
|---|---|
| 50  | 11 ms  |
| 100 | 32 ms  |
| 200 | 113 ms |
| 400 | 416 ms |

The same generator's `existential_aggregate` workload is affected similarly.
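
As a quick sanity check on the scaling claim, the empirical exponent can be estimated from the table as the log2 of each successive timing ratio (since `N` doubles at every step). The sketch below uses only the four measurements above; the variable names are illustrative.

```python
import math

# (N, specializeModule time in ms), copied from the table above.
measurements = [(50, 11.0), (100, 32.0), (200, 113.0), (400, 416.0)]

# N doubles between rows, so log2(t_next / t_prev) is the local exponent p
# in t ~ N^p. Linear scaling gives p = 1, quadratic gives p = 2.
local_exponents = [
    math.log2(t_next / t_prev)
    for (_, t_prev), (_, t_next) in zip(measurements, measurements[1:])
]

# Exponent across the whole measured range (N = 50 to N = 400).
(n_first, t_first), (n_last, t_last) = measurements[0], measurements[-1]
overall_exponent = math.log(t_last / t_first) / math.log(n_last / n_first)

for (n, _), p in zip(measurements[1:], local_exponents):
    print(f"N={n}: local exponent {p:.2f}")
print(f"overall exponent {overall_exponent:.2f}")
```

The local exponent rises from about 1.5 toward 1.9 as `N` grows, which is what one would expect if a quadratic term increasingly dominates a smaller linear or constant overhead.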

## Test Plan

- `tools/compile-perf` `dynamic_dispatch` (primary timer `specializeModule`) and
  `existential_aggregate` workloads.
- Regression test `tests/language-feature/dynamic-dispatch/many-impls.slang`.

---

A fix is proposed in #11760.
