Draft
49 commits
74fa8d1
plans
ccummingsNV Mar 10, 2026
9bca5ab
first attempt at phase 1
ccummingsNV Mar 10, 2026
40416d6
work on redoing direct_bind logic
ccummingsNV Mar 11, 2026
22a82b7
wip tests to figure out the clear binding info
ccummingsNV Mar 11, 2026
ec3fc1e
start switching to using the gen load+store for everything
ccummingsNV Mar 11, 2026
d8e60b5
Fix binding issues
ccummingsNV Mar 11, 2026
5f30772
Fix some tests
ccummingsNV Mar 11, 2026
f85dd1a
code cleanup
ccummingsNV Mar 11, 2026
0dedb1d
code cleanup
ccummingsNV Mar 11, 2026
bd6690f
more tensor cleanup
ccummingsNV Mar 11, 2026
d603249
extra tests
ccummingsNV Mar 11, 2026
3b8cdc5
better error
ccummingsNV Mar 11, 2026
2fd9437
fix read_output
ccummingsNV Mar 11, 2026
ba7fe5d
Merge remote-tracking branch 'origin/main' into dev/ccummings/kernelgen
ccummingsNV Mar 12, 2026
8249e4f
Updated plan
ccummingsNV Mar 12, 2026
9b32e0e
Reduce type alias use
ccummingsNV Mar 12, 2026
1566d51
Neater tests
ccummingsNV Mar 12, 2026
553795e
gate tests
ccummingsNV Mar 12, 2026
6d87391
First version of reading uniform size
ccummingsNV Mar 12, 2026
86f45fc
wip switching to entry point arguments
ccummingsNV Mar 12, 2026
c941898
wip switching to entry point arguments
ccummingsNV Mar 12, 2026
e589a3e
working reduced entry points
ccummingsNV Mar 12, 2026
4ad532e
Merge remote-tracking branch 'origin/main' into dev/ccummings/kernelgen
ccummingsNV Mar 12, 2026
fadab3c
PR cleanup
ccummingsNV Mar 13, 2026
ef17ff5
More PR fixes
ccummingsNV Mar 13, 2026
1d0c39f
Rename use_direct_args
ccummingsNV Mar 13, 2026
6c255b3
More tests
ccummingsNV Mar 13, 2026
6fdf58b
Don't use ep args on metal for now
ccummingsNV Mar 13, 2026
43a4688
Plan for code gen cleanup
ccummingsNV Mar 13, 2026
828cf2c
wip generator cleanup
ccummingsNV Mar 13, 2026
a48603f
Disable metal tests
ccummingsNV Mar 13, 2026
77b6bf6
Update verification commands in documentation to reflect correct test…
ccummingsNV Mar 13, 2026
dde4423
more extracting
ccummingsNV Mar 13, 2026
ca0a807
Merge branch 'copilot-worktree-2026-03-13T14-20-42' into dev/ccumming…
ccummingsNV Mar 13, 2026
5e24b64
Refactor code generation for call data handling
ccummingsNV Mar 13, 2026
c7126a8
Merge remote-tracking branch 'origin/dev/ccummings/kernelgen' into co…
ccummingsNV Mar 13, 2026
7a671b5
Merge branch 'copilot-worktree-2026-03-13T14-24-02' into dev/ccumming…
ccummingsNV Mar 13, 2026
49630c3
fix calldata
ccummingsNV Mar 13, 2026
457aaa2
work on generator cleanup
ccummingsNV Mar 16, 2026
1117867
More generator cleanup
ccummingsNV Mar 16, 2026
c71f4c2
more generator cleanup
ccummingsNV Mar 16, 2026
fbe7e29
Merge remote-tracking branch 'origin/main' into dev/ccummings/kernelgen
ccummingsNV Mar 16, 2026
4d22479
Cleanup use param block
ccummingsNV Mar 16, 2026
c5c13ce
Safety check
ccummingsNV Mar 16, 2026
9f418e1
wip removing trampoline
ccummingsNV Mar 16, 2026
43cb484
no trampolines
ccummingsNV Mar 17, 2026
30f48fa
Merge branch 'main' into dev/ccummings/kernelgen
ccummingsNV Mar 17, 2026
346ec6c
Remove old excessive tests, clean up proper ones + add a few more
ccummingsNV Mar 17, 2026
747a45e
Merge remote-tracking branch 'origin/main' into dev/ccummings/kernelgen
ccummingsNV Mar 17, 2026
208 changes: 208 additions & 0 deletions .github/prompts/plan-extractCodegenToGenerator.prompt.md
@@ -0,0 +1,208 @@
## Extract codegen into generator.py

**Goal**: Extract the code-emission logic from [callsignature.py](slangpy/core/callsignature.py) (`generate_code`, `generate_constants`, `KernelGenException`, helpers) and `BoundVariable.gen_call_data_code` from [boundvariable.py](slangpy/bindings/boundvariable.py) into a new [generator.py](slangpy/core/generator.py) file. The new file decomposes the monolithic `generate_code` (332 lines) into clearly-named sub-functions with doc comments showing what Slang code each one emits. `callsignature.py` retains the binding-pipeline functions (`specialize`, `bind`, `calculate_*`, etc.). Each step is a pure move/rename with no behavioral changes, verifiable by the existing test suites.

**Parent plan**: [plan-simplifyKernelGenPhase2-cleanup.prompt.md](plan-simplifyKernelGenPhase2-cleanup.prompt.md)

---

### Step 1: Create `slangpy/core/generator.py` with `generate_constants` and `KernelGenException`

Move these small, self-contained pieces first:

- **Move** `KernelGenException` (lines 40–43) from [callsignature.py](slangpy/core/callsignature.py#L40-L43).
- **Move** `is_slangpy_vector` (lines 240–247) from [callsignature.py](slangpy/core/callsignature.py#L240-L247) — private helper, prefix with `_`.
- **Move** `generate_constants` (lines 250–268) from [callsignature.py](slangpy/core/callsignature.py#L250-L268).
- **In [callsignature.py](slangpy/core/callsignature.py)**: Add `from slangpy.core.generator import KernelGenException, generate_constants` and delete the moved code. Keep a re-export of `KernelGenException` so any external consumer of the wildcard import from [calldata.py](slangpy/core/calldata.py#L8) continues to work.
- **In [dispatchdata.py](slangpy/core/dispatchdata.py#L7)**: Change `from slangpy.core.callsignature import generate_constants` → `from slangpy.core.generator import generate_constants`.
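
The re-export in the list above can be sketched with two in-memory stand-in modules. The names `generator_demo` and `callsignature_demo` are hypothetical placeholders for the real files under `slangpy/core/`, and the sketch assumes the wildcard import has no `__all__` restricting it. It only illustrates that a re-exported name is the same object, so a consumer that still reaches `KernelGenException` through the old module sees the same class.

```python
import sys
import types

# Hypothetical stand-in for slangpy/core/generator.py (the new home).
generator = types.ModuleType("generator_demo")
exec(
    "class KernelGenException(Exception):\n"
    "    pass\n",
    generator.__dict__,
)
sys.modules["generator_demo"] = generator

# Hypothetical stand-in for slangpy/core/callsignature.py (the old home),
# which now only re-exports the symbol.
callsignature = types.ModuleType("callsignature_demo")
exec("from generator_demo import KernelGenException\n", callsignature.__dict__)
sys.modules["callsignature_demo"] = callsignature

# A consumer that still uses a wildcard import, as calldata.py does.
consumer: dict = {}
exec("from callsignature_demo import *", consumer)

# The wildcard import picked up the re-export, and it is the same class object.
same_object = consumer["KernelGenException"] is generator.KernelGenException
```

Because the class identity is preserved, `except KernelGenException` clauses keep matching regardless of which module the consumer imported the name from.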

**Verify**: `pytest slangpy/tests/slangpy_tests -v` — all tests pass, no import errors.

**DONE**: Created `slangpy/core/generator.py` with `KernelGenException`, `_is_slangpy_vector`, `generate_constants`. Replaced definitions in `callsignature.py` with re-exports. Updated `dispatchdata.py` import. 4999 passed, 5 pre-existing failures (raytrace d3d12, type conformance cache).

---

### Step 2: Extract `gen_call_data_code` as a free function

Move `BoundVariable.gen_call_data_code` (lines 604–693 of [boundvariable.py](slangpy/bindings/boundvariable.py#L604-L693)) into `generator.py` as a free function, along with the related `gen_calldata_type_name` helper (lines 258–272 of [boundvariable.py](slangpy/bindings/boundvariable.py#L258-L272)).

- **In `generator.py`**: Create two free functions:
  - `gen_calldata_type_name(binding: BoundVariable, cgb: CodeGenBlock, type_name: str) -> None`: same logic, takes `binding` as first arg instead of `self`.
  - `gen_call_data_code(binding: BoundVariable, cg: CodeGen, context: BindContext, depth: int = 0) -> None`: same logic, recursive calls use the free function. References to `self` become `binding`. Internal calls to `self.gen_calldata_type_name(...)` become `gen_calldata_type_name(binding, ...)`. Recursive calls on children become `gen_call_data_code(child, cg, context, depth + 1)`.
- **In [boundvariable.py](slangpy/bindings/boundvariable.py)**: Replace the method bodies with thin delegations:
  ```python
  def gen_calldata_type_name(self, cgb, type_name):
      from slangpy.core.generator import gen_calldata_type_name
      gen_calldata_type_name(self, cgb, type_name)

  def gen_call_data_code(self, cg, context, depth=0):
      from slangpy.core.generator import gen_call_data_code
      gen_call_data_code(self, cg, context, depth)
  ```
  This preserves the existing call interface (`node.gen_call_data_code(cg, context)` in [callsignature.py line 406](slangpy/core/callsignature.py#L406)) and any marshall subclass code that calls `self.gen_calldata_type_name`. The `MAX_INLINE_TYPE_LEN` constant moves to `generator.py`.
- **Move** the import of `CodeGen` and `CodeGenBlock` into `generator.py` (already needed for Step 1).
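
The delegation pattern in this step can be illustrated with a toy tree. `Node` is a hypothetical stand-in for `BoundVariable`, and the emitted `struct` lines are made up for the example. The free function and the method share a name here for simplicity, so the method resolves the free function through the module namespace. The real stubs above use a function-local import instead, which likely also avoids a circular import between `boundvariable.py` and `generator.py`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def gen_call_data_code(binding: Node, out: list[str], depth: int = 0) -> None:
    """Free function: emit one line per node, recursing into children."""
    out.append("    " * depth + f"struct {binding.name};")
    for child in binding.children:
        # Recursion goes through the free function, not the method.
        gen_call_data_code(child, out, depth + 1)


@dataclass
class Node:
    """Toy stand-in for BoundVariable (hypothetical, not the real class)."""

    name: str
    children: list[Node] = field(default_factory=list)

    def gen_call_data_code(self, out: list[str], depth: int = 0) -> None:
        # Thin stub: existing callers keep the method-call interface.
        # Inside a method body the bare name resolves to the module-level
        # free function, because class scope is not searched.
        gen_call_data_code(self, out, depth)


root = Node("a", [Node("b"), Node("c", [Node("d")])])

via_method: list[str] = []
root.gen_call_data_code(via_method)

via_function: list[str] = []
gen_call_data_code(root, via_function)
```

Both call styles fill their list with the same lines, which is the property the stubs are meant to preserve for existing callers.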

**Verify**: `pytest slangpy/tests/slangpy_tests -v` — all tests pass.

**DONE**: Moved `gen_call_data_code` and `gen_calldata_type_name` to `generator.py` as free functions. `MAX_INLINE_TYPE_LEN` moved to `generator.py`, re-exported from `boundvariable.py`. Method bodies replaced with thin delegation stubs. 3294 passed, 285 kernel gen tests passed.

---

### Step 3a: Extract pure-computation helpers in-place in `callsignature.py`

Extract the two helpers that do **no codegen** — pure calculation/validation only:

- **Extract** `_validate_and_compute_group_shape(build_info, call_data_len) -> tuple[int, list[int], list[int]]` from lines [293–340](slangpy/core/callsignature.py#L293-L340). Returns `(call_group_size, call_group_strides, call_group_shape_vector)`.
- **Extract** `_data_name(x, use_entrypoint_args) -> str` — deduplicate the two inline occurrences at lines [449](slangpy/core/callsignature.py#L449) and [497](slangpy/core/callsignature.py#L497) into a single helper. Returns `__in_{name}`, `call_data.{name}`, or `_param_{name}`.

Leave both in `callsignature.py` as module-private functions. `generate_code` calls them.

**Verify**: `pytest slangpy/tests/slangpy_tests -v` — all tests pass.

---

### Step 3b: Extract "setup" emission functions in-place in `callsignature.py`

Extract the three functions that emit the top section of the generated kernel:

- **Extract** `_emit_link_time_constants(cg, build_info, call_data_len, call_group_size, call_group_strides, call_group_shape_vector)` from lines [342–371](slangpy/core/callsignature.py#L342-L371). Emits `export static const int call_data_len = ...`, group stride/shape arrays; calls `generate_constants()`.
- **Extract** `_emit_shape_and_metadata_params(cg, call_data_len, use_entrypoint_args)` from lines [373–403](slangpy/core/callsignature.py#L373-L403). Emits `_grid_stride`, `_grid_dim`, `_call_dim`, `_thread_count` — as entry-point params (fast path) or `CallData` fields (fallback).
- **Extract** `_emit_call_data_definitions(cg, context, signature)` from lines [405–406](slangpy/core/callsignature.py#L405-L406). Emits per-variable call data (wrapper structs, type aliases, mapping constants) by calling `gen_call_data_code` on each node.

Leave all three in `callsignature.py`. `generate_code` calls them.

**Verify**: `pytest slangpy/tests/slangpy_tests -v` — all tests pass. Run `$env:SLANGPY_PRINT_GENERATED_SHADERS="1"; pytest slangpy/tests/slangpy_tests/test_code_gen.py -v` and capture output as the baseline for Step 3c and 3d.

---

### Step 3c: Extract "body" emission functions in-place in `callsignature.py`

Extract the remaining three functions that emit the entry point and kernel body:

- **Extract** `_emit_trampoline(cg, context, build_info, root_params, use_entrypoint_args)` from lines [408–500](slangpy/core/callsignature.py#L408-L500). Emits `[Differentiable] void _trampoline(...)` — param declarations, loads, function call, stores.
- **Extract** `_emit_entry_point_signature(cg, build_info, call_data_len, call_group_size, use_entrypoint_args)` from lines [503–541](slangpy/core/callsignature.py#L503-L541). Emits `[shader("compute")] [numthreads(...)] void compute_main(...)` or `[shader("raygen")] void raygen_main(...)`.
- **Extract** `_emit_kernel_body(cg, context, build_info, root_params, call_data_len, use_entrypoint_args)` from lines [543–603](slangpy/core/callsignature.py#L543-L603). Emits bounds check, `init_thread_local_call_shape_info`, Context construction, trampoline call.

At this point `generate_code` is reduced to the ~30-line orchestrator below. Still in `callsignature.py`.

```python
def generate_code(context, build_info, signature, cg):
    use_entrypoint_args = context.use_entrypoint_args
    cg.add_import("slangpy")
    call_data_len = context.call_dimensionality

    call_group_size, strides, shape = _validate_and_compute_group_shape(build_info, call_data_len)

    cg.add_import(build_info.module.name)
    if use_entrypoint_args:
        cg.skip_call_data = True

    _emit_link_time_constants(cg, build_info, call_data_len, call_group_size, strides, shape)
    _emit_shape_and_metadata_params(cg, call_data_len, use_entrypoint_args)
    _emit_call_data_definitions(cg, context, signature)

    root_params = sorted(signature.values(), key=lambda x: x.param_index)

    _emit_trampoline(cg, context, build_info, root_params, use_entrypoint_args)
    _emit_entry_point_signature(cg, build_info, call_data_len, call_group_size, use_entrypoint_args)
    cg.kernel.begin_block()
    _emit_kernel_body(cg, context, build_info, root_params, call_data_len, use_entrypoint_args)
    cg.kernel.end_block()
```

**Verify**: `pytest slangpy/tests/slangpy_tests -v` — all tests pass. Re-run `$env:SLANGPY_PRINT_GENERATED_SHADERS="1"; pytest slangpy/tests/slangpy_tests/test_code_gen.py -v` and confirm output is byte-identical to the Step 3b baseline.

---

### Step 3d: Move all codegen symbols from `callsignature.py` to `generator.py` and fix imports

Now that everything is neatly decomposed, do the pure mechanical move:

- **Move** all eight `_emit_*`/`_validate_*`/`_data_name` private helpers (two from Step 3a, three from Step 3b, three from Step 3c) and the `generate_code` orchestrator from `callsignature.py` into `generator.py`.
- **In [callsignature.py](slangpy/core/callsignature.py)**: Delete the moved code; add `from slangpy.core.generator import generate_code` re-export so any consumer that imports `generate_code` from `callsignature` continues to work.
- **Update [calldata.py](slangpy/core/calldata.py#L8)**: Replace `from slangpy.core.callsignature import *` with explicit imports — binding-pipeline functions from `callsignature`, and `generate_code`, `KernelGenException` from `generator`. This eliminates the wildcard import, making dependencies explicit.

**Verify**: `pytest slangpy/tests/slangpy_tests -v` — all tests pass. Re-run `$env:SLANGPY_PRINT_GENERATED_SHADERS="1"; pytest slangpy/tests/slangpy_tests/test_code_gen.py -v` — output byte-identical to Step 3b baseline.

---

### Step 4: Clean up `callsignature.py`

After Step 3, `callsignature.py` no longer has any codegen functions. Clean up:

- Remove unused imports that were only needed by codegen (`CodeGen`, `PipelineType`, `AccessType`, `NoneMarshall`, `BoundVariableException` if no longer referenced).
- Remove re-exports of moved symbols once [calldata.py](slangpy/core/calldata.py) uses direct imports from `generator`.
- Add `from slangpy.core.generator import KernelGenException, ResolveException` re-exports **only if** external consumers import them from `callsignature` (check via grep). If only `calldata.py` uses them, the explicit import is sufficient.
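
As a complement to the grep check mentioned above, the following is a minimal sketch that uses Python's standard `ast` module to list which names a source file imports from a given module. The helper name `find_imports_of` and the sample source are hypothetical. The sketch only sees `from X import ...` statements, so attribute access such as `callsignature.KernelGenException` after a plain `import` would still need a text search.

```python
import ast


def find_imports_of(source: str, module: str, names: set[str]) -> list[tuple[int, str]]:
    """Return (line, name) pairs where `source` imports any of `names` from `module`.

    A wildcard import is reported as "*", since it may pull in any of the names.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        # Only absolute `from <module> import ...` statements are considered.
        if isinstance(node, ast.ImportFrom) and node.module == module and node.level == 0:
            for alias in node.names:
                if alias.name == "*" or alias.name in names:
                    hits.append((node.lineno, alias.name))
    return sorted(hits)


# Hypothetical consumer source, used only to exercise the helper.
sample = (
    "from slangpy.core.callsignature import *\n"
    "from slangpy.core.callsignature import bind, KernelGenException\n"
    "from slangpy.core.generator import generate_code\n"
)

hits = find_imports_of(
    sample,
    "slangpy.core.callsignature",
    {"KernelGenException", "generate_code"},
)
```

To scan a checkout, the same helper could be applied to the text of each file found with `pathlib.Path(...).rglob("*.py")`. An empty result for every file outside `calldata.py` would suggest the re-exports are not needed.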

**Verify**: `pytest slangpy/tests/slangpy_tests -v`. `pre-commit run --all-files`.

---

### Step 5: Add comments to `generator.py` sub-functions

Enrich each sub-function's docstring with an example of the Slang code it generates, for both the fast path and fallback path. For example:

```python
def _emit_shape_and_metadata_params(
    cg: CodeGen,
    call_data_len: int,
    use_entrypoint_args: bool,
) -> None:
    """Emit shape arrays and _thread_count.

    Fast path (entry-point params)::

        uniform int[2] _grid_stride
        uniform int[2] _grid_dim
        uniform int[2] _call_dim
        uniform uint3 _thread_count

    Fallback (CallData struct fields)::

        int[2] _grid_stride;
        int[2] _grid_dim;
        int[2] _call_dim;
        uint3 _thread_count;
    """
```

This is documentation-only, no functional changes.

**Verify**: `pre-commit run --all-files` (formatting check).

---

### Verification

At each step:
```bash
cmake --build --preset windows-msvc-debug
pytest slangpy/tests/slangpy_tests -v
pre-commit run --all-files
```

After Step 3b specifically, capture generated shader output as a baseline; re-run after 3c and 3d to confirm byte-identical output:
```powershell
$env:SLANGPY_PRINT_GENERATED_SHADERS="1"; pytest slangpy/tests/slangpy_tests/test_code_gen.py -v
```
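
The byte-identical comparison can be automated with a short script. This is a minimal sketch that assumes the generated shader text from each run has been saved to a file; the file names below are hypothetical, and the demo writes its own temporary files. Raw `pytest -v` output is not a good comparison target on its own because its summary line includes elapsed time, so only the shader text should be captured.

```python
import difflib
import tempfile
from pathlib import Path


def compare_to_baseline(baseline: Path, candidate: Path) -> list[str]:
    """Return [] if the two files are byte-identical, else a unified diff."""
    a_bytes = baseline.read_bytes()
    b_bytes = candidate.read_bytes()
    if a_bytes == b_bytes:
        return []
    # Decode from bytes so line endings are kept exactly as written.
    a = a_bytes.decode("utf-8", errors="replace").splitlines(keepends=True)
    b = b_bytes.decode("utf-8", errors="replace").splitlines(keepends=True)
    diff = list(difflib.unified_diff(a, b, str(baseline), str(candidate)))
    # Bytes differed but the decoded text did not: still report a mismatch.
    return diff or ["files differ only in undecodable bytes\n"]


# Demo with temporary files standing in for captured shader dumps.
tmp = Path(tempfile.mkdtemp())
(tmp / "step3b.slang.txt").write_bytes(b"int a;\nint b;\n")
(tmp / "step3c.slang.txt").write_bytes(b"int a;\nint b;\n")
(tmp / "broken.slang.txt").write_bytes(b"int a;\nint c;\n")

identical = compare_to_baseline(tmp / "step3b.slang.txt", tmp / "step3c.slang.txt")
changed = compare_to_baseline(tmp / "step3b.slang.txt", tmp / "broken.slang.txt")
```

An empty list means the refactor step left the generated code untouched; a non-empty list holds the diff lines to inspect.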

---

### Decisions

- `gen_call_data_code` extracted as free function in `generator.py`; thin delegation stub kept on `BoundVariable` to preserve the method-call interface (`node.gen_call_data_code(cg, context)`) used in `generate_code` and potentially in external/user code.
- `generator.py` lives at `slangpy/core/generator.py` alongside `callsignature.py` and `calldata.py`.
- Wildcard import `from slangpy.core.callsignature import *` in `calldata.py` replaced with explicit imports to make dependencies clear.
- Sub-function names prefixed with `_` (private to the module); only `generate_code`, `generate_constants`, `gen_call_data_code`, `gen_calldata_type_name`, `KernelGenException` are public.

---

### Key Files

| File | Changes |
|------|---------|
| [slangpy/core/generator.py](slangpy/core/generator.py) | **NEW** — `generate_code`, `generate_constants`, `gen_call_data_code`, `gen_calldata_type_name`, `KernelGenException`, private helpers |
| [slangpy/core/callsignature.py](slangpy/core/callsignature.py) | Remove `generate_code`, `generate_constants`, `KernelGenException`, `is_slangpy_vector`; add re-exports from `generator` |
| [slangpy/bindings/boundvariable.py](slangpy/bindings/boundvariable.py) | `gen_call_data_code` and `gen_calldata_type_name` become thin delegation stubs; `MAX_INLINE_TYPE_LEN` moves out |
| [slangpy/core/calldata.py](slangpy/core/calldata.py) | Replace `from slangpy.core.callsignature import *` with explicit imports from `callsignature` and `generator` |
| [slangpy/core/dispatchdata.py](slangpy/core/dispatchdata.py) | Import `generate_constants` from `generator` instead of `callsignature` |