[STF] Extract standalone __places project from __stf/places #8189

Merged
andralex merged 36 commits into NVIDIA:main from caugonnet:stf_standalone_places_v2 on Mar 30, 2026

@caugonnet
Contributor

Move core place-concept headers (data_place, exec_place, stream_pool, green context, CUDA stream exec place, place_partition) from cudax/__stf/places/ into a new cudax/__places/ directory. Non-core files (tiled_partition, blocked_partition, cyclic_shape, callback_queues) remain in __stf/places/.

All consumers are updated to include from the new __places/ paths directly (no forwarding headers).

Add build/test infrastructure for __places:

  • cudax_ENABLE_PLACES CMake option
  • cudaxPlacesConfigureTarget.cmake for target configuration
  • Header compilation tests in cudaxHeaderTesting.cmake
  • UNITTESTED_FILE support via places_header_unittest.in.cu
  • Test directory with smoke test and header unit tests
  • CMakePresets.json updated to enable places

Remove unused occupancy.cuh include from __places/places.cuh.

Made-with: Cursor

Description

closes

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@caugonnet caugonnet self-assigned this Mar 26, 2026
@caugonnet caugonnet added the stf (Sequential Task Flow programming model) and places labels Mar 26, 2026
@github-project-automation github-project-automation bot moved this to Todo in CCCL Mar 26, 2026
@copy-pr-bot
Contributor

copy-pr-bot bot commented Mar 26, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Mar 26, 2026
@caugonnet
Contributor Author

/ok to test 21a427b

#include <cuda/experimental/__places/data_place_interface.cuh>
#include <cuda/experimental/__places/exec/green_ctx_view.cuh>
#include <cuda/experimental/__places/places.cuh>
#include <cuda/experimental/__stf/utility/hash.cuh>
Contributor

I see code in __places still includes code in __stf. Is that an intermediary state of affairs and is the long-term plan to excise that dependency? Far as I can tell right now __places and __stf depend on each other.

Contributor Author

We need to decide where to move such utilities ...

@andralex
Contributor

/ok to test 9560eaf

@andralex
Contributor

/ok to test 2570296

…aces/places.cuh

No types or functions from this header are used in places.cuh; the
include was a leftover from before the deferred implementation was
extracted to interpreted_execution_policy_impl.cuh.

Made-with: Cursor
@andralex andralex force-pushed the stf_standalone_places_v2 branch from 4c45bcc to a12219d March 26, 2026 22:08
@andralex
Contributor

/ok to test a12219d

Use cuda::experimental::scope_exit instead of the STF-specific
SCOPE(exit) macro. This removes __places' dependency on
scope_guard.cuh (and transitively on unittest.cuh, traits.cuh,
core.cuh), reducing the coupling of __places to __stf.

Made-with: Cursor
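
For readers unfamiliar with the scope-exit idiom referenced in the commit above, here is a minimal sketch of the general technique in plain C++. It is not the CCCL implementation: `scope_exit_sketch` and `run_with_guard` are hypothetical names invented for illustration, and the real `cuda::experimental::scope_exit` may differ in interface and details.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a scope-exit guard (assumption: not the CCCL type).
// It stores a callable and invokes it when the guard goes out of scope.
template <class F>
class scope_exit_sketch
{
public:
  explicit scope_exit_sketch(F f)
      : fn_(std::move(f))
  {}

  // Non-copyable so the cleanup cannot run twice.
  scope_exit_sketch(const scope_exit_sketch&)            = delete;
  scope_exit_sketch& operator=(const scope_exit_sketch&) = delete;

  ~scope_exit_sketch()
  {
    fn_(); // cleanup runs on every path that leaves the scope
  }

private:
  F fn_;
};

// Records the order of events so the behavior can be observed.
inline std::string run_with_guard()
{
  std::string log;
  {
    auto cleanup = [&log] { log += "cleanup;"; };
    scope_exit_sketch<decltype(cleanup)> guard{cleanup};
    log += "body;";
    assert(log == "body;"); // the cleanup has not run yet
  } // guard is destroyed here, so the cleanup runs now
  return log;
}
```

A guard object of this kind needs no macro and no test or traits headers, which matches the motivation stated in the commit message: a small library type can replace a macro that drags in unrelated dependencies.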
@andralex
Contributor

/ok to test 314272f

The transitive include was broken when interpreted_execution_policy.cuh
was removed from places.cuh.

Made-with: Cursor
@andralex
Contributor

/ok to test 5bbf017

@caugonnet
Contributor Author

/ok to test 3d1cce4

Provides a single include entry point (cuda/experimental/places.cuh)
for the standalone places API, so external consumers don't need to
reference internal __places/ paths or pull in the full stf.cuh.

Made-with: Cursor

@andralex andralex enabled auto-merge (squash) March 27, 2026 19:13
@cccl-authenticator-app cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Mar 27, 2026
Include these two headers explicitly for discoverability, even though
they are already pulled in transitively.

Made-with: Cursor

@caugonnet
Contributor Author

/ok to test 8a7cfe0

@caugonnet
Contributor Author

/ok to test 006b00d

Wire data_place_composite::allocate/deallocate to a VMM-backed
localized_array implementation instead of throwing. Add a Thrust
device_vector example that uses data_place as a memory resource,
demonstrating transparent single-device and multi-device (composite)
placement.

Made-with: Cursor
Partitions are a places concept (they define how data maps onto a grid
of places), not an STF task-graph concept. Move blocked_partition,
tiled_partition, and cyclic_shape into __places/partitions/ and update
all include paths and unittested header registrations accordingly.

Made-with: Cursor
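
As an illustration of what "how data maps onto a grid of places" can mean, the sketch below contrasts a blocked and a cyclic mapping of a one-dimensional index space. The helpers `blocked_owner` and `cyclic_owner` are hypothetical and are not CCCL APIs; the chunk-size rule (ceiling division) is an assumption chosen for the example and is not taken from the moved headers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration only (assumption: not the CCCL API).

// Blocked mapping: each place owns one contiguous chunk of
// ceil(n / nplaces) elements; the last chunk may be shorter.
inline std::size_t blocked_owner(std::size_t i, std::size_t n, std::size_t nplaces)
{
  assert(nplaces > 0 && i < n);
  const std::size_t chunk = (n + nplaces - 1) / nplaces;
  return i / chunk;
}

// Cyclic mapping: elements are dealt to places round-robin.
inline std::size_t cyclic_owner(std::size_t i, std::size_t nplaces)
{
  assert(nplaces > 0);
  return i % nplaces;
}
```

With 10 elements and 4 places, the blocked sketch assigns indices 0 to 2 to place 0 and index 9 to place 3, while the cyclic sketch assigns index 5 to place 1. Neither function refers to tasks or a task graph, which is consistent with the commit's argument that partitions belong with places.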
@caugonnet
Contributor Author

/ok to test 09fc48f

@caugonnet
Contributor Author

/ok to test feab5e7

caugonnet and others added 3 commits March 30, 2026 10:46
The test incorrectly assumed blocked_partition maps to a multi-dimensional
grid, but it only partitions along a single dimension (the highest-rank
one) and maps to grid_dims.x places. The test was never registered in
stf_unittested_headers so the bug went unnoticed until this branch.

Made-with: Cursor
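
The behavior described in the commit above, splitting only one dimension of a multi-dimensional shape across a one-dimensional set of places, can be sketched as follows. This is an illustration under stated assumptions and not the blocked_partition implementation: `index_range` and `blocked_sub_box` are hypothetical names, "highest-rank dimension" is taken to mean the last dimension index, and the ceiling-division chunk size is assumed.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Half-open index range [begin, end). Hypothetical type for illustration.
struct index_range
{
  std::size_t begin;
  std::size_t end;
};

// Returns the sub-box owned by `place` when only the last dimension of
// `extents` is split into `nplaces` contiguous blocks. All other dimensions
// are kept whole. Assumption: not the CCCL API or its exact rounding rule.
template <std::size_t Rank>
std::array<index_range, Rank>
blocked_sub_box(const std::array<std::size_t, Rank>& extents, std::size_t place, std::size_t nplaces)
{
  static_assert(Rank > 0, "shape must have at least one dimension");
  assert(nplaces > 0 && place < nplaces);

  std::array<index_range, Rank> box{};
  for (std::size_t d = 0; d < Rank; ++d)
  {
    box[d] = index_range{0, extents[d]}; // untouched dimensions span everything
  }

  const std::size_t n     = extents[Rank - 1];
  const std::size_t chunk = (n + nplaces - 1) / nplaces;
  box[Rank - 1].begin     = std::min(place * chunk, n);
  box[Rank - 1].end       = std::min((place + 1) * chunk, n);
  return box;
}
```

For an 8 x 10 shape and 4 places, this sketch gives place 1 the box [0, 8) x [3, 6). A test that instead expected the first dimension to be split as well would fail, which is the kind of mismatch the commit message describes.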
- places.cuh: already registered in places_unittested_headers
- logical_data.cuh, parallel_for_scope.cuh: no UNITTEST blocks

Made-with: Cursor
@caugonnet
Contributor Author

/ok to test 984cd43

@caugonnet
Contributor Author

/ok to test 51917d1

@github-actions
Contributor

🥳 CI Workflow Results

🟩 Finished in 5h 56m: Pass: 100%/445 | Total: 5d 15h | Max: 1h 23m | Hits: 98%/514771

See results here.

@andralex andralex merged commit ff4b773 into NVIDIA:main Mar 30, 2026
464 of 465 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in CCCL Mar 30, 2026

Labels

places, stf (Sequential Task Flow programming model)

4 participants