[HostJit] Windows support #9502

Open
miscco wants to merge 6 commits into NVIDIA:main from miscco:hostjit_windows

@miscco miscco commented Jun 17, 2026

Contributor

This is initial work to support HostJit on Windows.

Mostly, we need to work around some DLL issues with missing symbols.

Some tests still fail and need to be investigated, but this serves as a good start.

@miscco miscco requested review from a team as code owners June 17, 2026 12:55
@miscco miscco requested a review from bernhardmgruber June 17, 2026 12:55
@github-project-automation github-project-automation Bot moved this to Todo in CCCL Jun 17, 2026
@cccl-authenticator-app cccl-authenticator-app Bot moved this from Todo to In Review in CCCL Jun 17, 2026

coderabbitai Bot commented Jun 17, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Windows JIT compilation is addressed by adding an _fltused stub, the -fno-threadsafe-statics flag, a cudaFuncSetAttribute import symbol, and suppression of MSVC warning C4459. Visibility and attribute macros in libcudacxx are refactored from COMPILER(MSVC) gating to OS(WINDOWS)/COMPILER(NVRTC) gating. cstdint gains Windows-specific int_fast*/uint_fast* type aliases. CubCall code generation switches from an EXPORT macro to _CCCL_VISIBILITY_EXPORT.
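
Two of the items in the walkthrough can be illustrated in isolation. The sketch below is not taken from the PR: `DEMO_NO_CRT`, `make_start`, and `next_id` are hypothetical names. It assumes that `_fltused` only has to be defined by hand when the C runtime that normally provides it is not linked, and it shows the kind of function-local static whose thread-safe initialization `-fno-threadsafe-statics` turns off.

```cpp
// Assumption: DEMO_NO_CRT is a hypothetical macro meaning "no C runtime is linked".
// Object files built for the MSVC ABI that use floating point reference the
// symbol _fltused, which the CRT normally provides. Without the CRT the symbol
// has to be defined manually, otherwise the link step reports it as missing.
#if defined(_WIN32) && defined(DEMO_NO_CRT)
extern "C" int _fltused = 0;
#endif

// Hypothetical helper: initializing a static from a function call is the
// pattern that normally requires a guarded, run-once initialization.
inline int make_start()
{
  return 41;
}

// By default the compiler makes the first-call initialization of `counter`
// thread-safe by calling into the C++ runtime. With -fno-threadsafe-statics
// those runtime calls are not emitted, so the JIT-compiled code does not need
// the corresponding runtime symbols, but concurrent first calls may race.
int next_id()
{
  static int counter = make_start();
  return ++counter;
}
```

Calling `next_id()` twice from a single thread yields consecutive values regardless of the flag; the flag only changes how the one-time initialization is protected.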

Changes

Windows JIT compilation and type alias support

Layer: Windows JIT: _fltused stub, compiler flags, and import symbol
File(s): c/parallel.v2/src/hostjit/include/hostjit/cuda_minimal/stubs/stdlib.h, c/parallel.v2/src/hostjit/compiler.cpp, c/parallel.v2/CMakeLists.txt
Summary: stdlib.h declares extern "C" int _fltused = 0 under _WIN32; compiler.cpp appends -fno-threadsafe-statics to the Windows host compiler args and adds cudaFuncSetAttribute to the minimal cudart.lib import symbol list; CMakeLists.txt suppresses MSVC C4459 via /wd4459.

Layer: int_fast*/uint_fast* Windows type aliases and freestanding branch
File(s): libcudacxx/include/cuda/std/cstdint
Summary: Windows-specific underlying types selected for int_fast8/16/32/64_t and uint_fast8/16/32/64_t via _CCCL_OS(WINDOWS); non-hosted preprocessor path changed from #elif _CCCL_COMPILER(NVRTC) to #else with _CCCL_FREESTANDING() label; INT_FAST*/UINT_FAST* min/max macros split into Windows vs non-Windows branches.
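
To see why the fast integer aliases need a per-OS choice, the following sketch reports the sizes the host standard library picks. It uses the host's `<cstdint>` rather than `cuda/std/cstdint`, and `FastSizes` and `fast_sizes` are hypothetical names introduced only for this illustration.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Sizes in bytes of the host's int_fast8/16/32/64_t aliases.
struct FastSizes
{
  std::size_t s8;
  std::size_t s16;
  std::size_t s32;
  std::size_t s64;
};

constexpr FastSizes fast_sizes()
{
  return {sizeof(std::int_fast8_t),
          sizeof(std::int_fast16_t),
          sizeof(std::int_fast32_t),
          sizeof(std::int_fast64_t)};
}

// The standard only guarantees a minimum width; the exact underlying type is
// left to the platform, which is why the sizes can differ between systems.
static_assert(sizeof(std::int_fast16_t) * CHAR_BIT >= 16, "int_fast16_t too narrow");
static_assert(sizeof(std::int_fast32_t) * CHAR_BIT >= 32, "int_fast32_t too narrow");
static_assert(sizeof(std::int_fast64_t) * CHAR_BIT >= 64, "int_fast64_t too narrow");
```

As one example of the divergence, 64-bit glibc defines `int_fast16_t` as an 8-byte `long`, while the MSVC headers use a 4-byte `int`. Code compiled by the JIT and code compiled by the host toolchain presumably have to agree on these sizes whenever such a type crosses the boundary between them.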

OS-aware visibility and attribute macro refactoring

Layer: Visibility and attribute macro refactoring to OS-based conditions
File(s): libcudacxx/include/cuda/std/__cccl/visibility.h, libcudacxx/include/cuda/std/__internal/features.h
Summary: visibility.h includes os.h and switches _CCCL_VISIBILITY_HIDDEN/_DEFAULT/_EXPORT, _CCCL_NOINLINE, and _CCCL_PUBLIC_* macros from COMPILER(MSVC) to OS(WINDOWS) and COMPILER(NVRTC) gating; features.h includes os.h and updates its platform block guard from _CCCL_COMPILER(MSVC) to _CCCL_OS(WINDOWS).

Layer: CubCall code generation: EXPORT to _CCCL_VISIBILITY_EXPORT
File(s): c/parallel.v2/src/hostjit/codegen/cub_call.cpp, c/parallel.v2/src/hostjit/include/hostjit/codegen/cub_call.hpp
Summary: Generated extern "C" function signatures now emit _CCCL_VISIBILITY_EXPORT instead of EXPORT; shared_includes() and multi-CubCall comments updated to reflect that visibility is controlled via _CCCL_VISIBILITY_EXPORT and symbols resolve under global C-linkage for dlsym lookup.
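
The general shape of an OS-gated export macro applied to a C-linkage entry point can be sketched as follows. `DEMO_VISIBILITY_EXPORT` and `demo_add_one` are hypothetical names; this is not the actual definition of `_CCCL_VISIBILITY_EXPORT`, and the non-Windows branch assumes a GCC- or Clang-compatible compiler.

```cpp
// Hypothetical stand-in for an export macro that is selected by target OS
// rather than by compiler, so a non-MSVC compiler targeting Windows still
// gets the DLL export attribute.
#if defined(_WIN32)
#  define DEMO_VISIBILITY_EXPORT __declspec(dllexport)
#else
#  define DEMO_VISIBILITY_EXPORT __attribute__((visibility("default")))
#endif

// extern "C" disables C++ name mangling, so the exported symbol is named
// exactly "demo_add_one" and can be looked up by that string at run time.
extern "C" DEMO_VISIBILITY_EXPORT int demo_add_one(int x)
{
  return x + 1;
}
```

Once such a function is built into a shared library, a loader can resolve it by its plain name with `dlsym` on POSIX systems or `GetProcAddress` on Windows.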

Possibly related PRs

  • NVIDIA/cccl#9487: Refactors visibility macros in the same header and switches alignment helpers to use the updated public API macros.

Suggested reviewers

  • bernhardmgruber
  • fbusato
  • shwina

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: eb88a771-d376-4cb3-b893-2479707055a8

📥 Commits

Reviewing files that changed from the base of the PR and between 6e245c5 and 3399566.

📒 Files selected for processing (4)
  • c/parallel.v2/CMakeLists.txt
  • c/parallel.v2/src/hostjit/compiler.cpp
  • c/parallel.v2/src/hostjit/include/hostjit/cuda_minimal/stubs/stdlib.h
  • libcudacxx/include/cuda/std/cstdint

Comment thread c/parallel.v2/CMakeLists.txt Outdated
Comment thread libcudacxx/include/cuda/std/cstdint

@miscco miscco force-pushed the hostjit_windows branch 2 times, most recently from 35efdfb to dc35eef Compare June 22, 2026 08:37

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 29d027cb-f1de-44fc-b640-3b50620eac8a

📥 Commits

Reviewing files that changed from the base of the PR and between 3399566 and 587ae93.

📒 Files selected for processing (9)
  • c/parallel.v2/CMakeLists.txt
  • c/parallel.v2/src/hostjit/codegen/cub_call.cpp
  • c/parallel.v2/src/hostjit/compiler.cpp
  • c/parallel.v2/src/hostjit/include/hostjit/codegen/cub_call.hpp
  • c/parallel.v2/src/hostjit/include/hostjit/cuda_minimal/stubs/stdlib.h
  • libcudacxx/include/cuda/std/__cccl/attributes.h
  • libcudacxx/include/cuda/std/__cccl/visibility.h
  • libcudacxx/include/cuda/std/__internal/features.h
  • libcudacxx/include/cuda/std/cstdint
✅ Files skipped from review due to trivial changes (3)
  • c/parallel.v2/src/hostjit/include/hostjit/codegen/cub_call.hpp
  • c/parallel.v2/CMakeLists.txt
  • c/parallel.v2/src/hostjit/include/hostjit/cuda_minimal/stubs/stdlib.h
🚧 Files skipped from review as they are similar to previous changes (1)
  • c/parallel.v2/src/hostjit/compiler.cpp

Comment thread libcudacxx/include/cuda/std/__cccl/visibility.h Outdated
@miscco miscco force-pushed the hostjit_windows branch from dc35eef to 8ce3429 Compare June 22, 2026 09:58
@github-actions

Contributor

😬 CI Workflow Results

🟥 Finished in 5h 32m: Pass: 97%/118 | Total: 4d 20h | Max: 4h 19m | Hits: 50%/1119049

See results here.

Labels

None yet

Projects

Status: In Review

2 participants