fix(supraseal): add Turing (sm_75) back to CUDA architectures #1038

Merged
snadrus merged 1 commit into main from fix/supraseal-turing-arch
Feb 19, 2026

Reiers commented Feb 19, 2026

Problem

Building Curio on machines with Turing GPUs (RTX 2060-2080, GTX 1650/1660, Tesla T4) produces a binary that segfaults immediately, even on ./curio --version.

Root Cause

When CUDA 13 support was added to extern/supraseal/build.sh, Volta (sm_70) was correctly removed (CUDA 13 dropped it). However, Turing (sm_75) was also removed even though CUDA 13 still fully supports it:

Minimum supported GPU with CUDA 13:
  Turing (2018+): GTX 1650, 1660, RTX 2060-2080, Tesla T4
  And everything newer (Ampere, Ada Lovelace, Hopper, Blackwell)

The CUDA_ARCH flags only included sm_80+:

CUDA_ARCH="-arch=sm_80 -gencode arch=compute_80,code=sm_80 ..."

The CUDA kernels in pc2.cu are compiled into libsupraseal.a and linked into the curio binary. When the CUDA runtime initializes at binary load time and finds no compatible kernel for sm_75, supraseal's native code crashes, so the process segfaults before main() even runs.

Binaries built on Ampere+ machines work fine when copied to Turing machines because the CUDA kernels match the build machine's GPU, and supraseal tasks aren't invoked at startup.
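The mismatch described above can be checked by comparing a GPU's compute capability against the architectures named in the flag string. The following is a minimal sketch, not code from Curio or supraseal: arch_covers is a hypothetical helper, and the two flag strings are shortened versions of the CUDA_ARCH values quoted in this PR. It only looks for native code=sm_XX entries and ignores PTX targets.

```shell
#!/bin/sh
# Hypothetical helper (not part of Curio or supraseal): succeeds when the
# nvcc flag string in $2 contains native code for compute capability $1.
arch_covers() {
  cap="$1"      # compute capability, e.g. "7.5" for Turing
  flags="$2"    # contents of CUDA_ARCH
  sm="sm_$(printf '%s' "$cap" | tr -d '.')"   # "7.5" -> "sm_75"
  case " $flags " in
    *"code=$sm "*) return 0 ;;   # trailing space keeps sm_8 from matching sm_80
    *) return 1 ;;
  esac
}

# Shortened illustrations of the flags before and after this PR.
OLD_ARCH="-arch=sm_80 -gencode arch=compute_80,code=sm_80"
NEW_ARCH="-arch=sm_75 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80"

arch_covers 7.5 "$OLD_ARCH" || echo "old flags: no sm_75 kernels"
arch_covers 7.5 "$NEW_ARCH" && echo "new flags: sm_75 covered"
```

On a machine with a recent NVIDIA driver, the capability argument could be taken from nvidia-smi --query-gpu=compute_cap --format=csv,noheader instead of being hard-coded.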

Fix

-CUDA_ARCH="-arch=sm_80 -gencode arch=compute_80,code=sm_80 ..."
+CUDA_ARCH="-arch=sm_75 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 ..."
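One way to make an omission like this easier to spot is to generate the flag string from an explicit list of architectures. This is only a sketch under assumptions: gencode_flags is a hypothetical function, not something extern/supraseal/build.sh is known to contain, and the list below covers just the two architectures visible in the diff.

```shell
#!/bin/sh
# Hypothetical helper: build an nvcc flag string from a list of SM versions.
# The first argument is the lowest architecture and is also used for -arch.
gencode_flags() {
  out="-arch=sm_$1"
  for sm in "$@"; do
    out="$out -gencode arch=compute_${sm},code=sm_${sm}"
  done
  printf '%s\n' "$out"
}

# Only the architectures shown in the diff; the real list is longer.
CUDA_ARCH="$(gencode_flags 75 80)"
echo "$CUDA_ARCH"
```

With the supported architectures listed in one place, dropping or restoring one (such as 75) becomes a one-token change that is easy to review.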

Affected Hardware

Any Turing GPU building locally with CUDA 13:

  • GeForce: GTX 1650, 1660 (Ti/Super), RTX 2060/2070/2080 (Ti/Super)
  • Data center: Tesla T4

Workaround (until this merges)

DISABLE_SUPRASEAL=1 make build

Or copy a binary built on an Ampere+ machine.

When CUDA 13 support was added, Volta (sm_70) was correctly removed as
CUDA 13 dropped it. However, Turing (sm_75) was also removed even though
CUDA 13 still supports it.

This causes a segfault on `./curio --version` when built on machines with
Turing GPUs (GTX 1650/1660, RTX 2060-2080, Tesla T4). The CUDA kernels
in pc2.cu are compiled only for sm_80+, and the CUDA runtime crashes
during binary initialization when no compatible kernel is found for sm_75.

Binaries built on Ampere+ machines work fine because the embedded kernels
match. But building locally on a Turing machine produces a broken binary.

Adds -gencode arch=compute_75,code=sm_75 back to CUDA_ARCH.

Reiers requested a review from a team as a code owner February 19, 2026 20:29

snadrus commented Feb 19, 2026

great catch!

snadrus commented Feb 19, 2026

My mistake: the AI says no sm_75 even though the docs say it'll work.

snadrus merged commit 404589e into main Feb 19, 2026
22 checks passed
@snadrus snadrus deleted the fix/supraseal-turing-arch branch February 19, 2026 21:55