Using 2D grid in cudaKronprod for state initialization #4130

Merged
sacpis merged 4 commits into NVIDIA:main from sacpis:0.14_qvector_initialization_segfault
Mar 11, 2026

sacpis (Collaborator) commented Mar 10, 2026

Using 2D grid in cudaKronprod to parallelize over both state dimensions.

Fixes QA bug 5961696.

The bug is in kronprod, which launches cudaKronprod with a 1D grid. The kernel uses blockIdx.y / gridDim.y to iterate over the user state, whose size is the tsize2 parameter (for n = 30 qubits, that is about 1 billion elements). With gridDim.y = 1, only the first block did any work: it went over all 2^n elements sequentially, while the other blocks just launched and exited.

With the 2D grid, the launch uses 65535 blocks, each handling 4 chunks in parallel.
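The difference between the two launch shapes can be illustrated with a small host-side model. This is only a sketch, not the cudaKronprod source: it assumes the kernel uses nested grid-stride loops, with the x dimension striding over the first operand (called tsize1 here, a hypothetical name) and the y dimension striding over tsize2 as described above. The function simulate_launch and the toy sizes are made up for illustration.

```python
from itertools import product


def simulate_launch(grid, block, tsize1, tsize2):
    """Model nested grid-stride loops (an assumed kernel shape).

    Returns {(global_x, global_y): elements written} for each thread
    that has at least one element to process.
    """
    stride_x = grid[0] * block[0]  # blockDim.x * gridDim.x
    stride_y = grid[1] * block[1]  # blockDim.y * gridDim.y
    work = {}
    for gx, gy in product(range(stride_x), range(stride_y)):
        # Iterations of the x loop times iterations of the y loop
        # for the thread whose global index is (gx, gy).
        count = len(range(gx, tsize1, stride_x)) * len(range(gy, tsize2, stride_y))
        if count:
            work[(gx, gy)] = count
    return work


# Toy sizes: one extra qubit (2 amplitudes) and a 1024-element user state.
TSIZE1, TSIZE2 = 2, 1024

# 1D launch: gridDim.y == 1 and blockDim.y == 1, so the y stride is 1.
one_d = simulate_launch(grid=(4, 1), block=(8, 1), tsize1=TSIZE1, tsize2=TSIZE2)

# 2D launch: the y dimension of the grid now covers the user state.
two_d = simulate_launch(grid=(1, 16), block=(2, 16), tsize1=TSIZE1, tsize2=TSIZE2)

for name, work in (("1D grid", one_d), ("2D grid", two_d)):
    print(name, "busy threads:", len(work),
          "longest sequential run:", max(work.values()))
```

In this toy model the 1D launch leaves only 2 threads with work, each walking all 1024 elements of the user state one after another, while the 2D launch spreads the same 2048 writes over 512 threads with 4 elements each. The real launch parameters are chosen in kronprod and may differ from the ones used here.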

The code below was used for testing:

import cudaq
import numpy as np

cudaq.set_target("nvidia")

@cudaq.kernel
def kernel(vec : cudaq.State):
    p = cudaq.qubit()
    q = cudaq.qvector(vec)

    mz(p)
    mz(q)

n = 30
v = np.zeros(2**n, dtype=cudaq.complex())
v[-1] = 1.
state = cudaq.State.from_data(v)

print(cudaq.sample(kernel, state))

With this change, the kernel above executes within 1 minute.

Signed-off-by: Sachin Pisal <spisal@nvidia.com>
sacpis changed the title from "Using 2D grid in cudaKronprod" to "Using 2D grid in cudaKronprod for State vector initialization" Mar 10, 2026
github-actions bot pushed a commit that referenced this pull request Mar 10, 2026

CUDA Quantum Docs Bot: A preview of the documentation can be found here.

