[BUG]: cuda.compile(..., output='ltoir') generates invalid IR when C ABI function calls Numba ABI device function through struct field intrinsic #840
Description
Describe the bug
When a function compiled with C ABI via cuda.compile(..., output='ltoir') calls a Numba ABI device function after obtaining a pointer through a @intrinsic GEP on a struct, the generated LLVM IR has a return type mismatch. The top-level C ABI wrapper declares i8* as its return type, but the error-propagation branches for the internal Numba ABI calls emit ret i32 <status_code>, which NVVM rejects with ERROR_INVALID_IR.
This is a regression introduced in 0.27.0 by #717. The fix in #782 (shipped in 0.28.0) addressed some mixed C ABI / Numba ABI cases but does not cover this pattern.
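For anyone inspecting dumped IR while triaging, the mismatch described above can be spotted with a rough text scan. This is only a sketch: `SAMPLE_IR` is a hand-written illustration of the pattern, not actual numba-cuda output, and `find_ret_mismatches` is a hypothetical helper, not part of numba-cuda or llvmlite. It assumes the return type follows `define` directly and contains no spaces, so aggregate return types and linkage attributes are not handled.

```python
import re

# Hand-written illustration of the mismatch; NOT actual numba-cuda output.
# The name "repro" is borrowed from the abi_name used in the reproducer.
SAMPLE_IR = """
define i8* @repro(i8* %state, i64 %val) {
entry:
  %status = call i32 @increment(i8* %state, i64 %val)
  %failed = icmp ne i32 %status, 0
  br i1 %failed, label %error, label %ok
error:
  ret i32 %status
ok:
  ret i8* null
}
"""

# Simplifying assumption: "define <type> @name(" with a space-free type.
DEFINE_RE = re.compile(r"^define\s+(?P<ret>\S+)\s+@(?P<name>[\w.$]+)\(")
RET_RE = re.compile(r"^\s*ret\s+(?P<type>\S+)")


def find_ret_mismatches(ir_text):
    """Return (function, declared_type, returned_type, line_no) tuples."""
    mismatches = []
    current = None  # (name, declared return type) of the enclosing define
    for line_no, line in enumerate(ir_text.splitlines(), start=1):
        m = DEFINE_RE.match(line)
        if m:
            current = (m.group("name"), m.group("ret"))
            continue
        if line.startswith("}"):
            current = None  # left the function body
            continue
        m = RET_RE.match(line)
        if m and current and m.group("type") != current[1]:
            mismatches.append(
                (current[0], current[1], m.group("type"), line_no)
            )
    return mismatches


for name, declared, returned, line_no in find_ret_mismatches(SAMPLE_IR):
    print(f"@{name}: declared {declared}, but line {line_no} returns {returned}")
```

On the sample, the scan flags the `ret i32` in the error branch of a function declared to return `i8*`, which is the shape of IR that NVVM rejects here.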
Steps/Code to reproduce bug
from numba import cuda, types
from numba.core.extending import intrinsic, register_model, models
from llvmlite import ir


class PairType(types.Type):
    def __init__(self):
        super().__init__(name="Pair")


pair_type = PairType()


@register_model(PairType)
class PairModel(models.StructModel):
    def __init__(self, dmm, fe_type):
        members = [("a", types.int64), ("b", types.int64)]
        super().__init__(dmm, fe_type, members)


@intrinsic
def get_field_a(typingctx, struct_ptr):
    def codegen(context, builder, sig, args):
        return builder.gep(
            args[0],
            [ir.Constant(ir.IntType(32), 0), ir.Constant(ir.IntType(32), 0)],
        )

    from numba.core.datamodel.registry import default_manager

    model = default_manager.lookup(struct_ptr.dtype)
    return types.CPointer(model._members[0])(struct_ptr), codegen


@cuda.jit(device=True)
def increment(ptr, val):
    ptr[0] = ptr[0] + val


def top_level(state, val):
    p = get_field_a(state)
    increment(p, val)


sig = (types.CPointer(pair_type), types.int64)
cuda.compile(top_level, sig, abi_info={"abi_name": "repro"}, output="ltoir")

Expected behavior
The code compiles successfully, as it does on numba-cuda 0.26.0.
Environment details (please complete the following information):
- Environment location: Bare-metal
- Method of numba-cuda install: pip
The reproducer compiles with numba-cuda 0.26.0 but fails on 0.27.0 and later, including 0.28.0.