[BUG]: cuda.compile(..., output='ltoir') generates invalid IR when C ABI function calls Numba ABI device function through struct field intrinsic #840

@NaderAlAwar

Describe the bug
When a function compiled with the C ABI via cuda.compile(..., output='ltoir') calls a Numba ABI device function after obtaining a pointer through an @intrinsic GEP on a struct, the generated LLVM IR has a return type mismatch. The top-level C ABI wrapper declares i8* as its return type, but the error-propagation branches for the internal Numba ABI calls emit ret i32 <status_code>, which NVVM rejects with ERROR_INVALID_IR.

This is a regression introduced in 0.27.0 by #717. The fix in #782 (shipped in 0.28.0) addressed some mixed C ABI / Numba ABI cases but does not cover this pattern.
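The mismatch described above can be spotted mechanically in a textual IR dump. The following is a minimal sketch using only the standard library; the helper name find_ret_mismatches and the sample IR are hypothetical and hand-written to illustrate the pattern, not actual compiler output. It assumes the LLVM IR text has already been captured (for example with Numba's NUMBA_DUMP_LLVM=1 setting, assuming it is honored in your setup) and only handles plain `define <type> @name(` headers with single-token return types.

```python
import re

# Matches a simple function header such as: define i8* @name(...) {
# Linkage keywords or attributes between `define` and the type are NOT handled.
DEFINE_RE = re.compile(r'^define\s+(?P<ret>\S+)\s+@"?(?P<name>[\w.$]+)"?\s*\(')
# Matches a return instruction and captures the first token of its type.
RET_RE = re.compile(r"^\s*ret\s+(?P<type>\S+)")


def find_ret_mismatches(ir_text):
    """Return (function, declared_type, ret_type, line_number) for every
    `ret` whose type differs from the enclosing function's declared type."""
    mismatches = []
    current = None  # (name, declared return type) of the function being scanned
    for lineno, line in enumerate(ir_text.splitlines(), start=1):
        header = DEFINE_RE.match(line)
        if header:
            current = (header.group("name"), header.group("ret"))
            continue
        if line.strip() == "}":
            current = None  # end of the function body
            continue
        ret = RET_RE.match(line)
        if current and ret and ret.group("type") != current[1]:
            mismatches.append((current[0], current[1], ret.group("type"), lineno))
    return mismatches


# Hand-written illustration of the reported pattern (hypothetical names).
SAMPLE_IR = """\
define i8* @repro(i8* %state, i64 %val) {
entry:
  %status = call i32 @increment_numba_abi(i8* %state, i64 %val)
  %failed = icmp ne i32 %status, 0
  br i1 %failed, label %error, label %ok
error:
  ret i32 1
ok:
  ret i8* null
}
"""

for name, declared, found, lineno in find_ret_mismatches(SAMPLE_IR):
    print(f"{name}: declared {declared}, but line {lineno} returns {found}")
```

A line-based scan like this is only a triage aid; LLVM's own verifier remains the authoritative check.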

Steps/Code to reproduce bug

from numba import cuda, types
from numba.core.datamodel.registry import default_manager
from numba.core.extending import intrinsic, register_model, models
from llvmlite import ir

# Custom Numba type representing a two-field struct.
class PairType(types.Type):
    def __init__(self):
        super().__init__(name="Pair")

pair_type = PairType()

# Data model: lay Pair out as a struct of two int64 members, "a" and "b".
@register_model(PairType)
class PairModel(models.StructModel):
    def __init__(self, dmm, fe_type):
        members = [("a", types.int64), ("b", types.int64)]
        super().__init__(dmm, fe_type, members)

# Intrinsic that returns a pointer to field "a" via a GEP on the struct pointer.
@intrinsic
def get_field_a(typingctx, struct_ptr):
    def codegen(context, builder, sig, args):
        return builder.gep(
            args[0],
            [ir.Constant(ir.IntType(32), 0), ir.Constant(ir.IntType(32), 0)],
        )
    model = default_manager.lookup(struct_ptr.dtype)
    return types.CPointer(model._members[0])(struct_ptr), codegen

# Numba ABI device function called from the C ABI entry point.
@cuda.jit(device=True)
def increment(ptr, val):
    ptr[0] = ptr[0] + val

# Top-level function compiled with the C ABI.
def top_level(state, val):
    p = get_field_a(state)
    increment(p, val)

sig = (types.CPointer(pair_type), types.int64)
# NVVM rejects the generated IR here with ERROR_INVALID_IR.
cuda.compile(top_level, sig, abi_info={"abi_name": "repro"}, output="ltoir")

Expected behavior
The code compiles successfully, as it does on numba-cuda 0.26.0.

Environment details:

  • Environment location: Bare-metal
  • Method of numba-cuda install: pip

This works with 0.26.0 but fails with 0.27.0 and later, including 0.28.0.
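To confirm which side of the regression an environment is on, the installed version can be compared against the first failing release named in this report. This is a minimal sketch using only the standard library; the helper names are hypothetical, the distribution name "numba-cuda" is assumed to match the pip package, and the check says nothing about whether releases newer than those mentioned here still fail.

```python
from importlib.metadata import PackageNotFoundError, version

# First release reported as failing in this issue.
FIRST_FAILING = (0, 27, 0)


def installed_version(dist_name="numba-cuda"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def release_tuple(version_string):
    """Parse the leading numeric parts of 'X.Y.Z...' into a 3-tuple of ints."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def in_reported_range(version_string):
    """True if the version is at or after the first failing release."""
    return release_tuple(version_string) >= FIRST_FAILING


found = installed_version()
if found is None:
    print("numba-cuda is not installed")
else:
    status = "in" if in_reported_range(found) else "before"
    print(f"numba-cuda {found} is {status} the reported failing range")
```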

Labels: bug