Skip to content

[clang][OpenMP][SPIR-V] Fix AS of globals and set the default AS to 4 #135251

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

sarnex
Copy link
Member

@sarnex sarnex commented Apr 10, 2025

Based on feedback from #134399, we use the address space map that sets the default AS to 4 for OpenMP SPIR-V offload.

The new AS map had the wrong target AS for opencl_global and opencl_constant, which is what we end up using to get the AS during OpenMP target codegen, so I updated it to match the old default AS 0 map.

After this PR, I will work on simplifying the condition of the old default AS 0 map to eventually only be OCL with no generic addrspace, but there are many failures so I wanted to do it step by step, and this is the first one.

There are relatively minor changes to OpenMP codegen, mostly just addrspacecasts (because globals are AS 1 in SPIR-V, so we need to cast to AS 4/no AS somewhat often) or use the correct address space to create a global.

Copy link

github-actions bot commented Apr 10, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@sarnex sarnex changed the title [clang][OpenMP][SPIR-V] Fix addrspace of globals [clang][OpenMP][SPIR-V] Fix addrspace of globals and set the default AS to 4 Apr 10, 2025
@sarnex sarnex changed the title [clang][OpenMP][SPIR-V] Fix addrspace of globals and set the default AS to 4 [clang][OpenMP][SPIR-V] Fix AS of globals and set the default AS to 4 Apr 10, 2025
@sarnex sarnex marked this pull request as ready for review April 11, 2025 16:47
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. flang:openmp clang:openmp OpenMP related changes to Clang labels Apr 11, 2025
@llvmbot
Copy link
Member

llvmbot commented Apr 11, 2025

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-flang-openmp

@llvm/pr-subscribers-clang

Author: Nick Sarnie (sarnex)

Changes

Based on feedback from #134399, we use the address space map that sets the default AS to 4 for OpenMP SPIR-V offload.

The new AS map had the wrong target AS for opencl_global and opencl_constant, which is what we end up using to get the AS during OpenMP target codegen, so I updated it to match the old default AS 0 map.

After this PR, I will work on simplifying the condition of the old default AS 0 map to eventually only be OCL with no generic addrspace, but there are many failures so I wanted to do it step by step, and this is the first one.

There are relatively minor changes to OpenMP codegen, mostly just addrspacecasts (because globals are AS 1 in SPIR-V, so we need to cast to AS 4/no AS somewhat often) or use the correct address space to create a global.


Full diff: https://github.com/llvm/llvm-project/pull/135251.diff

8 Files Affected:

  • (modified) clang/lib/Basic/Targets/SPIR.h (+6-4)
  • (modified) clang/lib/CodeGen/CGOpenMPRuntime.cpp (+22-6)
  • (modified) clang/lib/CodeGen/CGOpenMPRuntime.h (+5)
  • (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (+12-9)
  • (modified) clang/lib/CodeGen/CGStmtOpenMP.cpp (+2)
  • (added) clang/test/OpenMP/spirv_target_addrspace.c (+21)
  • (added) clang/test/OpenMP/spirv_target_addrspace_simd.c (+23)
  • (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+11-2)
diff --git a/clang/lib/Basic/Targets/SPIR.h b/clang/lib/Basic/Targets/SPIR.h
index 5ea727364d24b..0f4f74ac95749 100644
--- a/clang/lib/Basic/Targets/SPIR.h
+++ b/clang/lib/Basic/Targets/SPIR.h
@@ -57,10 +57,11 @@ static const unsigned SPIRDefIsPrivMap[] = {
 // Used by both the SPIR and SPIR-V targets.
 static const unsigned SPIRDefIsGenMap[] = {
     4, // Default
-    // OpenCL address space values for this map are dummy and they can't be used
-    0, // opencl_global
+    // Some OpenCL address space values for this map are dummy and they can't be
+    // used
+    1, // opencl_global
     0, // opencl_local
-    0, // opencl_constant
+    2, // opencl_constant
     0, // opencl_private
     0, // opencl_generic
     0, // opencl_global_device
@@ -216,7 +217,8 @@ class LLVM_LIBRARY_VISIBILITY BaseSPIRTargetInfo : public TargetInfo {
         /*DefaultIsGeneric=*/Opts.SYCLIsDevice ||
         // The address mapping from HIP/CUDA language for device code is only
         // defined for SPIR-V.
-        (getTriple().isSPIRV() && Opts.CUDAIsDevice));
+        (getTriple().isSPIRV() &&
+         (Opts.CUDAIsDevice || Opts.OpenMPIsTargetDevice)));
   }
 
   void setSupportedOpenCLOpts() override {
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 5736864d4cc6b..5780f1ded3259 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -2524,6 +2524,16 @@ void CGOpenMPRuntime::emitForDispatchInit(
                       Args);
 }
 
+llvm::Value *CGOpenMPRuntime::createRuntimeFunctionArgAddrSpaceCast(
+    CodeGenFunction &CGF, llvm::FunctionCallee RuntimeFcn, size_t ArgIdx,
+    llvm::Value *Arg) {
+  llvm::Type *ParamTy = RuntimeFcn.getFunctionType()->getParamType(ArgIdx);
+  llvm::Type *ArgTy = Arg->getType();
+  if (!ParamTy->isPointerTy())
+    return Arg;
+  return CGF.Builder.CreateAddrSpaceCast(Arg, ParamTy);
+}
+
 void CGOpenMPRuntime::emitForDispatchDeinit(CodeGenFunction &CGF,
                                             SourceLocation Loc) {
   if (!CGF.HaveInsertPoint())
@@ -2572,12 +2582,18 @@ static void emitForStaticInitCall(
       ThreadId,
       CGF.Builder.getInt32(addMonoNonMonoModifier(CGF.CGM, Schedule, M1,
                                                   M2)), // Schedule type
-      Values.IL.emitRawPointer(CGF),                    // &isLastIter
-      Values.LB.emitRawPointer(CGF),                    // &LB
-      Values.UB.emitRawPointer(CGF),                    // &UB
-      Values.ST.emitRawPointer(CGF),                    // &Stride
-      CGF.Builder.getIntN(Values.IVSize, 1),            // Incr
-      Chunk                                             // Chunk
+      CGOpenMPRuntime::createRuntimeFunctionArgAddrSpaceCast(
+          CGF, ForStaticInitFunction, 3,
+          Values.IL.emitRawPointer(CGF)), // &isLastIter
+      CGOpenMPRuntime::createRuntimeFunctionArgAddrSpaceCast(
+          CGF, ForStaticInitFunction, 4, Values.LB.emitRawPointer(CGF)), // &LB
+      CGOpenMPRuntime::createRuntimeFunctionArgAddrSpaceCast(
+          CGF, ForStaticInitFunction, 5, Values.UB.emitRawPointer(CGF)), // &UB
+      CGOpenMPRuntime::createRuntimeFunctionArgAddrSpaceCast(
+          CGF, ForStaticInitFunction, 6,
+          Values.ST.emitRawPointer(CGF)),    // &Stride
+      CGF.Builder.getIntN(Values.IVSize, 1), // Incr
+      Chunk                                  // Chunk
   };
   CGF.EmitRuntimeCall(ForStaticInitFunction, Args);
 }
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.h b/clang/lib/CodeGen/CGOpenMPRuntime.h
index 4321712e1521d..c918c77b4266c 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.h
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.h
@@ -1657,6 +1657,11 @@ class CGOpenMPRuntime {
 
   /// Returns true if the variable is a local variable in untied task.
   bool isLocalVarInUntiedTask(CodeGenFunction &CGF, const VarDecl *VD) const;
+
+  static llvm::Value *
+  createRuntimeFunctionArgAddrSpaceCast(CodeGenFunction &CGF,
+                                        llvm::FunctionCallee RuntimeFcn,
+                                        size_t ArgIdx, llvm::Value *Arg);
 };
 
 /// Class supports emissionof SIMD-only code.
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index f697c13f4c522..0bfa49dee0c53 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -1217,11 +1217,13 @@ void CGOpenMPRuntimeGPU::emitParallelCall(CodeGenFunction &CGF,
     CGBuilderTy &Bld = CGF.Builder;
     llvm::Value *NumThreadsVal = NumThreads;
     llvm::Function *WFn = WrapperFunctionsMap[OutlinedFn];
+    llvm::FunctionCallee RuntimeFn = OMPBuilder.getOrCreateRuntimeFunction(
+        CGM.getModule(), OMPRTL___kmpc_parallel_51);
     llvm::Value *ID = llvm::ConstantPointerNull::get(CGM.Int8PtrTy);
     if (WFn)
       ID = Bld.CreateBitOrPointerCast(WFn, CGM.Int8PtrTy);
-    llvm::Value *FnPtr = Bld.CreateBitOrPointerCast(OutlinedFn, CGM.Int8PtrTy);
-
+    llvm::Value *FnPtr = Bld.CreateAddrSpaceCast(OutlinedFn, CGM.Int8PtrTy);
+    FnPtr = Bld.CreateBitOrPointerCast(FnPtr, CGM.Int8PtrTy);
     // Create a private scope that will globalize the arguments
     // passed from the outside of the target region.
     // TODO: Is that needed?
@@ -1268,14 +1270,15 @@ void CGOpenMPRuntimeGPU::emitParallelCall(CodeGenFunction &CGF,
         IfCondVal,
         NumThreadsVal,
         llvm::ConstantInt::get(CGF.Int32Ty, -1),
-        FnPtr,
-        ID,
-        Bld.CreateBitOrPointerCast(CapturedVarsAddrs.emitRawPointer(CGF),
-                                   CGF.VoidPtrPtrTy),
+        createRuntimeFunctionArgAddrSpaceCast(CGF, RuntimeFn, 5, FnPtr),
+        createRuntimeFunctionArgAddrSpaceCast(CGF, RuntimeFn, 6, ID),
+        createRuntimeFunctionArgAddrSpaceCast(
+            CGF, RuntimeFn, 7,
+            Bld.CreateBitOrPointerCast(CapturedVarsAddrs.emitRawPointer(CGF),
+                                       CGF.VoidPtrPtrTy)),
         llvm::ConstantInt::get(CGM.SizeTy, CapturedVars.size())};
-    CGF.EmitRuntimeCall(OMPBuilder.getOrCreateRuntimeFunction(
-                            CGM.getModule(), OMPRTL___kmpc_parallel_51),
-                        Args);
+
+    CGF.EmitRuntimeCall(RuntimeFn, Args);
   };
 
   RegionCodeGenTy RCG(ParallelGen);
diff --git a/clang/lib/CodeGen/CGStmtOpenMP.cpp b/clang/lib/CodeGen/CGStmtOpenMP.cpp
index 156f64bb5f508..78fd65750fc02 100644
--- a/clang/lib/CodeGen/CGStmtOpenMP.cpp
+++ b/clang/lib/CodeGen/CGStmtOpenMP.cpp
@@ -585,6 +585,8 @@ static llvm::Function *emitOutlinedFunctionPrologue(
     F->removeFnAttr(llvm::Attribute::NoInline);
     F->addFnAttr(llvm::Attribute::AlwaysInline);
   }
+  if (CGM.getTriple().isSPIRV())
+    F->setCallingConv(llvm::CallingConv::SPIR_FUNC);
 
   // Generate the function.
   CGF.StartFunction(CD, Ctx.VoidTy, F, FuncInfo, TargetArgs,
diff --git a/clang/test/OpenMP/spirv_target_addrspace.c b/clang/test/OpenMP/spirv_target_addrspace.c
new file mode 100644
index 0000000000000..9e5eeff73eed6
--- /dev/null
+++ b/clang/test/OpenMP/spirv_target_addrspace.c
@@ -0,0 +1,21 @@
+// RUN: %clang_cc1 -fopenmp -x c++ -triple x86_64-unknown-unknown -fopenmp-targets=spirv64 -emit-llvm-bc %s -o %t-host.bc
+// RUN: %clang_cc1 -fopenmp -fopenmp-targets=spirv64 -fopenmp-is-target-device -triple spirv64 -fopenmp-host-ir-file-path %t-host.bc -emit-llvm %s -o - | FileCheck %s
+
+extern int fcn(const char[]);
+
+#pragma omp declare target
+// CHECK: @global = addrspace(1) global i32 0, align 4
+// CHECK: @.str = private unnamed_addr addrspace(1) constant [4 x i8] c"foo\00", align 1
+int global = 0;
+#pragma omp end declare target
+int main() {
+  // CHECK: = call i32 @__kmpc_target_init(ptr addrspacecast (ptr addrspace(1) @__omp_offloading_{{.*}}_kernel_environment to ptr), ptr %{{.*}})
+  #pragma omp target
+  {
+    for(int i = 0; i < 1024; i++)
+      global++;
+    fcn("foo");
+  }
+  return global;
+}
+
diff --git a/clang/test/OpenMP/spirv_target_addrspace_simd.c b/clang/test/OpenMP/spirv_target_addrspace_simd.c
new file mode 100644
index 0000000000000..31b00ab555596
--- /dev/null
+++ b/clang/test/OpenMP/spirv_target_addrspace_simd.c
@@ -0,0 +1,23 @@
+// RUN: %clang_cc1 -fopenmp -x c++ -triple x86_64-unknown-unknown -fopenmp-targets=spirv64 -emit-llvm-bc %s -o %t-host.bc
+// RUN: %clang_cc1 -fopenmp -fopenmp-targets=spirv64 -fopenmp-is-target-device -triple spirv64 -fopenmp-host-ir-file-path %t-host.bc -emit-llvm %s -o - | FileCheck %s
+
+int main() {
+  int x = 0;
+
+#pragma omp target teams distribute parallel for simd
+  for(int i = 0; i < 1024; i++)
+    x+=i;
+  return x;
+}
+
+// CHECK: @[[#STRLOC:]] = private unnamed_addr addrspace(1) constant [{{.*}} x i8] c{{.*}}, align 1
+// CHECK: @[[#IDENT:]] = private unnamed_addr addrspace(1) constant %struct.ident_t { i32 {{.*}}, i32 2050, i32 {{.*}}, i32 {{.*}}, ptr addrspacecast (ptr addrspace(1) @[[#STRLOC]] to ptr) }, align 8
+// CHECK: define internal spir_func void @__omp_offloading_{{.*}}_omp_outlined(ptr addrspace(4) noalias noundef {{.*}}., ptr addrspace(4) noalias noundef {{.*}}, i64 noundef {{.*}}) #{{.*}} {
+// CHECK: = load ptr addrspace(4), ptr addrspace(4) %{{.*}}, align 8
+// CHECK: = load i32, ptr addrspace(4) %{{.*}}, align 4
+// CHECK: = addrspacecast ptr addrspace(4) %{{.*}} to ptr
+// CHECK: = addrspacecast ptr addrspace(4) %{{.*}} to ptr
+// CHECK: = addrspacecast ptr addrspace(4) %{{.*}} to ptr
+// CHECK: = addrspacecast ptr addrspace(4) %{{.*}} to ptr
+// CHECK: call spir_func void @__kmpc_distribute_static_init{{.*}}(ptr addrspacecast (ptr addrspace(1) @[[#IDENT]] to ptr), i32 %{{.*}}, i32 {{.*}}, ptr %{{.*}}, ptr %{{.*}}, ptr %{{.*}}, ptr %{{.*}}, i32 {{.*}}, i32 %{{.*}})
+// CHECK: call spir_func void @__kmpc_distribute_static_fini{{.*}}(ptr addrspacecast (ptr addrspace(1) @[[#IDENT]] to ptr), i32 %{{.*}})
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 13b727d226738..e7dc82acb9201 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -910,6 +910,14 @@ Constant *OpenMPIRBuilder::getOrCreateIdent(Constant *SrcLocStr,
                              ConstantInt::get(Int32, uint32_t(LocFlags)),
                              ConstantInt::get(Int32, Reserve2Flags),
                              ConstantInt::get(Int32, SrcLocStrSize), SrcLocStr};
+
+    size_t SrcLocStrArgIdx = 4;
+    if (OpenMPIRBuilder::Ident->getElementType(SrcLocStrArgIdx)
+            ->getPointerAddressSpace() !=
+        IdentData[SrcLocStrArgIdx]->getType()->getPointerAddressSpace())
+      IdentData[SrcLocStrArgIdx] = ConstantExpr::getAddrSpaceCast(
+          SrcLocStr, OpenMPIRBuilder::Ident->getElementType(SrcLocStrArgIdx));
+
     Constant *Initializer =
         ConstantStruct::get(OpenMPIRBuilder::Ident, IdentData);
 
@@ -950,8 +958,9 @@ Constant *OpenMPIRBuilder::getOrCreateSrcLocStr(StringRef LocStr,
           GV.getInitializer() == Initializer)
         return SrcLocStr = ConstantExpr::getPointerCast(&GV, Int8Ptr);
 
-    SrcLocStr = Builder.CreateGlobalString(LocStr, /* Name */ "",
-                                           /* AddressSpace */ 0, &M);
+    SrcLocStr = Builder.CreateGlobalString(
+        LocStr, /* Name */ "",
+        M.getDataLayout().getDefaultGlobalsAddressSpace(), &M);
   }
   return SrcLocStr;
 }

@sarnex
Copy link
Member Author

sarnex commented Apr 16, 2025

Ping on this one @jhuber6 @AlexVlx, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:openmp OpenMP related changes to Clang clang Clang issues not falling into any other category flang:openmp
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants