[clang] Redefine noconvergent and generate convergence control tokens #136282

Open
wants to merge 1 commit into
base: users/ssahasra/clang-convergence-spec

ssahasra
Collaborator

This introduces the `-fconvergence-control` flag that emits convergence control
intrinsics which are then used as the `convergencectrl` operand bundle on
convergent calls.

This also redefines the `noconvergent` attribute in Clang. Under the existing,
simpler interpretation, marking a statement `noconvergent` means that every
inline `asm` call in that statement is treated as a non-convergent operation in
the emitted LLVM IR.

The new semantics introduce a more powerful notion: a `noconvergent` statement
may contain convergent operations, but the resulting convergence constraints are
limited to the scope of that statement. Taken as a whole, the statement does not
place any convergence constraints on the control flow reaching it. When
convergence tokens are emitted, this attribute results in a call to the `anchor`
intrinsic that determines convergence within the statement.
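
To make the scoping concrete, here is a toy model of how an anchor token could scope the calls inside a `noconvergent` statement. It is a sketch under stated assumptions, not the code in this patch: `ToyEmitter`, `Token`, `Call`, and `emitExample` are hypothetical names invented for illustration, and tokens are plain integers rather than LLVM values. It only mirrors the push/pop discipline described above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: none of these types are Clang or LLVM APIs.
enum class TokenKind { Entry, Anchor };

struct Token {
  TokenKind Kind;
  int Id;
};

struct Call {
  std::string Name;
  int TokenId; // token the call would name in its convergencectrl bundle
};

class ToyEmitter {
public:
  // Assumption: an ordinary function body starts with an entry token.
  ToyEmitter() { push(TokenKind::Entry); }

  // Entering a noconvergent statement pushes a fresh anchor token.
  void beginNoConvergentStmt() { push(TokenKind::Anchor); }

  // Leaving the statement restores the enclosing token.
  void endNoConvergentStmt() { Stack.pop_back(); }

  // A convergent call uses whichever token is innermost right now.
  void emitConvergentCall(const std::string &Name) {
    Calls.push_back({Name, Stack.back().Id});
  }

  const std::vector<Call> &calls() const { return Calls; }
  const std::vector<Token> &tokens() const { return Tokens; }

private:
  void push(TokenKind Kind) {
    Token T{Kind, static_cast<int>(Tokens.size())};
    Tokens.push_back(T);
    Stack.push_back(T);
  }

  std::vector<Token> Tokens; // every token created, in creation order
  std::vector<Token> Stack;  // tokens currently in scope
  std::vector<Call> Calls;
};

// Models the source shape: f(); [[clang::noconvergent]] { g(); } h();
inline ToyEmitter emitExample() {
  ToyEmitter E;
  E.emitConvergentCall("f");
  E.beginNoConvergentStmt();
  E.emitConvergentCall("g");
  E.endNoConvergentStmt();
  E.emitConvergentCall("h");
  return E;
}
```

In this model, `f` and `h` refer to the entry token while `g` refers to the anchor, so the constraints from `g` stay inside the statement and do not reach the surrounding control flow.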

@llvmbot added the clang, clang:driver, clang:frontend, clang:codegen, llvm:ir, clang:analysis, and llvm:transforms labels on Apr 18, 2025
@llvmbot
Member

llvmbot commented Apr 18, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-llvm-transforms

Author: Sameer Sahasrabuddhe (ssahasra)

Changes


Patch is 64.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/136282.diff

27 Files Affected:

  • (modified) clang/docs/ThreadConvergence.rst (+27)
  • (modified) clang/include/clang/Analysis/Analyses/ConvergenceCheck.h (+2-1)
  • (modified) clang/include/clang/Basic/AttrDocs.td (+9-6)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+2)
  • (modified) clang/include/clang/Basic/LangOptions.def (+2)
  • (modified) clang/include/clang/Driver/Options.td (+5)
  • (modified) clang/lib/Analysis/ConvergenceCheck.cpp (+33-10)
  • (modified) clang/lib/CodeGen/CGCall.cpp (+7-1)
  • (modified) clang/lib/CodeGen/CGStmt.cpp (+31-13)
  • (modified) clang/lib/CodeGen/CodeGenFunction.cpp (+15-8)
  • (modified) clang/lib/CodeGen/CodeGenFunction.h (+11-2)
  • (modified) clang/lib/CodeGen/CodeGenModule.h (+1-1)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+3)
  • (modified) clang/lib/Sema/AnalysisBasedWarnings.cpp (+5-3)
  • (added) clang/test/CodeGenHIP/convergence-tokens.hip (+687)
  • (added) clang/test/CodeGenHIP/noconvergent-statement.hip (+109)
  • (added) clang/test/SemaHIP/noconvergent-errors/backwards_jump.hip (+23)
  • (added) clang/test/SemaHIP/noconvergent-errors/jump-into-nest.hip (+32)
  • (added) clang/test/SemaHIP/noconvergent-errors/no-errors.hip (+83)
  • (added) clang/test/SemaHIP/noconvergent-errors/simple_jump.hip (+23)
  • (modified) llvm/include/llvm/IR/InstrTypes.h (+2-6)
  • (modified) llvm/include/llvm/IR/IntrinsicInst.h (+12)
  • (added) llvm/include/llvm/Transforms/Utils/FixConvergenceControl.h (+21)
  • (modified) llvm/lib/IR/Instructions.cpp (+7)
  • (modified) llvm/lib/IR/IntrinsicInst.cpp (+21)
  • (modified) llvm/lib/Transforms/Utils/CMakeLists.txt (+1)
  • (added) llvm/lib/Transforms/Utils/FixConvergenceControl.cpp (+191)
diff --git a/clang/docs/ThreadConvergence.rst b/clang/docs/ThreadConvergence.rst
index d872ab9cb77f5..ce2ca2cbeacde 100644
--- a/clang/docs/ThreadConvergence.rst
+++ b/clang/docs/ThreadConvergence.rst
@@ -564,6 +564,33 @@ backwards ``goto`` instead of a ``while`` statement.
   ``outside_loop``. This includes threads that jumped from ``G2`` as well as
   threads that  reached ``outside_loop`` after executing ``C``.
 
+.. _noconvergent-statement:
+
+The ``noconvergent`` Statement
+==============================
+
+When a statement is marked as ``noconvergent`` the convergence of threads at the
+start of this statement is not constrained by any convergent operations inside
+the statement.
+
+- When two threads execute a statement marked ``noconvergent``, it is
+  implementation-defined whether they are converged at that execution. [Note:
+  The resulting evaluations must still satisfy the strict partial order imposed
+  by convergence-before.]
+- When two threads are converged at the start of this statement (as determined
+  by the implementation), whether they are converged at each convergent
+  operation inside this statement is determined by the usual rules.
+
+For every label statement ``L`` occurring inside a ``noconvergent``
+statement, every ``goto`` or ``switch`` statement that transfers control to
+``L`` must also occur inside that statement.
+
+.. note::
+
+   Convergence control tokens are necessary for correctly implementing the
+   "noconvergent" statement attribute. When tokens are not in use, the legacy
+   behaviour is retained, where the only effect of this attribute is that
+   ``asm`` calls within the statement are not treated as convergent operations.
 
 Implementation-defined Convergence
 ==================================
diff --git a/clang/include/clang/Analysis/Analyses/ConvergenceCheck.h b/clang/include/clang/Analysis/Analyses/ConvergenceCheck.h
index bf0d164c6a5bc..74208889a84df 100644
--- a/clang/include/clang/Analysis/Analyses/ConvergenceCheck.h
+++ b/clang/include/clang/Analysis/Analyses/ConvergenceCheck.h
@@ -18,7 +18,8 @@ class AnalysisDeclContext;
 class Sema;
 class Stmt;
 
-void analyzeForConvergence(Sema &S, AnalysisDeclContext &AC);
+void analyzeForConvergence(Sema &S, AnalysisDeclContext &AC,
+                           bool GenerateWarnings, bool GenerateTokens);
 
 } // end namespace clang
 
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 5f37922d352b7..7ef8d3d86fe50 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -1700,13 +1700,12 @@ def NoConvergentDocs : Documentation {
 This attribute prevents a function from being treated as convergent; when a
 function is marked ``noconvergent``, calls to that function are not
 automatically assumed to be convergent, unless such calls are explicitly marked
-as ``convergent``. If a statement is marked as ``noconvergent``, any calls to
-inline ``asm`` in that statement are no longer treated as convergent.
+as ``convergent``.
 
-In languages following SPMD/SIMT programming model, e.g., CUDA/HIP, function
-declarations and inline asm calls are treated as convergent by default for
-correctness. This ``noconvergent`` attribute is helpful for developers to
-prevent them from being treated as convergent when it's safe.
+If a statement is marked as ``noconvergent``, the semantics depends on whether
+convergence control tokens are used in the generated LLVM IR. When convergence
+control tokens are not in use, any calls to inline ``asm`` in that statement are
+treated as not convergent.
 
 .. code-block:: c
 
@@ -1719,6 +1718,10 @@ prevent them from being treated as convergent when it's safe.
     [[clang::noconvergent]] { asm volatile ("nop"); } // the asm call is non-convergent
   }
 
+When tokens are in use, placing the ``noconvergent`` attribute on a statement
+indicates that thread convergence at the entry to that statement is
+:ref:`implementation-defined<noconvergent-statement>`.
+
   }];
 }
 
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index dabb6d31b519a..3be697c6337bc 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6514,6 +6514,8 @@ def note_goto_affects_convergence : Note<
   "jump from this goto statement affects convergence">;
 def note_switch_case_affects_convergence : Note<
   "jump to this case statement affects convergence of loop">;
+def err_jump_into_noconvergent : Error<
+  "cannot jump into a noconvergent statement from outside">;
 def err_goto_into_protected_scope : Error<
   "cannot jump from this goto statement to its label">;
 def ext_goto_into_protected_scope : ExtWarn<
diff --git a/clang/include/clang/Basic/LangOptions.def b/clang/include/clang/Basic/LangOptions.def
index 930c1c06d1a76..c8254af61387b 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -306,6 +306,8 @@ LANGOPT(HIPUseNewLaunchAPI, 1, 0, "Use new kernel launching API for HIP")
 LANGOPT(OffloadUniformBlock, 1, 0, "Assume that kernels are launched with uniform block sizes (default true for CUDA/HIP and false otherwise)")
 LANGOPT(HIPStdPar, 1, 0, "Enable Standard Parallel Algorithm Acceleration for HIP (experimental)")
 LANGOPT(HIPStdParInterposeAlloc, 1, 0, "Replace allocations / deallocations with HIP RT calls when Standard Parallel Algorithm Acceleration for HIP is enabled (Experimental)")
+LANGOPT(ConvergenceControl, 1, 0,
+        "Generate explicit convergence control (experimental)")
 
 LANGOPT(OpenACC           , 1, 0, "OpenACC Enabled")
 
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 830d3459a1320..369929c30a623 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1397,6 +1397,11 @@ def fhip_emit_relocatable : Flag<["-"], "fhip-emit-relocatable">,
   HelpText<"Compile HIP source to relocatable">;
 def fno_hip_emit_relocatable : Flag<["-"], "fno-hip-emit-relocatable">,
   HelpText<"Do not override toolchain to compile HIP source to relocatable">;
+defm convergence_control : BoolFOption<"convergence-control",
+  LangOpts<"ConvergenceControl">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption, CC1Option], "Generate">,
+  NegFlag<SetFalse, [], [ClangOption], "Don't generate">,
+  BothFlags<[], [ClangOption], " explicit convergence control tokens (experimental)">>;
 }
 
 // Clang specific/exclusive options for OpenACC.
diff --git a/clang/lib/Analysis/ConvergenceCheck.cpp b/clang/lib/Analysis/ConvergenceCheck.cpp
index 75139388ea19e..93744f8b8e495 100644
--- a/clang/lib/Analysis/ConvergenceCheck.cpp
+++ b/clang/lib/Analysis/ConvergenceCheck.cpp
@@ -16,6 +16,11 @@
 using namespace clang;
 using namespace llvm;
 
+static void errorJumpIntoNoConvergent(Sema &S, Stmt *From, Stmt *Parent) {
+  S.Diag(Parent->getBeginLoc(), diag::err_jump_into_noconvergent);
+  S.Diag(From->getBeginLoc(), diag::note_goto_affects_convergence);
+}
+
 static void warnGotoCycle(Sema &S, Stmt *From, Stmt *Parent) {
   S.Diag(Parent->getBeginLoc(),
          diag::warn_cycle_created_by_goto_affects_convergence);
@@ -27,7 +32,8 @@ static void warnJumpIntoLoop(Sema &S, Stmt *From, Stmt *Loop) {
   S.Diag(From->getBeginLoc(), diag::note_goto_affects_convergence);
 }
 
-static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM) {
+static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM,
+                                   bool GenerateWarnings, bool GenerateTokens) {
   Stmt *To = From->getLabel()->getStmt();
 
   unsigned ToDepth = PM.getParentDepth(To) + 1;
@@ -42,7 +48,7 @@ static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM) {
   }
 
   // Special case: the goto statement is a descendant of the label statement.
-  if (ExpandedFrom == ExpandedTo) {
+  if (GenerateWarnings && ExpandedFrom == ExpandedTo) {
     assert(ExpandedTo == To);
     warnGotoCycle(S, From, To);
     return;
@@ -60,10 +66,18 @@ static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM) {
 
   SmallVector<Stmt *> Loops;
   for (Stmt *I = To; I != ParentFrom; I = PM.getParent(I)) {
+    if (GenerateTokens)
+      if (const auto *AS = dyn_cast<AttributedStmt>(I))
+        if (hasSpecificAttr<NoConvergentAttr>(AS->getAttrs()))
+          errorJumpIntoNoConvergent(S, From, I);
     // Can't jump into a ranged-for, so we don't need to look for it here.
-    if (isa<ForStmt, WhileStmt, DoStmt>(I))
+    if (GenerateWarnings && isa<ForStmt, WhileStmt, DoStmt>(I))
       Loops.push_back(I);
   }
+
+  if (!GenerateWarnings)
+    return;
+
   for (Stmt *I : reverse(Loops))
     warnJumpIntoLoop(S, From, I);
 
@@ -88,21 +102,29 @@ static void warnSwitchIntoLoop(Sema &S, Stmt *Case, Stmt *Loop) {
 }
 
 static void checkConvergenceForSwitch(Sema &S, SwitchStmt *Switch,
-                                      ParentMap &PM) {
+                                      ParentMap &PM, bool GenerateWarnings,
+                                      bool GenerateTokens) {
   for (SwitchCase *Case = Switch->getSwitchCaseList(); Case;
        Case = Case->getNextSwitchCase()) {
     SmallVector<Stmt *> Loops;
     for (Stmt *I = Case; I != Switch; I = PM.getParent(I)) {
+      if (GenerateTokens)
+        if (const auto *AS = dyn_cast<AttributedStmt>(I))
+          if (hasSpecificAttr<NoConvergentAttr>(AS->getAttrs()))
+            errorJumpIntoNoConvergent(S, Switch, I);
       // Can't jump into a ranged-for, so we don't need to look for it here.
-      if (isa<ForStmt, WhileStmt, DoStmt>(I))
+      if (GenerateWarnings && isa<ForStmt, WhileStmt, DoStmt>(I))
         Loops.push_back(I);
     }
-    for (Stmt *I : reverse(Loops))
-      warnSwitchIntoLoop(S, Case, I);
+    if (GenerateWarnings) {
+      for (Stmt *I : reverse(Loops))
+        warnSwitchIntoLoop(S, Case, I);
+    }
   }
 }
 
-void clang::analyzeForConvergence(Sema &S, AnalysisDeclContext &AC) {
+void clang::analyzeForConvergence(Sema &S, AnalysisDeclContext &AC,
+                                  bool GenerateWarnings, bool GenerateTokens) {
   // Iterating over the CFG helps trim unreachable blocks, and locates Goto
   // statements faster than iterating over the whole body.
   CFG *cfg = AC.getCFG();
@@ -111,9 +133,10 @@ void clang::analyzeForConvergence(Sema &S, AnalysisDeclContext &AC) {
   for (CFGBlock *BI : *cfg) {
     Stmt *Term = BI->getTerminatorStmt();
     if (GotoStmt *Goto = dyn_cast_or_null<GotoStmt>(Term)) {
-      checkConvergenceOnGoto(S, Goto, PM);
+      checkConvergenceOnGoto(S, Goto, PM, GenerateWarnings, GenerateTokens);
     } else if (SwitchStmt *Switch = dyn_cast_or_null<SwitchStmt>(Term)) {
-      checkConvergenceForSwitch(S, Switch, PM);
+      checkConvergenceForSwitch(S, Switch, PM, GenerateWarnings,
+                                GenerateTokens);
     }
   }
 }
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8cb27420dd911..20f251a5ba5b2 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5773,7 +5773,13 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
     Attrs =
         Attrs.addFnAttribute(getLLVMContext(), llvm::Attribute::AlwaysInline);
 
-  // Remove call-site convergent attribute if requested.
+  // Remove call-site convergent attribute if this call occurs inside a
+  // noconvergent statement. This is the legacy behaviour when convergence
+  // control tokens are not in use. It only affects inline asm calls, since all
+  // other function calls inherit the convergent attribute from the callee. When
+  // convergence control tokens are in use, any inline asm calls should be
+  // explicitly marked noconvergent, else they simply inherit whatever token is
+  // currently in scope.
   if (InNoConvergentAttributedStmt)
     Attrs =
         Attrs.removeFnAttribute(getLLVMContext(), llvm::Attribute::Convergent);
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index 3562b4ea22a24..1a9a574572f67 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -829,14 +829,24 @@ void CodeGenFunction::EmitAttributedStmt(const AttributedStmt &S) {
     } break;
     }
   }
+  bool LegacyNoConvergent = noconvergent && !CGM.shouldEmitConvergenceTokens();
   SaveAndRestore save_nomerge(InNoMergeAttributedStmt, nomerge);
   SaveAndRestore save_noinline(InNoInlineAttributedStmt, noinline);
   SaveAndRestore save_alwaysinline(InAlwaysInlineAttributedStmt, alwaysinline);
-  SaveAndRestore save_noconvergent(InNoConvergentAttributedStmt, noconvergent);
+  SaveAndRestore save_noconvergent(InNoConvergentAttributedStmt,
+                                   LegacyNoConvergent);
   SaveAndRestore save_musttail(MustTailCall, musttail);
   SaveAndRestore save_flattenOrBranch(HLSLControlFlowAttr, flattenOrBranch);
   CGAtomicOptionsRAII AORAII(CGM, AA);
+  if (noconvergent && CGM.shouldEmitConvergenceTokens()) {
+    EmitBlock(createBasicBlock("noconvergent.anchor"));
+    ConvergenceTokenStack.push_back(
+        emitConvergenceAnchorToken(Builder.GetInsertBlock()));
+  }
   EmitStmt(S.getSubStmt(), S.getAttrs());
+  if (noconvergent && CGM.shouldEmitConvergenceTokens()) {
+    ConvergenceTokenStack.pop_back();
+  }
 }
 
 void CodeGenFunction::EmitGotoStmt(const GotoStmt &S) {
@@ -3317,16 +3327,6 @@ CodeGenFunction::GenerateCapturedStmtFunction(const CapturedStmt &S) {
   return F;
 }
 
-// Returns the first convergence entry/loop/anchor instruction found in |BB|.
-// std::nullptr otherwise.
-static llvm::ConvergenceControlInst *getConvergenceToken(llvm::BasicBlock *BB) {
-  for (auto &I : *BB) {
-    if (auto *CI = dyn_cast<llvm::ConvergenceControlInst>(&I))
-      return CI;
-  }
-  return nullptr;
-}
-
 llvm::CallBase *
 CodeGenFunction::addConvergenceControlToken(llvm::CallBase *Input) {
   llvm::ConvergenceControlInst *ParentToken = ConvergenceTokenStack.back();
@@ -3348,15 +3348,33 @@ CodeGenFunction::emitConvergenceLoopToken(llvm::BasicBlock *BB) {
   return llvm::ConvergenceControlInst::CreateLoop(*BB, ParentToken);
 }
 
+llvm::ConvergenceControlInst *
+CodeGenFunction::emitConvergenceAnchorToken(llvm::BasicBlock *BB) {
+  return llvm::ConvergenceControlInst::CreateAnchor(*BB);
+}
+
 llvm::ConvergenceControlInst *
 CodeGenFunction::getOrEmitConvergenceEntryToken(llvm::Function *F) {
   llvm::BasicBlock *BB = &F->getEntryBlock();
-  llvm::ConvergenceControlInst *Token = getConvergenceToken(BB);
+  llvm::ConvergenceControlInst *Token = llvm::getConvergenceControlDef(*BB);
   if (Token)
     return Token;
 
-  // Adding a convergence token requires the function to be marked as
+  // Adding a convergence entry token requires the function to be marked as
   // convergent.
   F->setConvergent();
   return llvm::ConvergenceControlInst::CreateEntry(*BB);
 }
+
+llvm::ConvergenceControlInst *
+CodeGenFunction::getOrEmitConvergenceAnchorToken(llvm::Function *F) {
+  llvm::BasicBlock *BB = &F->getEntryBlock();
+  llvm::ConvergenceControlInst *Token = llvm::getConvergenceControlDef(*BB);
+  if (Token)
+    return Token;
+
+  // Adding a convergence anchor token requires the function to be marked as
+  // not convergent.
+  F->setNotConvergent();
+  return llvm::ConvergenceControlInst::CreateAnchor(*BB);
+}
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 4d29ceace646f..d9226bdd775a3 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -47,6 +47,7 @@
 #include "llvm/Support/CRC.h"
 #include "llvm/Support/xxhash.h"
 #include "llvm/Transforms/Scalar/LowerExpectIntrinsic.h"
+#include "llvm/Transforms/Utils/FixConvergenceControl.h"
 #include "llvm/Transforms/Utils/PromoteMemToReg.h"
 #include <optional>
 
@@ -371,12 +372,6 @@ void CodeGenFunction::FinishFunction(SourceLocation EndLoc) {
   assert(DeferredDeactivationCleanupStack.empty() &&
          "mismatched activate/deactivate of cleanups!");
 
-  if (CGM.shouldEmitConvergenceTokens()) {
-    ConvergenceTokenStack.pop_back();
-    assert(ConvergenceTokenStack.empty() &&
-           "mismatched push/pop in convergence stack!");
-  }
-
   bool OnlySimpleReturnStmts = NumSimpleReturnExprs > 0
     && NumSimpleReturnExprs == NumReturnExprs
     && ReturnBlock.getBlock()->use_empty();
@@ -1362,8 +1357,13 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
     if (const auto *VecWidth = CurFuncDecl->getAttr<MinVectorWidthAttr>())
       LargestVectorWidth = VecWidth->getVectorWidth();
 
-  if (CGM.shouldEmitConvergenceTokens())
-    ConvergenceTokenStack.push_back(getOrEmitConvergenceEntryToken(CurFn));
+  if (CGM.shouldEmitConvergenceTokens()) {
+    llvm::ConvergenceControlInst *Token =
+        (FD && FD->hasAttr<NoConvergentAttr>())
+            ? getOrEmitConvergenceAnchorToken(CurFn)
+            : getOrEmitConvergenceEntryToken(CurFn);
+    ConvergenceTokenStack.push_back(Token);
+  }
 }
 
 void CodeGenFunction::EmitFunctionBody(const Stmt *Body) {
@@ -1647,6 +1647,13 @@ void CodeGenFunction::GenerateCode(GlobalDecl GD, llvm::Function *Fn,
     }
   }
 
+  if (CGM.shouldEmitConvergenceTokens()) {
+    ConvergenceTokenStack.pop_back();
+    assert(ConvergenceTokenStack.empty() &&
+           "mismatched push/pop in convergence stack!");
+    fixConvergenceControl(CurFn);
+  }
+
   // Emit the standard function epilogue.
   FinishFunction(BodyRange.getEnd());
 
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 9254c7077237f..0d20218f6cbf1 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -5339,15 +5339,24 @@ class CodeGenFunction : public CodeGenTypeCache {
   // as it's parent convergence instr.
   llvm::ConvergenceControlInst *emitConvergenceLoopToken(llvm::BasicBlock *BB);
 
+  // Emits a convergence_anchor instruction for the given |BB|.
+  llvm::ConvergenceControlInst *
+  emitConvergenceAnchorToken(llvm::BasicBlock *BB);
+
   // Adds a convergence_ctrl token with |ParentToken| as parent convergence
   // instr to the call |Input|.
   llvm::CallBase *addConvergenceControlToken(llvm::CallBase *Input);
 
-  // Find the convergence_entry instruction |F|, or emits ones if none exists.
-  // Returns the convergence instruction.
+  // Find the convergence control token in the entry block of |F|, or if none
+  // exists, create an entry token.
   llvm::ConvergenceControlInst *
   getOrEmitConvergenceEntryToken(llvm::Function *F);
 
+  // Find the convergence control token in the entry block of |F|, or if none
+  // exists, create an anchor token.
+  llvm::ConvergenceControlInst *
+  getOrEmitConvergenceAnchorToken(llvm::Function *F);
+
 private:
   llvm::MDNode *getRangeForLoadFromType(QualType Ty);
   void EmitReturnOfRValue(RValue RV, QualType Ty);
diff --git a/clang/lib/CodeGen/CodeGenModule.h b/clang/lib/CodeGen/CodeGenModule.h
index 9a0bc675e0baa..1651c87049df8 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -1751,7 +1751,7 @@ class CodeGenModule : public CodeGenTypeCache {
   bool shouldEmitConvergenceTokens() const {
     // TODO: this should probably become unconditional once the controlled
     // convergence becomes the norm.
-    return getTriple().isSPIRVLogical();
+    return getTriple().isSPIRVLogical() || getLangOpts().ConvergenceControl;
   }
 
   void addUndefinedGlobalForTailCall(
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index b2dd4b3b54869..c9e37548fa835 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -7098,6 +7098,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
     if (Args.hasFlag(options::OPT_fhip_new_launch_api,
                      options::OPT_fno_hip_new_launch_api, true))
       CmdArgs.push_back("-fhip-new-launch-api");
+    if (Args.hasFlag(options::OPT_fconvergence_control,
+                     options::OPT_fno_convergence_control, false))
+      CmdArgs.push_back("-fconvergence-control");
     Args.addOptInFlag(CmdArgs, options::OPT_fgpu_allow_device_init,
                       options::OPT_fno_gpu_allow_device_init);
     Args.AddLastArg(CmdArgs, options::OPT_hipstdpar);
diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp b/clan...
[truncated]
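
The documentation hunk above requires that every `goto` or `switch` that transfers control to a label inside a `noconvergent` statement must itself be inside that statement. The sketch below is a simplified model of that rule, not the algorithm in `ConvergenceCheck.cpp`: `ToyStmt`, `ToyFunction`, `contains`, and `jumpsIntoNoConvergent` are hypothetical names, and the statement tree is reduced to parent pointers.

```cpp
#include <cassert>

// Toy statement node: only a parent link and a noconvergent flag.
struct ToyStmt {
  const ToyStmt *Parent = nullptr;
  bool NoConvergent = false;
};

// True if Inner is Outer itself or nested anywhere inside Outer.
inline bool contains(const ToyStmt *Outer, const ToyStmt *Inner) {
  for (const ToyStmt *I = Inner; I; I = I->Parent)
    if (I == Outer)
      return true;
  return false;
}

// True if a jump From -> To enters some noconvergent statement from outside,
// that is, some noconvergent ancestor of To does not also contain From.
inline bool jumpsIntoNoConvergent(const ToyStmt *From, const ToyStmt *To) {
  for (const ToyStmt *I = To; I; I = I->Parent)
    if (I->NoConvergent && !contains(I, From))
      return true;
  return false;
}

// Hypothetical fixture modelling:
//   outer_label: ...; goto (OuterGoto);
//   [[clang::noconvergent]] { label: ...; goto (InnerGoto); }
struct ToyFunction {
  ToyStmt Body;
  ToyStmt Region;     // the noconvergent statement
  ToyStmt Label;      // label inside the region
  ToyStmt InnerGoto;  // goto inside the region
  ToyStmt OuterGoto;  // goto outside the region
  ToyStmt OuterLabel; // label outside the region

  ToyFunction() {
    Region.Parent = &Body;
    Region.NoConvergent = true;
    Label.Parent = &Region;
    InnerGoto.Parent = &Region;
    OuterGoto.Parent = &Body;
    OuterLabel.Parent = &Body;
  }

  // Nodes point at each other, so copying would leave dangling parents.
  ToyFunction(const ToyFunction &) = delete;
  ToyFunction &operator=(const ToyFunction &) = delete;
};
```

Under this model, only the jump from `OuterGoto` to `Label` is rejected. Jumps that stay inside the region, or that leave it, are accepted.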

@llvmbot
Member

llvmbot commented Apr 18, 2025

@llvm/pr-subscribers-clang-driver

Author: Sameer Sahasrabuddhe (ssahasra)

Changes

 LANGOPT(HIPStdPar, 1, 0, "Enable Standard Parallel Algorithm Acceleration for HIP (experimental)")
 LANGOPT(HIPStdParInterposeAlloc, 1, 0, "Replace allocations / deallocations with HIP RT calls when Standard Parallel Algorithm Acceleration for HIP is enabled (Experimental)")
+LANGOPT(ConvergenceControl, 1, 0,
+        "Generate explicit convergence control (experimental)")
 
 LANGOPT(OpenACC           , 1, 0, "OpenACC Enabled")
 
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 830d3459a1320..369929c30a623 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1397,6 +1397,11 @@ def fhip_emit_relocatable : Flag<["-"], "fhip-emit-relocatable">,
   HelpText<"Compile HIP source to relocatable">;
 def fno_hip_emit_relocatable : Flag<["-"], "fno-hip-emit-relocatable">,
   HelpText<"Do not override toolchain to compile HIP source to relocatable">;
+defm convergence_control : BoolFOption<"convergence-control",
+  LangOpts<"ConvergenceControl">, DefaultFalse,
+  PosFlag<SetTrue, [], [ClangOption, CC1Option], "Generate">,
+  NegFlag<SetFalse, [], [ClangOption], "Don't generate">,
+  BothFlags<[], [ClangOption], " explicit convergence control tokens (experimental)">>;
 }
 
 // Clang specific/exclusive options for OpenACC.
diff --git a/clang/lib/Analysis/ConvergenceCheck.cpp b/clang/lib/Analysis/ConvergenceCheck.cpp
index 75139388ea19e..93744f8b8e495 100644
--- a/clang/lib/Analysis/ConvergenceCheck.cpp
+++ b/clang/lib/Analysis/ConvergenceCheck.cpp
@@ -16,6 +16,11 @@
 using namespace clang;
 using namespace llvm;
 
+static void errorJumpIntoNoConvergent(Sema &S, Stmt *From, Stmt *Parent) {
+  S.Diag(Parent->getBeginLoc(), diag::err_jump_into_noconvergent);
+  S.Diag(From->getBeginLoc(), diag::note_goto_affects_convergence);
+}
+
 static void warnGotoCycle(Sema &S, Stmt *From, Stmt *Parent) {
   S.Diag(Parent->getBeginLoc(),
          diag::warn_cycle_created_by_goto_affects_convergence);
@@ -27,7 +32,8 @@ static void warnJumpIntoLoop(Sema &S, Stmt *From, Stmt *Loop) {
   S.Diag(From->getBeginLoc(), diag::note_goto_affects_convergence);
 }
 
-static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM) {
+static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM,
+                                   bool GenerateWarnings, bool GenerateTokens) {
   Stmt *To = From->getLabel()->getStmt();
 
   unsigned ToDepth = PM.getParentDepth(To) + 1;
@@ -42,7 +48,7 @@ static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM) {
   }
 
   // Special case: the goto statement is a descendant of the label statement.
-  if (ExpandedFrom == ExpandedTo) {
+  if (GenerateWarnings && ExpandedFrom == ExpandedTo) {
     assert(ExpandedTo == To);
     warnGotoCycle(S, From, To);
     return;
@@ -60,10 +66,18 @@ static void checkConvergenceOnGoto(Sema &S, GotoStmt *From, ParentMap &PM) {
 
   SmallVector<Stmt *> Loops;
   for (Stmt *I = To; I != ParentFrom; I = PM.getParent(I)) {
+    if (GenerateTokens)
+      if (const auto *AS = dyn_cast<AttributedStmt>(I))
+        if (hasSpecificAttr<NoConvergentAttr>(AS->getAttrs()))
+          errorJumpIntoNoConvergent(S, From, I);
     // Can't jump into a ranged-for, so we don't need to look for it here.
-    if (isa<ForStmt, WhileStmt, DoStmt>(I))
+    if (GenerateWarnings && isa<ForStmt, WhileStmt, DoStmt>(I))
       Loops.push_back(I);
   }
+
+  if (!GenerateWarnings)
+    return;
+
   for (Stmt *I : reverse(Loops))
     warnJumpIntoLoop(S, From, I);
 
@@ -88,21 +102,29 @@ static void warnSwitchIntoLoop(Sema &S, Stmt *Case, Stmt *Loop) {
 }
 
 static void checkConvergenceForSwitch(Sema &S, SwitchStmt *Switch,
-                                      ParentMap &PM) {
+                                      ParentMap &PM, bool GenerateWarnings,
+                                      bool GenerateTokens) {
   for (SwitchCase *Case = Switch->getSwitchCaseList(); Case;
        Case = Case->getNextSwitchCase()) {
     SmallVector<Stmt *> Loops;
     for (Stmt *I = Case; I != Switch; I = PM.getParent(I)) {
+      if (GenerateTokens)
+        if (const auto *AS = dyn_cast<AttributedStmt>(I))
+          if (hasSpecificAttr<NoConvergentAttr>(AS->getAttrs()))
+            errorJumpIntoNoConvergent(S, Switch, I);
       // Can't jump into a ranged-for, so we don't need to look for it here.
-      if (isa<ForStmt, WhileStmt, DoStmt>(I))
+      if (GenerateWarnings && isa<ForStmt, WhileStmt, DoStmt>(I))
         Loops.push_back(I);
     }
-    for (Stmt *I : reverse(Loops))
-      warnSwitchIntoLoop(S, Case, I);
+    if (GenerateWarnings) {
+      for (Stmt *I : reverse(Loops))
+        warnSwitchIntoLoop(S, Case, I);
+    }
   }
 }
 
-void clang::analyzeForConvergence(Sema &S, AnalysisDeclContext &AC) {
+void clang::analyzeForConvergence(Sema &S, AnalysisDeclContext &AC,
+                                  bool GenerateWarnings, bool GenerateTokens) {
   // Iterating over the CFG helps trim unreachable blocks, and locates Goto
   // statements faster than iterating over the whole body.
   CFG *cfg = AC.getCFG();
@@ -111,9 +133,10 @@ void clang::analyzeForConvergence(Sema &S, AnalysisDeclContext &AC) {
   for (CFGBlock *BI : *cfg) {
     Stmt *Term = BI->getTerminatorStmt();
     if (GotoStmt *Goto = dyn_cast_or_null<GotoStmt>(Term)) {
-      checkConvergenceOnGoto(S, Goto, PM);
+      checkConvergenceOnGoto(S, Goto, PM, GenerateWarnings, GenerateTokens);
     } else if (SwitchStmt *Switch = dyn_cast_or_null<SwitchStmt>(Term)) {
-      checkConvergenceForSwitch(S, Switch, PM);
+      checkConvergenceForSwitch(S, Switch, PM, GenerateWarnings,
+                                GenerateTokens);
     }
   }
 }
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 8cb27420dd911..20f251a5ba5b2 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -5773,7 +5773,13 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo,
     Attrs =
         Attrs.addFnAttribute(getLLVMContext(), llvm::Attribute::AlwaysInline);
 
-  // Remove call-site convergent attribute if requested.
+  // Remove call-site convergent attribute if this call occurs inside a
+  // noconvergent statement. This is the legacy behaviour when convergence
+  // control tokens are not in use. It only affects inline asm calls, since all
+  // other function calls inherit the convergent attribute from the callee. When
+  // convergence control tokens are in use, any inline asm calls should be
+  // explicitly marked noconvergent, else they simply inherit whatever token is
+  // currently in scope.
   if (InNoConvergentAttributedStmt)
     Attrs =
         Attrs.removeFnAttribute(getLLVMContext(), llvm::Attribute::Convergent);
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index 3562b4ea22a24..1a9a574572f67 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -829,14 +829,24 @@ void CodeGenFunction::EmitAttributedStmt(const AttributedStmt &S) {
     } break;
     }
   }
+  bool LegacyNoConvergent = noconvergent && !CGM.shouldEmitConvergenceTokens();
   SaveAndRestore save_nomerge(InNoMergeAttributedStmt, nomerge);
   SaveAndRestore save_noinline(InNoInlineAttributedStmt, noinline);
   SaveAndRestore save_alwaysinline(InAlwaysInlineAttributedStmt, alwaysinline);
-  SaveAndRestore save_noconvergent(InNoConvergentAttributedStmt, noconvergent);
+  SaveAndRestore save_noconvergent(InNoConvergentAttributedStmt,
+                                   LegacyNoConvergent);
   SaveAndRestore save_musttail(MustTailCall, musttail);
   SaveAndRestore save_flattenOrBranch(HLSLControlFlowAttr, flattenOrBranch);
   CGAtomicOptionsRAII AORAII(CGM, AA);
+  if (noconvergent && CGM.shouldEmitConvergenceTokens()) {
+    EmitBlock(createBasicBlock("noconvergent.anchor"));
+    ConvergenceTokenStack.push_back(
+        emitConvergenceAnchorToken(Builder.GetInsertBlock()));
+  }
   EmitStmt(S.getSubStmt(), S.getAttrs());
+  if (noconvergent && CGM.shouldEmitConvergenceTokens()) {
+    ConvergenceTokenStack.pop_back();
+  }
 }
 
 void CodeGenFunction::EmitGotoStmt(const GotoStmt &S) {
@@ -3317,16 +3327,6 @@ CodeGenFunction::GenerateCapturedStmtFunction(const CapturedStmt &S) {
   return F;
 }
 
-// Returns the first convergence entry/loop/anchor instruction found in |BB|.
-// std::nullptr otherwise.
-static llvm::ConvergenceControlInst *getConvergenceToken(llvm::BasicBlock *BB) {
-  for (auto &I : *BB) {
-    if (auto *CI = dyn_cast<llvm::ConvergenceControlInst>(&I))
-      return CI;
-  }
-  return nullptr;
-}
-
 llvm::CallBase *
 CodeGenFunction::addConvergenceControlToken(llvm::CallBase *Input) {
   llvm::ConvergenceControlInst *ParentToken = ConvergenceTokenStack.back();
@@ -3348,15 +3348,33 @@ CodeGenFunction::emitConvergenceLoopToken(llvm::BasicBlock *BB) {
   return llvm::ConvergenceControlInst::CreateLoop(*BB, ParentToken);
 }
 
+llvm::ConvergenceControlInst *
+CodeGenFunction::emitConvergenceAnchorToken(llvm::BasicBlock *BB) {
+  return llvm::ConvergenceControlInst::CreateAnchor(*BB);
+}
+
 llvm::ConvergenceControlInst *
 CodeGenFunction::getOrEmitConvergenceEntryToken(llvm::Function *F) {
   llvm::BasicBlock *BB = &F->getEntryBlock();
-  llvm::ConvergenceControlInst *Token = getConvergenceToken(BB);
+  llvm::ConvergenceControlInst *Token = llvm::getConvergenceControlDef(*BB);
   if (Token)
     return Token;
 
-  // Adding a convergence token requires the function to be marked as
+  // Adding a convergence entry token requires the function to be marked as
   // convergent.
   F->setConvergent();
   return llvm::ConvergenceControlInst::CreateEntry(*BB);
 }
+
+llvm::ConvergenceControlInst *
+CodeGenFunction::getOrEmitConvergenceAnchorToken(llvm::Function *F) {
+  llvm::BasicBlock *BB = &F->getEntryBlock();
+  llvm::ConvergenceControlInst *Token = llvm::getConvergenceControlDef(*BB);
+  if (Token)
+    return Token;
+
+  // Adding a convergence anchor token requires the function to be marked as
+  // not convergent.
+  F->setNotConvergent();
+  return llvm::ConvergenceControlInst::CreateAnchor(*BB);
+}
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 4d29ceace646f..d9226bdd775a3 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -47,6 +47,7 @@
 #include "llvm/Support/CRC.h"
 #include "llvm/Support/xxhash.h"
 #include "llvm/Transforms/Scalar/LowerExpectIntrinsic.h"
+#include "llvm/Transforms/Utils/FixConvergenceControl.h"
 #include "llvm/Transforms/Utils/PromoteMemToReg.h"
 #include <optional>
 
@@ -371,12 +372,6 @@ void CodeGenFunction::FinishFunction(SourceLocation EndLoc) {
   assert(DeferredDeactivationCleanupStack.empty() &&
          "mismatched activate/deactivate of cleanups!");
 
-  if (CGM.shouldEmitConvergenceTokens()) {
-    ConvergenceTokenStack.pop_back();
-    assert(ConvergenceTokenStack.empty() &&
-           "mismatched push/pop in convergence stack!");
-  }
-
   bool OnlySimpleReturnStmts = NumSimpleReturnExprs > 0
     && NumSimpleReturnExprs == NumReturnExprs
     && ReturnBlock.getBlock()->use_empty();
@@ -1362,8 +1357,13 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
     if (const auto *VecWidth = CurFuncDecl->getAttr<MinVectorWidthAttr>())
       LargestVectorWidth = VecWidth->getVectorWidth();
 
-  if (CGM.shouldEmitConvergenceTokens())
-    ConvergenceTokenStack.push_back(getOrEmitConvergenceEntryToken(CurFn));
+  if (CGM.shouldEmitConvergenceTokens()) {
+    llvm::ConvergenceControlInst *Token =
+        (FD && FD->hasAttr<NoConvergentAttr>())
+            ? getOrEmitConvergenceAnchorToken(CurFn)
+            : getOrEmitConvergenceEntryToken(CurFn);
+    ConvergenceTokenStack.push_back(Token);
+  }
 }
 
 void CodeGenFunction::EmitFunctionBody(const Stmt *Body) {
@@ -1647,6 +1647,13 @@ void CodeGenFunction::GenerateCode(GlobalDecl GD, llvm::Function *Fn,
     }
   }
 
+  if (CGM.shouldEmitConvergenceTokens()) {
+    ConvergenceTokenStack.pop_back();
+    assert(ConvergenceTokenStack.empty() &&
+           "mismatched push/pop in convergence stack!");
+    fixConvergenceControl(CurFn);
+  }
+
   // Emit the standard function epilogue.
   FinishFunction(BodyRange.getEnd());
 
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 9254c7077237f..0d20218f6cbf1 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -5339,15 +5339,24 @@ class CodeGenFunction : public CodeGenTypeCache {
   // as it's parent convergence instr.
   llvm::ConvergenceControlInst *emitConvergenceLoopToken(llvm::BasicBlock *BB);
 
+  // Emits a convergence_anchor instruction for the given |BB|.
+  llvm::ConvergenceControlInst *
+  emitConvergenceAnchorToken(llvm::BasicBlock *BB);
+
   // Adds a convergence_ctrl token with |ParentToken| as parent convergence
   // instr to the call |Input|.
   llvm::CallBase *addConvergenceControlToken(llvm::CallBase *Input);
 
-  // Find the convergence_entry instruction |F|, or emits ones if none exists.
-  // Returns the convergence instruction.
+  // Find the convergence control token in the entry block of |F|, or if none
+  // exists, create an entry token.
   llvm::ConvergenceControlInst *
   getOrEmitConvergenceEntryToken(llvm::Function *F);
 
+  // Find the convergence control token in the entry block of |F|, or if none
+  // exists, create an anchor token.
+  llvm::ConvergenceControlInst *
+  getOrEmitConvergenceAnchorToken(llvm::Function *F);
+
 private:
   llvm::MDNode *getRangeForLoadFromType(QualType Ty);
   void EmitReturnOfRValue(RValue RV, QualType Ty);
diff --git a/clang/lib/CodeGen/CodeGenModule.h b/clang/lib/CodeGen/CodeGenModule.h
index 9a0bc675e0baa..1651c87049df8 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -1751,7 +1751,7 @@ class CodeGenModule : public CodeGenTypeCache {
   bool shouldEmitConvergenceTokens() const {
     // TODO: this should probably become unconditional once the controlled
     // convergence becomes the norm.
-    return getTriple().isSPIRVLogical();
+    return getTriple().isSPIRVLogical() || getLangOpts().ConvergenceControl;
   }
 
   void addUndefinedGlobalForTailCall(
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index b2dd4b3b54869..c9e37548fa835 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -7098,6 +7098,9 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
     if (Args.hasFlag(options::OPT_fhip_new_launch_api,
                      options::OPT_fno_hip_new_launch_api, true))
       CmdArgs.push_back("-fhip-new-launch-api");
+    if (Args.hasFlag(options::OPT_fconvergence_control,
+                     options::OPT_fno_convergence_control, false))
+      CmdArgs.push_back("-fconvergence-control");
     Args.addOptInFlag(CmdArgs, options::OPT_fgpu_allow_device_init,
                       options::OPT_fno_gpu_allow_device_init);
     Args.AddLastArg(CmdArgs, options::OPT_hipstdpar);
diff --git a/clang/lib/Sema/AnalysisBasedWarnings.cpp b/clan...
[truncated]
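
The parent walk that `checkConvergenceOnGoto` and `checkConvergenceForSwitch` use to detect a jump into a `noconvergent` statement can be sketched in isolation. The following is a minimal model, not the code from this patch: `ToyStmt` and `enteredNoConvergent` are hypothetical names, and the sketch assumes that the walk stops at the nearest statement enclosing both the jump and its target (the role that `ParentFrom` and `Switch` appear to play in the diff).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a Clang statement node; not the real clang::Stmt.
struct ToyStmt {
  const ToyStmt *Parent = nullptr; // enclosing statement, nullptr at the root
  bool NoConvergent = false;       // models an AttributedStmt with noconvergent
};

// Walk from the jump target up to (but not including) the statement assumed to
// enclose both the jump and its target. Every noconvergent statement crossed on
// the way is one that the jump enters from outside.
std::vector<const ToyStmt *> enteredNoConvergent(const ToyStmt *To,
                                                 const ToyStmt *CommonParent) {
  std::vector<const ToyStmt *> Entered;
  for (const ToyStmt *I = To; I && I != CommonParent; I = I->Parent)
    if (I->NoConvergent)
      Entered.push_back(I);
  return Entered;
}
```

In this model, a label nested in a `noconvergent` statement is reported when the jump originates outside that statement, and is not reported when the walk stops at the `noconvergent` statement itself, which corresponds to a jump that originates inside it.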
