[flang][acc] Use non-boxed value for assumed-size in acc data clause #129804
Assumed-size arrays end up looking as follows at the HLFIR level:
```
func.func @_QPsub(%arg0: !fir.ref<!fir.array<?xf64>> {fir.bindc_name = "arr"}) {
  ...
  %1 = fir.shape %c-1 : (index) -> !fir.shape<1>
  %2:2 = hlfir.declare %arg0(%1) dummy_scope %0 {uniq_name = "_QFsubEarr"} : (!fir.ref<!fir.array<?xf64>>, !fir.shape<1>, !fir.dscope) -> (!fir.box<!fir.array<?xf64>>, !fir.ref<!fir.array<?xf64>>)
```
The declare operation produces two results: the entity with its Fortran properties (wrapped in a box) and the raw data pointer. The current acc lowering uses the box value.

During ConvertHLFIRtoFIR, this forces materialization of a descriptor even though the descriptor holds no useful extent information (the extent is -1). Other operations, such as those that index the array, access the raw pointer directly and thus do not force materialization of the descriptor.

Since the descriptor contains nothing useful and the acc dialect can accept a pointer to the data, make the acc lowering use the raw pointer as well. This is a useful property because, without any acc data clauses, an assumed-size array typically becomes live-in via the raw pointer. With this change, acc lowering no longer forces materialization of the descriptor.
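The selection order that `getBaseAddr` in the diff applies can be modeled without any flang or MLIR dependencies. The sketch below is only an illustration under stated assumptions: `ModelValue`, `ModelInfo`, `ModelSymbol`, and `pickBaseAddr` are hypothetical names that do not exist in flang, and types are reduced to plain strings standing in for the result of `fir::unwrapRefType`.

```cpp
#include <string>

// Hypothetical stand-ins for the real flang/MLIR types (not flang APIs).
// A value is modeled as an SSA name plus its type after stripping !fir.ref.
struct ModelValue {
  std::string name;          // e.g. "%arr#0"
  std::string unwrappedType; // e.g. "box" or "array"
};

// Models the two fields of fir::factory::AddrAndBoundsInfo used by the diff.
struct ModelInfo {
  ModelValue addr;     // processed address (may carry Fortran properties)
  ModelValue rawInput; // raw data pointer
};

// Models the two symbol queries used by the diff.
struct ModelSymbol {
  bool isAssumedSize = false; // stands in for IsAssumedSizeArray(symbol)
  bool isOptional = false;    // stands in for IsOptional(symbol)
};

// Mirrors the order of checks in getBaseAddr from the diff.
ModelValue pickBaseAddr(const ModelSymbol &symbol, const ModelInfo &info) {
  // Assumed-size: the box carries no useful extent, so use the raw pointer.
  if (symbol.isAssumedSize)
    return info.rawInput;
  // Optional: use the raw input only when the processed address has a
  // different (unwrapped) type than the raw input.
  if (symbol.isOptional &&
      info.addr.unwrappedType != info.rawInput.unwrappedType)
    return info.rawInput;
  // Default: use the processed address.
  return info.addr;
}
```

In this model, an assumed-size symbol always yields `rawInput`, which corresponds to result `#1` of `hlfir.declare` in the IR above; an optional symbol yields `rawInput` only when the two unwrapped types differ; everything else falls through to `addr`.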
@llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-openacc
Author: Razvan Lupusoru (razvanlupusoru)
Full diff: https://github.com/llvm/llvm-project/pull/129804.diff
2 Files Affected:
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 3dd35ed9ae481..027e8d29a0515 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -369,6 +369,55 @@ getSymbolFromAccObject(const Fortran::parser::AccObject &accObject) {
llvm::report_fatal_error("Could not find symbol");
}
+static mlir::Value getBaseAddr(Fortran::semantics::Symbol &symbol,
+ const fir::factory::AddrAndBoundsInfo &info) {
+ if (Fortran::semantics::IsAssumedSizeArray(symbol)) {
+ // Assumed-size arrays in FIR are represented as:
+ // func.func @func(%arg0: !fir.ref<!fir.array<?xf64>> {fir.bindc_name = "arr"}) {
+ // %arr:2 = hlfir.declare %arg0(%shape) ... -> (!fir.box<!fir.array<?xf64>>, !fir.ref<!fir.array<?xf64>>)
+ // The `rawInput` refers to the #1 output of the `hlfir.declare` operation.
+ // This is preferred since the Fortran variable properties does not contain
+ // any useful size information.
+ return info.rawInput;
+ }
+
+ if (Fortran::semantics::IsOptional(symbol)) {
+ // When there is an optional argument for which there is a possibility
+ // to create a descriptor, pick the rawInput instead. This is done to
+ // avoid materializing the descriptor which leads to following pattern
+ // generated at the FIR level which adds an extra indirection that makes
+ // recovering original variable not evident.
+ // This is the pattern we want to avoid to be generated:
+ // %1 = fir.declare %arg0 ... {fortran_attrs = #fir.var_attrs<optional>, uniq_name = "_QFsub1Eassumedshapeoptarr"} : (!fir.box<!fir.array<?xf32>>, !fir.dscope) -> !fir.box<!fir.array<?xf32>>
+ // %2 = fir.is_present %1 : (!fir.box<!fir.array<?xf32>>) -> i1
+ // %3 = fir.if %2 -> (!fir.box<!fir.array<?xf32>>) {
+ // %5 = fir.rebox %1 : (!fir.box<!fir.array<?xf32>>) -> !fir.box<!fir.array<?xf32>>
+ // fir.result %5 : !fir.box<!fir.array<?xf32>>
+ // } else {
+ // %5 = fir.absent !fir.box<!fir.array<?xf32>>
+ // fir.result %5 : !fir.box<!fir.array<?xf32>>
+ // }
+ // %4 = acc.copyin var(%3 : !fir.box<!fir.array<?xf32>>) ...
+ //
+ // Instead by picking the rawInput we get the following pattern:
+ // %1 = fir.declare %arg0 ... {fortran_attrs = #fir.var_attrs<optional>, uniq_name = "_QFsub1Eassumedshapeoptarr"} : (!fir.box<!fir.array<?xf32>>, !fir.dscope) -> !fir.box<!fir.array<?xf32>>
+ // %2 = acc.copyin var(%2 : !fir.box<!fir.array<?xf32>>) ...
+ if (fir::unwrapRefType(info.addr.getType()) !=
+ fir::unwrapRefType(info.rawInput.getType())) {
+ return info.rawInput;
+ }
+ }
+
+ // The `addr` field refers to the address of the Fortran entity, but with the
+ // ssa value that when lowered to FIR will include the tied Fortran variable
+ // properties. Additionally, in cases where `unwrapFirBox` is requested,
+ // it refers to the address of the data (either result of fir.box_addr or
+ // result of `fir.if` in case of optional).
+ // Therefore, use the processed address in all cases by default unless it was
+ // deemed through the earlier checks in this routine that it is not useful.
+ return info.addr;
+}
+
template <typename Op>
static void
genDataOperandOperations(const Fortran::parser::AccObjectList &objectList,
@@ -399,13 +448,7 @@ genDataOperandOperations(const Fortran::parser::AccObjectList &objectList,
/*genDefaultBounds=*/generateDefaultBounds);
LLVM_DEBUG(llvm::dbgs() << __func__ << "\n"; info.dump(llvm::dbgs()));
- // If the input value is optional and is not a descriptor, we use the
- // rawInput directly.
- mlir::Value baseAddr = ((fir::unwrapRefType(info.addr.getType()) !=
- fir::unwrapRefType(info.rawInput.getType())) &&
- info.isPresent)
- ? info.rawInput
- : info.addr;
+ mlir::Value baseAddr = getBaseAddr(symbol, info);
Op op = createDataEntryOp<Op>(
builder, operandLocation, baseAddr, asFortran, bounds, structured,
implicit, dataClause, baseAddr.getType(), async, asyncDeviceTypes,
diff --git a/flang/test/Lower/OpenACC/acc-bounds.f90 b/flang/test/Lower/OpenACC/acc-bounds.f90
index 8fea357f116a2..e333cdff122a2 100644
--- a/flang/test/Lower/OpenACC/acc-bounds.f90
+++ b/flang/test/Lower/OpenACC/acc-bounds.f90
@@ -92,8 +92,7 @@ subroutine acc_undefined_extent(a)
! CHECK: %[[DIMS0:.*]]:3 = fir.box_dims %[[DECL_ARG0]]#0, %c0{{.*}} : (!fir.box<!fir.array<?xf32>>, index) -> (index, index, index)
! CHECK: %[[UB:.*]] = arith.subi %[[DIMS0]]#1, %c1{{.*}} : index
! CHECK: %[[BOUND:.*]] = acc.bounds lowerbound(%c0{{.*}} : index) upperbound(%[[UB]] : index) extent(%[[DIMS0]]#1 : index) stride(%[[DIMS0]]#2 : index) startIdx(%c1{{.*}} : index) {strideInBytes = true}
-! CHECK: %[[ADDR:.*]] = fir.box_addr %[[DECL_ARG0]]#0 : (!fir.box<!fir.array<?xf32>>) -> !fir.ref<!fir.array<?xf32>>
-! CHECK: %[[PRESENT:.*]] = acc.present varPtr(%[[ADDR]] : !fir.ref<!fir.array<?xf32>>) bounds(%[[BOUND]]) -> !fir.ref<!fir.array<?xf32>> {name = "a"}
+! CHECK: %[[PRESENT:.*]] = acc.present varPtr(%[[DECL_ARG0]]#1 : !fir.ref<!fir.array<?xf32>>) bounds(%[[BOUND]]) -> !fir.ref<!fir.array<?xf32>> {name = "a"}
! CHECK: acc.kernels dataOperands(%[[PRESENT]] : !fir.ref<!fir.array<?xf32>>)
subroutine acc_multi_strides(a)
You can test this locally with the following command:
git-clang-format --diff e697c99b63224069daa3814f536a69fecab8cd4e 34fcdd0b563f3fd29591fa459de1ed7309a58b2b --extensions cpp -- flang/lib/Lower/OpenACC.cpp
View the diff from clang-format here.
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index 027e8d29a0..ca21bbc02d 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -373,8 +373,10 @@ static mlir::Value getBaseAddr(Fortran::semantics::Symbol &symbol,
const fir::factory::AddrAndBoundsInfo &info) {
if (Fortran::semantics::IsAssumedSizeArray(symbol)) {
// Assumed-size arrays in FIR are represented as:
- // func.func @func(%arg0: !fir.ref<!fir.array<?xf64>> {fir.bindc_name = "arr"}) {
- // %arr:2 = hlfir.declare %arg0(%shape) ... -> (!fir.box<!fir.array<?xf64>>, !fir.ref<!fir.array<?xf64>>)
+ // func.func @func(%arg0: !fir.ref<!fir.array<?xf64>> {fir.bindc_name =
+ // "arr"}) {
+ // %arr:2 = hlfir.declare %arg0(%shape) ... ->
+ // (!fir.box<!fir.array<?xf64>>, !fir.ref<!fir.array<?xf64>>)
// The `rawInput` refers to the #1 output of the `hlfir.declare` operation.
// This is preferred since the Fortran variable properties does not contain
// any useful size information.
@@ -388,11 +390,15 @@ static mlir::Value getBaseAddr(Fortran::semantics::Symbol &symbol,
// generated at the FIR level which adds an extra indirection that makes
// recovering original variable not evident.
// This is the pattern we want to avoid to be generated:
- // %1 = fir.declare %arg0 ... {fortran_attrs = #fir.var_attrs<optional>, uniq_name = "_QFsub1Eassumedshapeoptarr"} : (!fir.box<!fir.array<?xf32>>, !fir.dscope) -> !fir.box<!fir.array<?xf32>>
- // %2 = fir.is_present %1 : (!fir.box<!fir.array<?xf32>>) -> i1
- // %3 = fir.if %2 -> (!fir.box<!fir.array<?xf32>>) {
- // %5 = fir.rebox %1 : (!fir.box<!fir.array<?xf32>>) -> !fir.box<!fir.array<?xf32>>
- // fir.result %5 : !fir.box<!fir.array<?xf32>>
+ // %1 = fir.declare %arg0 ... {fortran_attrs = #fir.var_attrs<optional>,
+ // uniq_name = "_QFsub1Eassumedshapeoptarr"} :
+ // (!fir.box<!fir.array<?xf32>>, !fir.dscope) ->
+ // !fir.box<!fir.array<?xf32>> %2 = fir.is_present %1 :
+ // (!fir.box<!fir.array<?xf32>>) -> i1 %3 = fir.if %2 ->
+ // (!fir.box<!fir.array<?xf32>>) {
+ // %5 = fir.rebox %1 : (!fir.box<!fir.array<?xf32>>) ->
+ // !fir.box<!fir.array<?xf32>> fir.result %5 :
+ // !fir.box<!fir.array<?xf32>>
// } else {
// %5 = fir.absent !fir.box<!fir.array<?xf32>>
// fir.result %5 : !fir.box<!fir.array<?xf32>>
@@ -400,8 +406,11 @@ static mlir::Value getBaseAddr(Fortran::semantics::Symbol &symbol,
// %4 = acc.copyin var(%3 : !fir.box<!fir.array<?xf32>>) ...
//
// Instead by picking the rawInput we get the following pattern:
- // %1 = fir.declare %arg0 ... {fortran_attrs = #fir.var_attrs<optional>, uniq_name = "_QFsub1Eassumedshapeoptarr"} : (!fir.box<!fir.array<?xf32>>, !fir.dscope) -> !fir.box<!fir.array<?xf32>>
- // %2 = acc.copyin var(%2 : !fir.box<!fir.array<?xf32>>) ...
+ // %1 = fir.declare %arg0 ... {fortran_attrs = #fir.var_attrs<optional>,
+ // uniq_name = "_QFsub1Eassumedshapeoptarr"} :
+ // (!fir.box<!fir.array<?xf32>>, !fir.dscope) ->
+ // !fir.box<!fir.array<?xf32>> %2 = acc.copyin var(%2 :
+ // !fir.box<!fir.array<?xf32>>) ...
if (fir::unwrapRefType(info.addr.getType()) !=
fir::unwrapRefType(info.rawInput.getType())) {
return info.rawInput;