Skip to content

Commit e3f166b

Browse files
authored
[CIR][CIRGen] Improve emission for array of unions (#1236)
Close #1185 The patch itself seems slightly ad-hoc. As the issue tracked by #1178, the fundamental solution may be to introduce two type systems to solve the inconsistent semantics for Union between LLVM IR and CIR. This will be great to handle other inconsistent semantics between LLVM IR and CIR if any. Back to the patch itself, although the code looks not good initially to me too. But I feel it may be a good workaround since clang/test/CIR/Lowering/union-array.c is an example for array of unions directly and clang/test/CIR/Lowering/nested-union-array.c gives an example for array of unions indirectly (array of structs which contain unions). So I feel we've already covered all the cases. And generally it should be good to use some simple and solid workaround before we introduce the formal full solution.
1 parent 4fb463b commit e3f166b

File tree

2 files changed

+69
-1
lines changed

2 files changed

+69
-1
lines changed

clang/lib/CIR/CodeGen/CIRGenExprConst.cpp

+47-1
Original file line numberDiff line numberDiff line change
@@ -1015,6 +1015,50 @@ class ConstExprEmitter
10151015
return {};
10161016
}
10171017

1018+
auto desiredType = CGM.getTypes().ConvertType(T);
1019+
// FIXME(cir): A hack to handle the emission of arrays of unions directly.
1020+
// See clang/test/CIR/CodeGen/union-array.c and
1021+
// clang/test/CIR/Lowering/nested-union-array.c for example. The root
1022+
// cause of these problems is CIR handles union differently than LLVM IR.
1023+
// So we can't fix the problem fundamentally by mocking LLVM's handling for
1024+
// unions. In LLVM, the union is basically a struct with the largest member
1025+
// of the union and consumers cast the union arbitrarily according to their
1026+
// needs. But in CIR, we tried to express union semantics properly. This is
1027+
// a fundamental difference.
1028+
//
1029+
// Concretely, for the problem here, if we're constructing the initializer
1030+
// for the array of unions, we can't even assume the type of the elements in
1031+
// the initializer are the same! It is odd that we can have an array with
1032+
// different element types. Here we just pretend it is fine by checking if
1033+
// we're constructing an array for an array of unions. If we didn't do so,
1034+
// we may meet problems during lowering to LLVM. To solve the problem, we
1035+
// may need to introduce 2 type systems for CIR: one for the CIR itself and
1036+
// one for lowering. e.g., we can compare the type of CIR during CIRGen,
1037+
// analysis and transformations without worrying the concerns here. And
1038+
// lower to LLVM IR (or anyother dialects) with the proper type.
1039+
//
1040+
// (Although the idea to make CIR's type system self contained and generate
1041+
// LLVM's
1042+
// types in later passes look fine, it has engineering level concern that
1043+
// it will make the skeleton of CIRGen to be diverged from the traditional
1044+
// CodeGen.)
1045+
//
1046+
// Besides union, there are other differences between CIR and LLVM's type
1047+
// system. e.g., LLVM's pointer types are opaque while CIR has concrete
1048+
// pointer types.
1049+
bool isDesiredArrayOfUnionDirectly = [&]() {
1050+
auto desiredArrayType = dyn_cast<cir::ArrayType>(desiredType);
1051+
if (!desiredArrayType)
1052+
return false;
1053+
1054+
auto elementStructType =
1055+
dyn_cast<cir::StructType>(desiredArrayType.getEltType());
1056+
if (!elementStructType)
1057+
return false;
1058+
1059+
return elementStructType.isUnion();
1060+
}();
1061+
10181062
// Emit initializer elements as MLIR attributes and check for common type.
10191063
mlir::Type CommonElementType;
10201064
for (unsigned i = 0; i != NumInitableElts; ++i) {
@@ -1024,10 +1068,12 @@ class ConstExprEmitter
10241068
return {};
10251069
if (i == 0)
10261070
CommonElementType = C.getType();
1071+
else if (isDesiredArrayOfUnionDirectly &&
1072+
C.getType() != CommonElementType)
1073+
CommonElementType = nullptr;
10271074
Elts.push_back(std::move(C));
10281075
}
10291076

1030-
auto desiredType = CGM.getTypes().ConvertType(T);
10311077
auto typedFiller = llvm::dyn_cast_or_null<mlir::TypedAttr>(Filler);
10321078
if (Filler && !typedFiller)
10331079
llvm_unreachable("We shouldn't be receiving untyped attrs here");

clang/test/CIR/CodeGen/union-array.c

+22
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fclangir -emit-cir -fno-clangir-call-conv-lowering %s -o %t.cir
2+
// RUN: FileCheck --input-file=%t.cir %s --check-prefix=CIR
3+
// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fclangir -emit-llvm -fno-clangir-call-conv-lowering %s -o %t.ll
4+
// RUN: FileCheck --input-file=%t.ll %s --check-prefix=LLVM
5+
6+
typedef struct {
7+
char a;
8+
} S_1;
9+
10+
typedef struct {
11+
long a, b;
12+
} S_2;
13+
14+
typedef union {
15+
S_1 a;
16+
S_2 b;
17+
} U;
18+
19+
void foo() { U arr[2] = {{.b = {1, 2}}, {.a = {1}}}; }
20+
21+
// CIR: cir.const #cir.const_struct<{#cir.const_struct<{#cir.const_struct<{#cir.int<1> : !s64i, #cir.int<2> : !s64i}> : {{.*}}}> : {{.*}}, #cir.const_struct<{#cir.const_struct<{#cir.int<1> : !s8i}> : {{.*}}, #cir.const_array<[#cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i, #cir.zero : !u8i]> : !cir.array<!u8i x 15>}>
22+
// LLVM: store { { %struct.S_2 }, { %struct.S_1, [15 x i8] } } { { %struct.S_2 } { %struct.S_2 { i64 1, i64 2 } }, { %struct.S_1, [15 x i8] } { %struct.S_1 { i8 1 }, [15 x i8] zeroinitializer } }

0 commit comments

Comments
 (0)