[CIR][ABI][AArch64][Lowering] Fix calls for struct types > 128 bits #1335

Merged 1 commit on Feb 12, 2025
@@ -1166,7 +1166,13 @@ mlir::Value LowerFunction::rewriteCallOp(const LowerFunctionInfo &CallInfo,
if (::cir::MissingFeatures::undef())
cir_cconv_unreachable("NYI");

IRCallArgs[FirstIRArg] = alloca;
// TODO(cir): add check for cases where we don't need the memcpy
Member

I'm a bit worried about how we're going to remember to tackle this later. Is there any major issue that prevents it from being handled right away?

Contributor Author

Unfortunately, there isn't enough information in CIR currently to determine when we don't need the copy. I plan to add these checks incrementally. Also, if it makes it any better, the cases where we don't need the copy are much rarer than the cases where we do :-)

Member

Incremental is fine. I'm mostly curious about the C source that leads to the case where we don't need a copy (I'm assuming you had a specific case in mind when you wrote that comment?).

Contributor Author

ah, the reason for the comment is here

auto tmpAlloca = createTmpAlloca(
*this, alloca.getLoc(),
mlir::cast<PointerType>(alloca.getType()).getPointee());
auto tySize = LM.getDataLayout().getTypeAllocSize(I->getType());
createMemCpy(*this, tmpAlloca, alloca, tySize.getFixedValue());
IRCallArgs[FirstIRArg] = tmpAlloca;

// NOTE(cir): Skipping Emissions, lifetime markers.

24 changes: 14 additions & 10 deletions clang/test/CIR/CallConvLowering/AArch64/aarch64-cc-structs.c
@@ -171,20 +171,24 @@ GT_128 get_gt_128(GT_128 s) {
}

// CHECK: cir.func no_proto @call_and_get_gt_128(%arg0: !cir.ptr<!ty_GT_128_>
// CHECK: %[[#V0:]] = cir.alloca !ty_GT_128_, !cir.ptr<!ty_GT_128_>, {{.*}} {alignment = 8 : i64}
// CHECK: %[[#V1:]] = cir.alloca !ty_GT_128_, !cir.ptr<!ty_GT_128_>, {{.*}} {alignment = 8 : i64}
// CHECK: cir.call @get_gt_128(%[[#V1]], %arg0) : (!cir.ptr<!ty_GT_128_>, !cir.ptr<!ty_GT_128_>) -> ()
// CHECK: %[[#V2:]] = cir.load %[[#V1]] : !cir.ptr<!ty_GT_128_>, !ty_GT_128_
// CHECK: cir.store %[[#V2]], %[[#V0]] : !ty_GT_128_, !cir.ptr<!ty_GT_128_>
// CHECK: %[[#V0:]] = cir.alloca !ty_GT_128_, !cir.ptr<!ty_GT_128_>, ["tmp"] {alignment = 8 : i64}
// CHECK: %[[#V1:]] = cir.load %arg0 : !cir.ptr<!ty_GT_128_>, !ty_GT_128_
// CHECK: %[[#V2:]] = cir.alloca !ty_GT_128_, !cir.ptr<!ty_GT_128_>, [""] {alignment = 8 : i64}
// CHECK: %[[#V3:]] = cir.alloca !ty_GT_128_, !cir.ptr<!ty_GT_128_>, ["tmp"] {alignment = 8 : i64}
// CHECK: %[[#V4:]] = cir.cast(bitcast, %arg0 : !cir.ptr<!ty_GT_128_>), !cir.ptr<!void>
// CHECK: %[[#V5:]] = cir.cast(bitcast, %[[#V3]] : !cir.ptr<!ty_GT_128_>), !cir.ptr<!void>
// CHECK: %[[#V6:]] = cir.const #cir.int<24> : !u64i
// CHECK: cir.libc.memcpy %[[#V6]] bytes from %[[#V4]] to %[[#V5]] : !u64i, !cir.ptr<!void> -> !cir.ptr<!void>
// CHECK: cir.call @get_gt_128(%[[#V2]], %[[#V3]]) : (!cir.ptr<!ty_GT_128_>, !cir.ptr<!ty_GT_128_>) -> ()
// CHECK: cir.return

// LLVM: void @call_and_get_gt_128(ptr %[[#V0:]])
// LLVM: %[[#V2:]] = alloca %struct.GT_128, i64 1, align 8
// LLVM: %[[#V3:]] = alloca %struct.GT_128, i64 1, align 8
// LLVM: call void @get_gt_128(ptr %[[#V3]], ptr %[[#V0]])
// LLVM: %[[#V4:]] = load %struct.GT_128, ptr %[[#V3]], align 8
// LLVM: store %struct.GT_128 %[[#V4]], ptr %[[#V2]], align 8
// LLVM: ret void
// LLVM: %[[#V3:]] = load %struct.GT_128, ptr %[[#V0]], align 8
// LLVM: %[[#V4:]] = alloca %struct.GT_128, i64 1, align 8
// LLVM: %[[#V5:]] = alloca %struct.GT_128, i64 1, align 8
// LLVM: call void @llvm.memcpy.p0.p0.i64(ptr %[[#V5]], ptr %[[#V0]], i64 24, i1 false)
// LLVM: call void @get_gt_128(ptr %[[#V4]], ptr %[[#V5]])
GT_128 call_and_get_gt_128() {
GT_128 s;
s = get_gt_128(s);