-
Notifications
You must be signed in to change notification settings - Fork 3
Description
When C programs have local arrays created within a function, LLVM sometimes generates these arrays by first allocating space, then doing a memset into the allocated space. The current method of backtracking from a store instruction during the translation from LLVM to Egg terms fails to track the memset instructions, because memset instructions do not write to a register, but instead use the memory location as an argument. Sometimes memcopy instructions are also generated as well.
Here is a minimal test case that generates memset, without memcopy instructions:
void test(float A[SIZE]) {
for (int i = 0; i < SIZE; i++) {
float x[SIZE] = {0.0f};
for (int j = 0; j < SIZE; j++) {
x[j] = 1.0f;
}
float sum = 0.0f;
for (int j = 0; j < SIZE; j++) {
sum += x[j];
}
A[i] = sum;
}
}
and the clang code it generates, after several optimizations were run:
; Function Attrs: noinline nounwind ssp uwtable
define void @test(float* %0) #0 {
.preheader:
%1 = alloca i64, align 8
%tmpcast = bitcast i64* %1 to [2 x float]*
%2 = bitcast i64* %1 to i8*
%3 = bitcast i64* %1 to float*
store i64 0, i64* %1, align 8
call void @memset_pattern16(i8* nonnull %2, i8* bitcast ([4 x float]* @.memset_pattern to i8*), i64 8) #4
%4 = load float, float* %3, align 8
%5 = fadd float %4, 0.000000e+00
%6 = getelementptr inbounds [2 x float], [2 x float]* %tmpcast, i64 0, i64 1
%7 = load float, float* %6, align 4
%8 = fadd float %5, %7
store float %8, float* %0, align 4
store i64 0, i64* %1, align 8
call void @memset_pattern16(i8* nonnull %2, i8* bitcast ([4 x float]* @.memset_pattern to i8*), i64 8) #4
%9 = load float, float* %3, align 8
%10 = fadd float %9, 0.000000e+00
%11 = load float, float* %6, align 4
%12 = fadd float %10, %11
%13 = getelementptr inbounds float, float* %0, i64 1
store float %12, float* %13, align 4
ret void
}
The Diospyros pass walks back from the store after the instruction definition %13, and recursively looks at arguments of instructions, beginning with the %12 and %13 referenced in the store. Because no instruction references memset (memset has no destination register), the current pass implementation fails to pick up memset.
To fix this issue, I am planning to treat any memset/memcopy instruction like a store, because it has the same basic function as a store. Where the Diospyros pass walks back from stores, it should also walk back from stores/memset/memcopy instructions, and translate recursively backwards into an Egg term. This change will allow Egg expressions to contain memset nodes in the correct position.