Description
This code
extern void *memcpy(void *, const void *, unsigned long);
#define N 4
struct outer {
unsigned int foo[N];
void *force_align; // Force pointer alignment on outer
};
void f(struct outer *o, int *data)
{
struct outer onstack;
memcpy(&onstack.foo, data, N * sizeof(int));
memcpy(o->foo, &onstack.foo, N * sizeof(int));
}
when compiled with -O2 and CHERI enabled, results in the following warning:
fail.c:8:6: warning: found underaligned load of capability type (aligned to 4 bytes instead of 16). Will use memcpy() instead of capability load to preserve tags if it is aligned correctly at runtime [-Wcheri-inefficient]
8 | void f(struct outer *o, int *data)
| ^
fail.c:8:6: note: use __builtin_assume_aligned() or cast to (u)intptr_t* if you know that the pointer is actually align
This is obviously bogus because no capability is moved in this code.
What is worse is that the generated code can be considered broken if N is larger.
The net effect of the SROA pass as currently implemented is that the on-stack data is forcibly sliced into aligned 16-byte slices. As a result, SROA "optimizes" the two memcpy operations into capability moves, one for each slice. This is independent of the structure size, i.e. there is no fallback to memcpy() for larger structs. Of course, none of these capability moves actually makes it into the final code, because a later stage notices that the alignment guarantee is insufficient. At that point each individual capability move is turned into a memcpy(), resulting in the above warning.
So the final code will contain N/4 individual calls to memcpy and produce the same number of warnings.
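To make the N/4 figure concrete, here is a minimal sketch of the arithmetic. It assumes a 4-byte unsigned int and the 16-byte slice width quoted in the warning. The helper name slice_count is hypothetical, and rounding up for N that is not a multiple of 4 is an assumption, not something verified against SROA.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed slice width: 16 bytes, the capability size quoted in the warning. */
#define SLICE_BYTES 16u

/*
 * Hypothetical helper: number of 16-byte slices needed to cover an array of
 * n unsigned ints. Rounding up is an assumption for n not divisible by 4.
 */
static size_t slice_count(size_t n)
{
    size_t bytes = n * sizeof(unsigned int);

    assert(SLICE_BYTES > 0);
    return (bytes + SLICE_BYTES - 1) / SLICE_BYTES;
}
```

Under these assumptions N = 4 gives one slice, and N = 64 would give 16 separate memcpy calls and 16 warnings.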
Expected behaviour would be
- No warnings for this code
- memcpy() operations are not split
- For larger structs there is a fallback to memcpy().
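For reference, the observable effect of f() is a plain byte copy of N ints from data into o->foo, and no capability is involved. The sketch below is only an illustration of that effect, not a statement of what the compiler is required to emit. The name f_expected is hypothetical, and memmove is used because copying through a temporary stays well defined even if the two buffers overlap.

```c
#include <string.h>

#define N 4

struct outer {
    unsigned int foo[N];
    void *force_align; /* Force pointer alignment on outer */
};

/*
 * Hypothetical reference version: one unsplit copy of N ints.
 * Copying through an on-stack temporary is equivalent to memmove.
 */
void f_expected(struct outer *o, const int *data)
{
    memmove(o->foo, data, N * sizeof(int));
}
```

A single copy like this needs no alignment beyond that of int, which is why no -Wcheri-inefficient warning would be expected for it.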