Description
The behavior of 3C on structs is odd when you run it on preprocessed code. This issue investigates what might be going on.
struct defs are local ...
a.c:
struct foo { int *a; };
void bar(struct foo *p) { p->a = 0; }
b.c:
struct foo { int *a; };
void baz(struct foo *p) { p->a = (int *)1; }
If we call 3c -output-postfix=checked a.c b.c then the result will be that a.c's version of struct foo will have _Ptr<int> a whereas b.c's version will have int *a. This could be viewed as fine: these are two local struct defs that happen to have the same name.
... even if there is data flow
But suppose I change b.c to be b2.c:
struct foo { int *a; };
extern void bar(struct foo *);
void baz(struct foo *p) { p->a = (int *)1; bar(p); }
Now we would prefer a.c's version to be the same as b2.c's version, but it is not: the rewriting is the same as in the first example, where a.c's version has _Ptr<int> a for the field, but the b2.c version does not.
So this is a bug.
a shared header unifies uses
If I change the original a.c and b.c to share a header, that also serves to unify them, even if there is no dataflow.
foo.h:
struct foo { int *a; };
a3.c:
#include "foo.h"
void bar(struct foo *p) { p->a = 0; }
b3.c:
#include "foo.h"
void baz(struct foo *p) { p->a = (int *)1; }
The result here will be int *a in foo.h (i.e., the original foo.h is not changed). The same would be true if we made b3.c like b2.c, calling external bar, too.
... unless it's preprocessed
If I run clang -E on a3.c and b3.c, the struct def included from foo.h is inlined in each, e.g., as follows:
# 1 "b3.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 366 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "b3.c" 2
# 1 "./foo.h" 1
struct foo {
int *a;
};
# 2 "b3.c" 2
void baz(struct foo *p) {
p->a = (int *)1;
}
When I run 3c on these already-preprocessed files, it treats the struct definitions independently, just as was the case for a.c and b2.c, above. In other words, the presence of linemarkers such as # 1 "b3.c" in the file does not convince clang/3c that there is just a single definition in two places: it is treated as two definitions.
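The linemarkers do record where each inlined chunk came from: in the output above, # 1 "./foo.h" 1 precedes the struct definition. The following is a minimal sketch, independent of 3C and not a description of how clang tracks locations internally, of reading that origin out of a linemarker. The helper name parse_linemarker and its signature are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>  /* for strcmp in callers that compare file names */

/* Hypothetical helper: parse a preprocessor linemarker of the form
 *     # <line> "<file>" [flags...]
 * On success, store the line number and file name and return 1.
 * Return 0 if text is not a linemarker (e.g. ordinary code). */
int parse_linemarker(const char *text, int *line, char *file, size_t file_cap) {
    char name[256];
    int parsed_line = 0;

    assert(file_cap > 0);
    /* Both the line number and the quoted name must match. */
    if (sscanf(text, " # %d \"%255[^\"]\"", &parsed_line, name) != 2)
        return 0;

    *line = parsed_line;
    snprintf(file, file_cap, "%s", name);  /* truncates if file_cap is small */
    return 1;
}
```

Applied to the preprocessed a3.c and b3.c, both copies of struct foo would be preceded by a marker naming ./foo.h, which is the information that is currently not used to tie the two definitions together.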
what to do?
There are a couple of ways I can imagine fixing the bug:
- The simplest way is to treat struct definitions as global. We could have a map like we do for functions from the type name to the constraint info. This will be safe, but potentially conservative.
- Another way is to look for dataflow, as in the a.c and b2.c case. This dataflow will signal that we should unify the two definitions. I think what's happening now is that statements like bar(p) in b2.c do not look inside p's type as a struct foo, essentially assuming that if it's a struct then it has global scope. Instead, perhaps the code is able to figure out that p in the above is referring to one struct foo definition while the callee is referring to a different struct foo, and these two should be unified. This unification could possibly take place at the same time we are unifying various prototypes for functions. Or it could happen when processing call expressions.
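The first option could be sketched roughly as follows. This is only an illustration under assumptions, not 3C's actual data structures: struct_info, struct_table, lookup_struct and record_struct are hypothetical names, and a single boolean per field stands in for the real constraint info. Each definition seen in any file is merged into one global entry keyed by the struct tag, and a field stays checked only if every definition allows it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_STRUCTS 16
#define MAX_FIELDS 8

/* Hypothetical stand-in for per-struct constraint info (not 3C's real type).
 * field_checked[i] is true when field i may be rewritten to a checked pointer. */
struct struct_info {
    const char *tag;  /* struct tag such as "foo"; caller keeps the string alive */
    size_t nfields;
    bool field_checked[MAX_FIELDS];
};

struct struct_info struct_table[MAX_STRUCTS];
size_t struct_table_len = 0;

/* Return the single global entry for a struct tag, or NULL if unseen. */
struct struct_info *lookup_struct(const char *tag) {
    for (size_t i = 0; i < struct_table_len; i++) {
        if (strcmp(struct_table[i].tag, tag) == 0)
            return &struct_table[i];
    }
    return NULL;
}

/* Record one definition of a struct as seen in some file. A repeated tag is
 * merged conservatively: a field remains checked only if all definitions
 * seen so far agree that it can be. */
struct struct_info *record_struct(const char *tag, const bool *checked,
                                  size_t nfields) {
    assert(nfields <= MAX_FIELDS);
    struct struct_info *info = lookup_struct(tag);
    if (info == NULL) {
        assert(struct_table_len < MAX_STRUCTS);
        info = &struct_table[struct_table_len++];
        info->tag = tag;
        info->nfields = nfields;
        for (size_t i = 0; i < nfields; i++)
            info->field_checked[i] = checked[i];
        return info;
    }
    assert(info->nfields == nfields);  /* sketch assumes matching layouts */
    for (size_t i = 0; i < nfields; i++)
        info->field_checked[i] = info->field_checked[i] && checked[i];
    return info;
}
```

Recording foo with a checked field for a.c and then with an unchecked field for b.c leaves the shared entry unchecked, so both files would keep int *a, the same outcome as the shared-header case. The second option would presumably apply the same merge, but only to definitions linked by a call or prototype.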