
structs are not global : good or bad (relates to preprocessor, in part) ? #499

Open
@mwhicks1

The behavior of 3C on structs is odd when you run on preprocessed code. This issue investigates what might be going on.

struct defs are local ...

a.c:

struct foo { int *a; };
void bar(struct foo *p) { p->a = 0; }

b.c:

struct foo { int *a; };
void baz(struct foo *p) { p->a = (int *)1; }

If we call 3c -output-postfix=checked a.c b.c then the result will be that a.c's version of struct foo will have _Ptr<int> a whereas b.c's version will have int *a. This could be viewed as fine: These are two local struct defs that happen to have the same name.

... even if there is data flow

But, suppose I change b.c to be b2.c:

struct foo { int *a; };
extern void bar(struct foo *);
void baz(struct foo *p) { p->a = (int *)1; bar(p); }

Now we would prefer a.c's version to be the same as b2.c's version, since the same struct foo object flows from baz to bar. But it is not: the rewriting is the same as in the first example, where a.c's version has _Ptr<int> a for the field but b2.c's version does not.

So this is a bug.

a shared header unifies uses

If I change the original a.c and b.c to share a header, that also serves to unify them, even if there is no dataflow.

foo.h:

struct foo { int *a; };

a3.c:

#include "foo.h"
void bar(struct foo *p) { p->a = 0; }

b3.c:

#include "foo.h"
void baz(struct foo *p) { p->a = (int *)1; }

The result here will be int *a in foo.h (i.e., the original foo.h is not changed). The same would be true if we made b3.c like b2.c, calling external bar, too.

... unless it's preprocessed

If I run clang -E on a3.c and b3.c, the struct def from foo.h is inlined in each, e.g., as follows:

# 1 "b3.c"
# 1 "<built-in>" 1
# 1 "<built-in>" 3
# 366 "<built-in>" 3
# 1 "<command line>" 1
# 1 "<built-in>" 2
# 1 "b3.c" 2
# 1 "./foo.h" 1
struct foo {
  int *a;
};
# 2 "b3.c" 2
void baz(struct foo *p) {
  p->a = (int *)1;
}

When I run 3c on these already-preprocessed files, it treats the struct definitions independently, just as was the case for a.c and b2.c, above. In other words, the presence of the linemarkers (# 1 "b3.c", # 1 "./foo.h" 1, etc.) in the file does not convince clang/3c that there is just a single definition appearing in two places; it is treated as two definitions.
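To make the linemarker point concrete, here is a minimal sketch (not 3C or clang code) of how the originating file of a definition could be recovered from clang -E output. The function names parse_linemarker and presumed_file_of are hypothetical, and the sketch assumes the usual `# <line> "<file>" [flags]` linemarker shape shown above. Whether 3C should key struct definitions on this presumed file is a separate question that this issue does not settle.

```c
#include <stdio.h>
#include <string.h>

/* Parse a linemarker of the form:  # <line> "<file>" [flags]
   Returns 1 and fills *line and file on success, 0 otherwise.
   (Hypothetical helper, for illustration only.) */
int parse_linemarker(const char *text, int *line, char *file, size_t cap)
{
    char name[256];
    if (sscanf(text, " # %d \"%255[^\"]\"", line, name) != 2)
        return 0;
    snprintf(file, cap, "%s", name);
    return 1;
}

/* Scan preprocessed source and report which file the most recent
   linemarker named at the first line containing `needle`.
   Returns 1 if `needle` was found, 0 otherwise. */
int presumed_file_of(const char *src, const char *needle,
                     char *out, size_t cap)
{
    char current[256] = "";
    char linebuf[512];

    while (*src) {
        size_t full = strcspn(src, "\n");   /* length of this line */
        size_t len = full < sizeof linebuf ? full : sizeof linebuf - 1;
        int lineno;

        memcpy(linebuf, src, len);
        linebuf[len] = '\0';

        /* A linemarker updates `current`; any other line is searched. */
        if (!parse_linemarker(linebuf, &lineno, current, sizeof current)
            && strstr(linebuf, needle)) {
            snprintf(out, cap, "%s", current);
            return 1;
        }

        src += full;
        if (*src == '\n')
            src++;
    }
    return 0;
}
```

Run over the preprocessed a3.c and b3.c, presumed_file_of(text, "struct foo {", ...) would report ./foo.h for both, which is the information a tool would need in order to recognize the two copies as one definition.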

what to do?

There are a couple of ways I can imagine fixing the bug:

  • The simplest way is to treat struct definitions as global. We could keep a map from the type name to the constraint info, as we already do for functions. This will be safe, but potentially conservative.

  • Another way is to look for dataflow, as in the a.c and b2.c case. Such dataflow signals that we should unify the two definitions. I think what's happening now is that statements like bar(p) in b2.c do not look inside p's type as a struct foo, essentially assuming that if it's a struct then it has global scope. Instead, perhaps the code could figure out that p refers to one struct foo definition while the callee's parameter refers to a different struct foo, and that these two should be unified. This unification could take place at the same time we unify the various prototypes for functions, or it could happen when processing call expressions.
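The difference between the two options can be sketched with a toy union-find over struct definitions. This is only an illustration under simplifying assumptions, not 3C's actual data structures: every name here (struct_def, add_def, unify, unify_by_tag, field_decl) is hypothetical, each definition carries a single "wild" flag standing in for the constraint info on field a, and unifying two definitions makes the field unchecked if either side is.

```c
#include <assert.h>
#include <string.h>

enum { MAX_DEFS = 16 };

/* One entry per struct definition seen in some translation unit. */
struct struct_def {
    const char *tag;   /* struct tag, e.g. "foo" */
    const char *file;  /* file containing this definition */
    int parent;        /* union-find parent index */
    int wild;          /* 1 if a use forces field `a` to stay unchecked */
};

struct struct_def defs[MAX_DEFS];
int ndefs = 0;

void reset_defs(void) { ndefs = 0; }

int add_def(const char *tag, const char *file, int wild)
{
    assert(ndefs < MAX_DEFS);
    defs[ndefs].tag = tag;
    defs[ndefs].file = file;
    defs[ndefs].parent = ndefs;   /* starts as its own class */
    defs[ndefs].wild = wild;
    return ndefs++;
}

int find_root(int i)
{
    while (defs[i].parent != i)
        i = defs[i].parent;
    return i;
}

/* Merge two definitions; the merged class is wild if either was. */
void unify(int a, int b)
{
    int ra = find_root(a), rb = find_root(b);
    if (ra == rb)
        return;
    defs[rb].parent = ra;
    defs[ra].wild = defs[ra].wild || defs[rb].wild;
}

/* Option 1: treat struct tags as global, merging all same-named defs. */
void unify_by_tag(void)
{
    for (int i = 0; i < ndefs; i++)
        for (int j = i + 1; j < ndefs; j++)
            if (strcmp(defs[i].tag, defs[j].tag) == 0)
                unify(i, j);
}

/* How field `a` would be rewritten for definition i. */
const char *field_decl(int i)
{
    return defs[find_root(i)].wild ? "int *a" : "_Ptr<int> a";
}
```

In this sketch, option 1 corresponds to calling unify_by_tag() once after all files are processed, so a.c and b.c both end up with int *a even though nothing flows between them (the conservative outcome). Option 2 corresponds to calling unify(a, b) only when a call such as bar(p) in b2.c connects the two definitions, so unrelated a.c and b.c would keep their separate results.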

Labels: bug, macro, question