#include <stddef.h>
#include <stdint.h>

typedef struct raxNode {
    uint32_t iskey:1;
    uint32_t isnull:1;
    uint32_t iscompr:1;
    uint32_t size:29;
    unsigned char data[];
} raxNode;

typedef struct rax {
    raxNode *head;
    uint64_t numele;
    uint64_t numnodes;
} rax;

void *malloc(size_t size);

int main() {
    rax *rax = malloc(sizeof(*rax)); // PARSING ERROR HERE
    return 0;
}

GCC and Clang both accept this: the rax inside the sizeof refers to the local variable rax that is being declared and initialized, not to the typedef.
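The same scoping behaviour can be observed in a smaller sketch that does not depend on rax at all. The names big and size_seen_by_initializer are made up for illustration; the point is only that sizeof in the initializer measures the local variable (1 byte) rather than the 64-byte typedef of the same name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 64-byte type whose name is shadowed below. */
typedef struct big { char bytes[64]; } big;

/* At file scope, `big` still names the typedef. */
static_assert(sizeof(big) == 64, "big names the typedef at file scope");

size_t size_seen_by_initializer(void) {
    /* The local `big` is in scope as soon as its declarator ends, so
       sizeof(big) in the initializer is sizeof(char), not sizeof(struct big).
       sizeof does not evaluate its operand, so nothing uninitialized is read. */
    char big = (char)sizeof(big);
    return (size_t)big;
}
```

A parser that still treats big as a type name inside the initializer would compute 64 here, or fail to parse in cases like sizeof(*rax) where a type name is not valid.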
Cause
The parser fails because the lexer emits that inner rax as a NAMED_TYPE token instead of IDENT. This is a bug in our lexer hack: a local variable is added to the lexer hack table only after its entire declaration has been parsed, but the initializer in that declaration may already refer to the variable. While the initializer is being lexed, the table therefore still considers rax to refer to the typedef.
Specification
The relevant specification is hard to track down, but I believe it is 6.2.1 paragraph 7 in N1570:
Structure, union, and enumeration tags have scope that begins just after the appearance of the tag in a type specifier that declares the tag. Each enumeration constant has scope that begins just after the appearance of its defining enumerator in an enumerator list. Any other identifier has scope that begins just after the completion of its declarator.
The declarator is the *rax part of the declaration in the above example, so rax needs to be added to the lexer hack table as an identifier between the declarator and the initializer, not after the whole declaration.
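The difference between the two registration points can be sketched with a toy table. This is not our actual lexer code: the table layout and the names table_add, table_lookup and classify_in_initializer are hypothetical, and only the NAMED_TYPE and IDENT token kinds come from the description above.

```c
#include <string.h>

enum token_kind { IDENT, NAMED_TYPE };

struct entry { const char *name; enum token_kind kind; };

#define MAX_ENTRIES 16
struct table { struct entry entries[MAX_ENTRIES]; int count; };

static void table_add(struct table *t, const char *name, enum token_kind kind) {
    if (t->count < MAX_ENTRIES) {
        t->entries[t->count].name = name;
        t->entries[t->count].kind = kind;
        t->count++;
    }
}

/* The most recent entry wins, so later entries shadow earlier ones.
   Unknown names are plain identifiers. */
static enum token_kind table_lookup(const struct table *t, const char *name) {
    for (int i = t->count - 1; i >= 0; i--)
        if (strcmp(t->entries[i].name, name) == 0)
            return t->entries[i].kind;
    return IDENT;
}

/* Simulates lexing `rax *rax = malloc(sizeof(*rax));` and returns the token
   kind given to the `rax` inside the initializer. */
enum token_kind classify_in_initializer(int register_after_declarator) {
    struct table t = { .count = 0 };
    table_add(&t, "rax", NAMED_TYPE);      /* typedef struct rax { ... } rax; */

    if (register_after_declarator)
        table_add(&t, "rax", IDENT);       /* proposed: right after `*rax` */

    enum token_kind seen = table_lookup(&t, "rax");  /* lexing sizeof(*rax) */

    if (!register_after_declarator)
        table_add(&t, "rax", IDENT);       /* current: after the whole declaration */

    return seen;
}
```

With registration after the whole declaration the call returns NAMED_TYPE, which is the token that triggers the parse error; with registration right after the declarator it returns IDENT.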
Test case from goblint/bench#38 (comment)