How to treat macros in a safety-critical context? #59

@AlexCeleste

Description

During the meeting of 2025-04-23, the issue was raised of how, or whether, macros can be qualified.

A macro is written to process fragments of the target language. It is not (necessarily) a piece of code in the target language, and simply compiling or processing the macro itself does not generally prove that its output will be a well-formed fragment (a valid value); macro processing adds a phase to compilation. If a macro is exposed as part of an API at the boundary of the component to be qualified, this additional phase is exposed to qualification.

MISRA Precedent

Some precedent is provided by C and C++. The preprocessor used by both languages provides an extremely simple substitution macro mechanism that is not aware of the underlying language syntax at all, and can therefore be used to create or process fragments or outright invalid syntax. C++ adds a second, integrated macro language in the form of templates, which require the tokens in the templated block to conform to valid C++ syntax. Although the instantiated semantics of those tokens can change because of overloading and lookup rules, they cannot form fragments and a given token cannot change its syntactic meaning; instantiation can also happen only at specific points in the syntax and is not fully arbitrary. However, because identifier lookup is deferred, a template may still fail to compile when instantiated with particular types (e.g. a required member may be missing, which cannot be checked without instantiation).
Both of these macro systems are completely "pure" and only describe substitution-based transformations (either inline into the token stream or into a well-formed AST in instantiation-space). While C++ can run constexpr C++ code during this step, it is restricted to a pure subset.

MISRA C and C++ both mainly aim their guidance at final "programs" rather than at libraries, and more narrowly at translation units rather than at source- or header-based modules. The focus is therefore on analysis at the point of creating a binary artifact. A library that provides macros or templates has to do so through header files, which are processed afresh for each inclusion into a user TU, as part of that TU. This removes the need to consider macros separately from their use context: they do not exist at all in the binary interface and are always inlined as text into the client TU, where either they are expanded/instantiated and the result of expansion is analyzed, or they remain unused definitions. MISRA therefore does not concern itself with the definition of either macros or templates outside of a small number of rules aimed only at improving readability. It is not generally considered necessary to check the semantic validity of macro or template syntax because any problem will be exposed at the point of use.

The expansion of C preprocessor macros can also easily be directly examined because the clearly defined separate compilation phase can generally be run on its own, producing a single file describing the entirety of a TU to be compiled.

TL;DR: if you used a C macro or C++ template, the resulting expansion/instantiation can always be analyzed in a non-macro context. If you didn't use it, then no harm was done.

Rust considerations

Rust's declarative macros and proc-macros fall somewhere in between these systems. In particular, both restrict the syntactic positions from which a macro can be invoked, so they neither operate on nor produce fragments, although both work with fragments internally to varying degrees.

A Rust macro can be exported from a crate. This does not simply inline source code from the origin crate into the using crate in the way a C header does; instead the declarations are modular. The definitions are much more "part of" the originating crate than they are in C and C++, because no header-file-like mechanism is involved.
However, a similar consideration applies: a macro which is used will be expanded during compilation of the using crate, and the result of expansion will be checked in context by the compiler. A macro which is not expanded by the compiler did not impact the crate.
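A minimal sketch of the export side (the crate name util and the macro name square are made up for illustration):

```rust
// Sketch: the definition below travels with the defining crate's metadata
// rather than being re-included as text by each user, as a C header would be.
#[macro_export]
macro_rules! square {
    ($x:expr) => {
        $x * $x
    };
}

// A downstream crate would invoke it by path, e.g. `util::square!(side)`;
// the expansion is then type-checked while *that* crate is compiled.
fn main() {
    println!("{}", square!(3)); // expanded and checked here because it is used
}
```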

Declarative macros can contain some fragment-like elements which may expand to invalid code. These can, to an extent, be checked out of context by a tool like expandable.
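For example (a hypothetical macro), a definition can be accepted by the compiler on its own while only some invocations expand to valid code; this is the kind of out-of-context check a tool like expandable aims at:

```rust
// The definition itself compiles; whether an expansion is valid Rust depends
// on the tokens supplied at the call site.
macro_rules! twice {
    ($x:tt) => {
        $x + $x
    };
}

fn main() {
    let ok = twice!(2); // expands to `2 + 2`: fine
    println!("{}", ok);
    // twice!(struct);  // would expand to `struct + struct`: rejected only if used
}
```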

Procedural macros contain arbitrary Rust code which translates a TokenStream into a different TokenStream using the full power of the runtime language. They are constrained only by where a proc-macro invocation can appear in the first place, which is integrated into the syntax, and by the requirement that the result expand back into one of these syntactically constrained contexts.
Unlike C++ constexpr code, Rust proc-macros are not constrained in terms of effects and can access files etc., which adds an area of potential vulnerability. However, because they are defined in Rust code, they can be statically analyzed as Rust code without needing to be expanded in a usage context. Potential issues like "this macro performs filesystem operations" can be detected using the same machinery as static analysis of the main program.
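As a rough sketch (assuming a crate compiled as a proc-macro crate; the macro name answer is made up), a function-like proc-macro is ordinary Rust source and can be analyzed as such:

```rust
// lib.rs of a proc-macro crate. This body runs at compile time of downstream
// crates and is not restricted to a pure subset: it could, for instance, call
// std::fs. Static analysis of this source can flag such effects directly,
// without expanding the macro at a usage site.
use proc_macro::TokenStream;

#[proc_macro]
pub fn answer(_input: TokenStream) -> TokenStream {
    // A real macro would inspect `_input`; here we just emit a fixed item.
    "fn answer() -> u32 { 42 }".parse().unwrap()
}
```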

Unlike the C preprocessor, whose clearly defined, separate compilation phase can generally be run on its own, Rust macros are not intended to be treated as a distinct phase, and it is not recommended to try to create a fully expanded source file. Although some support for this exists under -Zunpretty=expanded / cargo expand, the output is not semantically correct because it does not capture proper hygienic macro variable lookup. The textual result ends up with "dynamic binding" semantics similar to C macros, which can differ from the behaviour of the compiled program (and is more likely to be a vulnerability precisely because Rust users expect hygienic macros to respect scope properly).
This slightly weakens one tool available for review.
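A small example of the mismatch (the macro name add_one is hypothetical): the hygienic result differs from what a naive reading of the flattened expansion suggests:

```rust
// The `x` bound inside the macro is hygienically distinct from the caller's `x`.
macro_rules! add_one {
    ($e:expr) => {{
        let x = 1;
        $e + x
    }};
}

fn main() {
    let x = 10;
    // Hygienic result: 11. The flat text `{ let x = 1; x + x }` that a tool
    // like cargo expand prints reads as if it evaluated to 2, C-macro style.
    println!("{}", add_one!(x));
}
```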

Because of this, useful analysis of code expanded from Rust macros needs to be integrated with the compiler, or applied against the output or an IR produced by the compiler, much like analysis of instantiated C++ templates.

TL;DR: unlike in C and C++, the less restricted class of Rust macros (procedural macros) is the easier class to analyze out of context, and it is possible to isolate specific undesirable behaviours using normal static analysis. The more restricted class (declarative macros) fundamentally shares with C and C++ the consideration that a macro which is not used is not harmful. However, there is a slightly higher barrier to analyzing the expanded code.
