Description
The new CBMC-SSM feature is described here https://dl.acm.org/doi/pdf/10.1145/3551349.3559523.
The main restriction is that the SSM associates a single shadow byte to each byte of the original program. As described in the paper this is already sufficient to implement a number of analyses such as race-detection, taint analysis, read-before-write analysis, etc.
Other applications would become possible if the SSM module allowed more than one byte of shadow memory per byte of user memory. Allowing up to 8 shadow bytes per byte unlocks the possiblity to store a pointer to an arbitrary struct in shadow memory for each byte of the original program. Rebasing a pointer to the shadow struct pointer would only require to translate the pointer by swap the object bits and apply a 1, 2 or 3 bits left shift the offset bits (i.e. no need for complex arithmetic).
Examples of analyses enabled by 8 shadow bytes per byte:
- Instrumenting stacked borrows rules for GOTO programs generated from Rust code
- 8 shadow bytes are used to store a pointer to a dynamic array representing the borrow stack for each memory location of the user program (cf https://github.com/remi-delmas-3000/stacked-borrows)
- Extending user types with arbitrary ghost fields to express richer program invariants.
- 8 shadow bytes are used to store a pointer to a shadow object that contains the ghost fields for the original object
- For contracts instrumentation, modelling pointer assumptions as ghost state, etc.
From an implementation perspective, it seems like the shadow memory system could be extended to be functionally equivalent to the C model available here https://github.com/remi-delmas-3000/shadow-map/blob/main/shadow_map.h.
Even with a library defined implementation like the one above, we could get decent performance if we could constant-propagate __CPROVER_POINTER_OBJECT(ptr)
and guarantee that field sensitivity is applied to the shadow memory maps during symbolic execution.
In addition if we allowed some sort of controlled iteration mechanism on shadow maps (such as iterating on all shadow objects that are effectively allocated) we could express global properties on the shadow state and model things like address reuse for malloc/free (see here, https://github.com/remi-delmas-3000/shadow-map/blob/main/address_reuse.c)