Skip to content

Relational Analyses: Deep invalidation for library functions havocs all variables (also those that don't have their address taken) #1535

@michael-schwarz

Description

@michael-schwarz

After investigating why almost all relational asserts we have shown for the ESOP paper are now out of reach, it seems it has to do with our invalidation mechanism.

We sometimes specify that things need to be written "deeply", i.e., every piece of memory that is reachable from somewhere needs to be invalidated. This is, e.g., the case for the first argument of fscanf.

When this is an unknown argument, such as stdin, our relational analysis responds by invalidating all relational information it has about any variable.

match reachables ask es with
| None ->
(* top reachable, so try to invalidate everything *)
RD.vars st.rel
|> List.filter_map RV.to_cil_varinfo
|> List.map Cil.var
| Some ad ->

This includes also variables which do not have their address taken and can thus never be pointed to by such pointers.
(N.B. There is a separate option InvalidateGlobals for calls to unknown functions that we assume may modify globals).

// SKIP PARAM: --set ana.activated[+] apron --set sem.int.signed_overflow assume_none
#include <pthread.h>
#include <goblint.h>
#include <stdio.h>

int debug;
int other;

int main() {
  int top;

  // Needed so Base & DefExc doesn't find this information because it invalidates less
  if(top) {
    debug = 3;
  }

  fscanf(stdin, "%d", &other);
 
   // Fails as debug is invalidated
  __goblint_check(debug <= 3);
  return 0;
}

Originally, these TopPointers were filtered out before they reached this analysis, but this has changed recently:
(c.f. #1142 and 83978e9).

As a consequence, our relational analyses are very prone to havoccing all information they track even upon very normal calls to library functions, that, e.g., print something to stdout.

Adding a filter |> List.filter (fun v -> v.vaddrof) above already restores some of the precision, but we should still aim for a more precise version (at least for sensible things such as writing to stdout or reading from stdin).

Interestingly, Base seems to take a more measured approach here, and succeeds in showing these things.

On top of being crucial for my thesis, this might also be of concern for our witness exchange project with Freiburg.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions