-
Notifications
You must be signed in to change notification settings - Fork 7
Description
git analysis was previously discussed in #14, but since we now tried harder, I'm extracting that to this separate issue for better organization.
Below are comments from there with relevant analysis attempts.
@karoliineh: #14 (comment)
I tried to analyze the git repository, once again. I tried it with generating a compilation database with bear. As there are a lot of different main functions in the project, @sim642 suggested that one might try to compile the commands separately. Therefore I tried analyzing the following 3 commands:
Just for fun, the first thing that came to mind:
NO_OPENSSL=1 NO_CURL=1 bear -- make git-cherry-pick
goblint compile_commands.json
Actually has pthread_create:
NO_OPENSSL=1 NO_CURL=1 bear -- make -j 11 git-grep
goblint compile_commands.json
Maybe analyzing some of the "simplest" commands with the simplest conf won't run out of memory?:
NO_OPENSSL=1 NO_CURL=1 bear -- make -j 11 git-add
goblint compile_commands.json --conf /analyzer/conf/examples/large-program.json
However, in all of my attempted runs, Goblint got killed (most possibly due to out-of-memory). @sim642 suggested that maybe one could try and run it on a server with more resources.
@sim642: #14 (comment)
I tried this overnight on our server with 32GB RAM:
NO_OPENSSL=1 NO_CURL=1 bear -- make -j 15 git-grep
goblint --conf examples/large-program.json compile_commands.json It mysteriously died after 3h20min: no "Killed", exit code 0, just stopped with no explanation.
Today I tried it again with some profiling and it crashed after the same time, this time explicitly "Killed". So feel free to try it on your own but I don't have too high expectations. At the moment of death Goblint was at this point:
runtime: 03:20:37.661
vars: 7080440, evals: 29235491
|rho|=7080440
|stable|=6572163
|infl|=7078192
|wpoint|=74592
|sides|=424929
|side_dep|=0
|side_infl|=0
|var_messages|=0
|rho_write|=0
|dep|=6226162
|called|=228
Found 415057 contexts for 1822 functions. Top 5 functions:
29556 contexts for L:entry state of vsnprintf (5042244) on /usr/include/x86_64-linux-gnu/bits/stdio2.h:81:1-87:1
24175 contexts for L:entry state of strbuf_vaddf (11685) on git/strbuf.c:393:1-412:1
18211 contexts for L:entry state of memcpy (5042287) on /usr/include/x86_64-linux-gnu/bits/string_fortified.h:25:1-31:1
16294 contexts for L:entry state of append_quoted_string (5089741) on git/json-writer.c:19:1-45:1
12448 contexts for L:entry state of jw_object_string (2135970) on git/json-writer.c:158:1-162:1
Memory statistics: total=21971776.09MB, max=35359.31MB, minor=21933689.68MB, major=401256.95MB, promoted=363170.54MB
minor collections=10459247 major collections=121 compactions=0
Even with the large-program conf which has almost no analyses and nothing but pointers in contexts, there are still tens of thousands of contexts for some functions. This is with earlyglobs where the globals aren't in the local states/contexts either.
According to the profiling, over 1h of the runtime was just spent on some silly die function (which can recurse!). In fact, there seems to be a lot of recursion in git. I also tried with --enable ana.context.widen and it seemed to make things even worse by never even getting out of analyzing some very first initialization function.
There must be some stupidity happening in Goblint if the majority of time is spent analyzing error cases where git dies instead of the actual meat of git.