Don't generate unnecessary fresh symbols for the GOTO trace #7021
base: develop
Conversation
Codecov Report
Base: 78.26% // Head: 77.86% // Decreases project coverage by -0.40%.
Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #7021     +/-   ##
===========================================
- Coverage    78.26%   77.86%    -0.40%
===========================================
  Files         1642     1569       -73
  Lines       189830   180969     -8861
===========================================
- Hits        148568   140913     -7655
+ Misses       41262    40056     -1206
☔ View full report at Codecov.
Force-pushed from a0295ee to 537ebd6 (Compare)
Force-pushed from 537ebd6 to 5d892f4 (Compare)
Avoid creating equalities over the postponed bitvector when the object literals trivially aren't equal, and directly encode bitwise equality when the object literals are trivially equal (and stop searching for a matching object). In all other cases, avoid unnecessary Tseitin variables to encode the postponed bitvector equality.

When running on various proofs done for AWS open-source projects, this changes the performance as follows (when comparing to diffblue#7021): with CaDiCaL as back-end, the total solver time for the hardest 46 proofs changes from 26779.7 to 22409.9 seconds (4369.8 seconds speed-up); with Minisat, however, the hardest 49 proofs take 28616.7 instead of 28420.4 seconds (196.3 seconds slow-down). Across these benchmarks, 11.7% of variables and 12.8% of clauses are removed.
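The three cases above follow a fairly standard SAT-encoding pattern. Below is a minimal, self-contained sketch of the general idea, not CBMC's actual bv_pointers/propt code: when the object comparison is not constant, the constraint "guard implies bitwise equality" can be emitted as two clauses per bit without any auxiliary Tseitin variable; when the object literals are constant, the constraint either disappears entirely (trivially unequal) or degenerates to direct bitwise equality (trivially equal).

```cpp
// Minimal sketch, not CBMC's actual code: encode "guard => (a == b)" over
// bitvectors with two clauses per bit and no auxiliary Tseitin variables.
// Literals are DIMACS-style ints (-v is the negation of v); the clause
// database is a plain vector standing in for the SAT solver interface.
#include <cassert>
#include <cstdlib>
#include <vector>

using Lit = int;
using Clause = std::vector<Lit>;
using BV = std::vector<Lit>; // bitvector as a vector of literals

static std::vector<Clause> cnf;

static void add_clause(Clause c)
{
  cnf.push_back(std::move(c));
}

// guard => (a == b), bit by bit:
// for every bit i emit (!guard | !a_i | b_i) and (!guard | a_i | !b_i).
static void implies_bitwise_equal(Lit guard, const BV &a, const BV &b)
{
  assert(a.size() == b.size());
  for(std::size_t i = 0; i < a.size(); ++i)
  {
    add_clause({-guard, -a[i], b[i]});
    add_clause({-guard, a[i], -b[i]});
  }
}

int main()
{
  // Usage sketch: variable 1 guards the equality of two 3-bit vectors.
  implies_bitwise_equal(1, {2, 3, 4}, {5, 6, 7});
  return cnf.size() == 6 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A Tseitin encoding of the same constraint would first introduce a fresh variable per bit for `a_i <-> b_i` plus one for their conjunction before asserting the implication; skipping those variables, and skipping the loop entirely in the constant cases, is the kind of saving the commit message quantifies.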
This was previously disabled as it appeared to degrade performance. New benchmarking, however, suggests considerable performance improvement.

When running on various proofs done for AWS open-source projects, this changes the performance as follows (when comparing to diffblue#7021): with CaDiCaL as back-end, the total solver time for the hardest 46 proofs changes from 26779.7 to 24472.6 seconds (2307.1 seconds speed-up); with Minisat, the hardest 49 proofs take 18541.2 instead of 28420.4 seconds (9879.2 seconds speed-up). Across these benchmarks, 1.0% of variables and 3.2% of clauses are removed.
Force-pushed from 5d892f4 to 30c9bce (Compare)
I love the performance graphs, but please can we have the diagonal line so it is easy to see better/worse.
It is a little hard to follow how the changes implement what is described in the PR.
This was previously disabled as it appeared to degrade performance. New benchmarking, however, suggests considerable performance improvement.

When running on various proofs done for AWS open-source projects, this changes the performance as follows (when comparing to diffblue#7021): with CaDiCaL as back-end, the total solver time for the hardest 46 proofs changes from 26779.7 to 24472.6 seconds (2307.1 seconds speed-up); with Minisat, the hardest 49 proofs take 18541.2 instead of 28420.4 seconds (9879.2 seconds speed-up). Across these benchmarks, 1.0% of variables and 3.2% of clauses are removed.

INCLUDE_REDUNDANT_CLAUSES is not enabled: while this would yield a further speed-up of 1080.4 seconds with CaDiCaL, it slows down Minisat by 4440.6 seconds on the above benchmark set.
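For context on that last trade-off: "redundant" clauses in this sense are clauses already implied by an encoding that are added anyway because they let the SAT solver propagate earlier. The sketch below shows the textbook example for a multiplexer z = s ? a : b; whether these are the exact clauses guarded by CBMC's INCLUDE_REDUNDANT_CLAUSES macro is an assumption here, as the macro is only named in the commit message.

```cpp
// Illustration only (hypothetical helpers, DIMACS-style literals): the core
// Tseitin clauses for z <-> (s ? a : b), plus two clauses that are implied by
// them but let the solver derive z as soon as a and b agree, without first
// deciding s.
#include <cstdlib>
#include <vector>

using Lit = int;
using Clause = std::vector<Lit>;

static std::vector<Clause> cnf;

static void add_clause(Clause c)
{
  cnf.push_back(std::move(c));
}

static void encode_mux(Lit z, Lit s, Lit a, Lit b, bool include_redundant)
{
  add_clause({-s, -a, z});   //  s &  a =>  z
  add_clause({-s, a, -z});   //  s &  z =>  a
  add_clause({s, -b, z});    // !s &  b =>  z
  add_clause({s, b, -z});    // !s &  z =>  b

  if(include_redundant)
  {
    add_clause({-a, -b, z}); //  a &  b =>  z  (implied, helps propagation)
    add_clause({a, b, -z});  // !a & !b => !z  (implied, helps propagation)
  }
}

int main()
{
  encode_mux(4, 1, 2, 3, true);
  return cnf.size() == 6 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

Such clauses enlarge the formula, which fits the measurement above: one back-end (CaDiCaL) benefits from the extra propagation while another (MiniSat) is slowed down by the larger CNF.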
This is not a good idea. I'd first like to understand how the speedup works. And then I'd like to push that change into the decision procedure. The user of the decision procedure (here symex) shouldn't have to care about encoding tricks that are this low level.
Force-pushed from 57d6542 to 81afb57 (Compare)
Force-pushed from 81afb57 to c7f67c6 (Compare)
We can safely record the values of expressions by adding `expr == expr` as constraints in order to be able to fetch and display them in the GOTO trace. This was already being done for declarations. Introducing new symbols just adds unnecessary variables to the formula.

When running on various proofs done for AWS open-source projects, this changes the performance as follows: with CaDiCaL as back-end, the total solver time for the hardest 46 proofs changes from 26546.5 to 26779.7 seconds (233.2 seconds slow-down); with Minisat, however, the hardest 49 proofs take 28420.4 instead of 32387.2 seconds (3966.8 seconds speed-up). Across these benchmarks, 1.7% of variables and 0.6% of clauses are removed.
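To make the mechanism concrete, here is a hedged sketch of the before/after shape of the change, written against CBMC-style interfaces (`exprt`, `equal_exprt`, `symbol_exprt`, `decision_proceduret::set_to_true`/`get`). The `record_for_trace_*` helpers and the exact header paths are illustrative assumptions, not code from this PR.

```cpp
// Hedged sketch of the before/after shape described above, using CBMC-style
// types; helper names and header paths are assumptions for illustration.
#include <solvers/decision_procedure.h> // decision_proceduret (path assumed)
#include <util/std_expr.h>              // equal_exprt, symbol_exprt

// Before: introduce a fresh symbol per traced expression.  This adds a
// variable (plus its defining constraint) to the formula just to make the
// value retrievable.
static void record_for_trace_with_fresh_symbol(
  decision_proceduret &solver,
  const exprt &expr,
  const irep_idt &fresh_name)
{
  const symbol_exprt fresh{fresh_name, expr.type()};
  solver.set_to_true(equal_exprt{fresh, expr}); // fresh == expr
  // later: solver.get(fresh) yields the value for the GOTO trace
}

// After: hand the expression itself to the solver as the tautology
// expr == expr.  Its value can still be fetched, but no fresh variable is
// created; this is the same trick already used for declarations.
static void record_for_trace_without_fresh_symbol(
  decision_proceduret &solver,
  const exprt &expr)
{
  solver.set_to_true(equal_exprt{expr, expr}); // expr == expr
  // later: solver.get(expr) yields the value for the GOTO trace
}
```

Either way the solver sees the traced expression and assigns it a model value that `get()` can report; the difference is that the second variant adds no fresh symbol, which is where the variable reduction reported above comes from.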