Skip to content

[RFC] Two new failing tests demonstrating updates outside member bounds #6101

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 3 commits into
base: develop
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 24 additions & 0 deletions regression/cbmc/member_bounds1/main.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
#include <assert.h>

#pragma pack(push, 1)
struct S
{
int a[2];
int x;
};
#pragma pack(pop)

#ifdef _MSC_VER
# define _Static_assert(x, m) static_assert(x, m)
#endif

int main()
{
int A[3];
_Static_assert(sizeof(A) == sizeof(struct S), "");
struct S *s = A;
s->a[2] = 42;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure this is undefined behaviour...

assert(*((int *)s + 2) == 42);
assert(s->x == 42);
assert(A[2] == 42);
}
13 changes: 13 additions & 0 deletions regression/cbmc/member_bounds1/test.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
CORE broken-smt-backend
main.c
--no-simplify --symex-no-member-bounds
^VERIFICATION SUCCESSFUL$
^EXIT=0$
^SIGNAL=0$
--
^warning: ignoring
--
This test passes when using the simplifier, but results in "WITH" expressions
that update elements outside their bounds without the simplifier.
--symex-no-member-bounds ensures that such expressions are rewritten by
goto-symex.
19 changes: 19 additions & 0 deletions regression/cbmc/member_bounds2/main.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
#include <assert.h>

#pragma pack(push, 1)
struct SU
{
union {
int a[2];
} u;
int x;
};
#pragma pack(pop)

int main()
{
struct SU su;
su.u.a[2] = 42;
assert(*((int *)&su + 2) == 42);
assert(su.x == 42);
}
13 changes: 13 additions & 0 deletions regression/cbmc/member_bounds2/test.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
KNOWNBUG
main.c

^VERIFICATION SUCCESSFUL$
^EXIT=0$
^SIGNAL=0$
--
^warning: ignoring
--
Field sensitivity makes it (currently) impossible to update bytes outside the
particular field. One possible way to address this is rewriting subscripted
array accesses to pointer dereferencing, as the C standard describes. Here, this
would amount to using *(&su.u.a[0] + 2) instead of su.u.a[2].
2 changes: 1 addition & 1 deletion scripts/expected_doxygen_warnings.txt
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
warning: Include graph for 'goto_instrument_parse_options.cpp' not generated, too many nodes (93), threshold is 60. Consider increasing DOT_GRAPH_MAX_NODES.
warning: Included by graph for 'goto_model.h' not generated, too many nodes (95), threshold is 60. Consider increasing DOT_GRAPH_MAX_NODES.
warning: Included by graph for 'arith_tools.h' not generated, too many nodes (167), threshold is 60. Consider increasing DOT_GRAPH_MAX_NODES.
warning: Included by graph for 'arith_tools.h' not generated, too many nodes (168), threshold is 60. Consider increasing DOT_GRAPH_MAX_NODES.
warning: Included by graph for 'c_types.h' not generated, too many nodes (128), threshold is 60. Consider increasing DOT_GRAPH_MAX_NODES.
warning: Included by graph for 'config.h' not generated, too many nodes (78), threshold is 60. Consider increasing DOT_GRAPH_MAX_NODES.
warning: Included by graph for 'invariant.h' not generated, too many nodes (172), threshold is 60. Consider increasing DOT_GRAPH_MAX_NODES.
Expand Down
3 changes: 3 additions & 0 deletions src/cbmc/cbmc_parse_options.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -345,6 +345,9 @@ void cbmc_parse_optionst::get_command_line_options(optionst &options)
options.set_option(
"symex-cache-dereferences", cmdline.isset("symex-cache-dereferences"));

options.set_option(
"symex-no-member-bounds", cmdline.isset("symex-no-member-bounds"));

if(cmdline.isset("incremental-loop"))
{
options.set_option(
Expand Down
2 changes: 2 additions & 0 deletions src/goto-checker/bmc_util.h
Original file line number Diff line number Diff line change
Expand Up @@ -197,6 +197,7 @@ void run_property_decider(
"(unwind-max):" \
"(ignore-properties-before-unwind-min)" \
"(symex-cache-dereferences)" \
"(symex-no-member-bounds)" \

#define HELP_BMC \
" --paths [strategy] explore paths one at a time\n" \
Expand Down Expand Up @@ -252,6 +253,7 @@ void run_property_decider(
" gets blacklisted\n" \
" --graphml-witness filename write the witness in GraphML format to filename\n" /* NOLINT(*) */ \
" --symex-cache-dereferences enable caching of repeated dereferences" \
" --symex-no-member-bounds support updates beyond bounds of a compound member" /* NOLINT(*) */ \
// clang-format on

#endif // CPROVER_GOTO_CHECKER_BMC_UTIL_H
171 changes: 162 additions & 9 deletions src/goto-symex/symex_assign.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -11,12 +11,14 @@ Author: Daniel Kroening, [email protected]

#include "symex_assign.h"

#include "expr_skeleton.h"
#include "goto_symex_state.h"
#include <util/arith_tools.h>
#include <util/byte_operators.h>
#include <util/expr_util.h>
#include <util/pointer_offset_size.h>
#include <util/range.h>

#include "expr_skeleton.h"
#include "goto_symex_state.h"
#include "symex_config.h"

// We can either use with_exprt or update_exprt when building expressions that
Expand Down Expand Up @@ -253,11 +255,100 @@ void symex_assignt::assign_array(
{
const exprt &lhs_array=lhs.array();
const exprt &lhs_index=lhs.index();
const typet &lhs_index_type = lhs_array.type();
const array_typet &lhs_array_type = to_array_type(lhs_array.type());

const bool may_be_out_of_bounds =
symex_config.updates_across_member_bounds &&
[&lhs_index, &lhs_array_type]() {
if(
lhs_index.id() != ID_constant ||
lhs_array_type.size().id() != ID_constant)
{
return true;
}

const auto lhs_index_int =
numeric_cast_v<mp_integer>(to_constant_expr(lhs_index));

return lhs_index_int < 0 ||
lhs_index_int >= numeric_cast_v<mp_integer>(
to_constant_expr(lhs_array_type.size()));
}();

if(
may_be_out_of_bounds &&
(lhs_array.id() == ID_member || lhs_array.id() == ID_index))
{
const auto subtype_bytes_opt = size_of_expr(lhs_array.type().subtype(), ns);
CHECK_RETURN(subtype_bytes_opt.has_value());

PRECONDITION(lhs_index_type.id() == ID_array);
exprt new_offset = mult_exprt{
lhs_index,
typecast_exprt::conditional_cast(*subtype_bytes_opt, lhs_index.type())};

if(use_update)
exprt parent;
if(lhs_array.id() == ID_member)
{
const member_exprt &member = to_member_expr(lhs_array);
const auto member_offset = member_offset_expr(member, ns);
CHECK_RETURN(member_offset.has_value());

parent = member.compound();
new_offset = plus_exprt{
typecast_exprt::conditional_cast(*member_offset, new_offset.type()),
new_offset};
}
else
{
const index_exprt &index = to_index_expr(lhs_array);

const auto element_size_opt =
size_of_expr(to_array_type(index.array().type()).subtype(), ns);
CHECK_RETURN(element_size_opt.has_value());

parent = index.array();
new_offset =
plus_exprt{mult_exprt{typecast_exprt::conditional_cast(
*element_size_opt, new_offset.type()),
typecast_exprt::conditional_cast(
index.index(), new_offset.type())},
new_offset};
}

if(symex_config.simplify_opt)
simplify(new_offset, ns);

const byte_update_exprt new_rhs = make_byte_update(parent, new_offset, rhs);
const expr_skeletont new_skeleton =
full_lhs.compose(expr_skeletont::remove_op0(lhs_array));
assign_rec(parent, new_skeleton, new_rhs, guard);
}
else if(
may_be_out_of_bounds && (lhs_array.id() == ID_byte_extract_big_endian ||
lhs_array.id() == ID_byte_extract_little_endian))
{
const byte_extract_exprt &byte_extract = to_byte_extract_expr(lhs_array);

const auto subtype_bytes_opt = size_of_expr(lhs_array.type().subtype(), ns);
CHECK_RETURN(subtype_bytes_opt.has_value());

exprt new_offset =
plus_exprt{mult_exprt{lhs_index,
typecast_exprt::conditional_cast(
*subtype_bytes_opt, lhs_index.type())},
typecast_exprt::conditional_cast(
byte_extract.offset(), lhs_index.type())};

if(symex_config.simplify_opt)
simplify(new_offset, ns);

byte_extract_exprt new_lhs = byte_extract;
new_lhs.offset() = new_offset;
new_lhs.type() = rhs.type();

assign_rec(new_lhs, full_lhs, rhs, guard);
}
else if(use_update)
{
// turn
// a[i]=e
Expand Down Expand Up @@ -378,8 +469,70 @@ void symex_assignt::assign_byte_extract(
else
UNREACHABLE;

const byte_update_exprt new_rhs{byte_update_id, lhs.op(), lhs.offset(), rhs};
const expr_skeletont new_skeleton =
full_lhs.compose(expr_skeletont::remove_op0(lhs));
assign_rec(lhs.op(), new_skeleton, new_rhs, guard);
const bool may_be_out_of_bounds =
symex_config.updates_across_member_bounds && [&lhs, this]() {
if(lhs.offset().id() != ID_constant)
return true;
const auto extract_size_opt = pointer_offset_size(lhs.type(), ns);
if(!extract_size_opt.has_value())
return true;
const auto object_size_opt = pointer_offset_size(lhs.op().type(), ns);
if(!object_size_opt.has_value())
return true;
const auto lhs_offset_int =
numeric_cast_v<mp_integer>(to_constant_expr(lhs.offset()));
return lhs_offset_int < 0 ||
lhs_offset_int + *extract_size_opt > *object_size_opt;
}();

if(
may_be_out_of_bounds &&
(lhs.op().id() == ID_member || lhs.op().id() == ID_index))
{
exprt new_offset = lhs.offset();
exprt parent;
if(lhs.op().id() == ID_member)
{
const member_exprt &member = to_member_expr(lhs.op());
const auto member_offset = member_offset_expr(member, ns);
CHECK_RETURN(member_offset.has_value());

parent = member.compound();
new_offset = plus_exprt{
typecast_exprt::conditional_cast(*member_offset, new_offset.type()),
new_offset};
}
else
{
const index_exprt &index = to_index_expr(lhs.op());

const auto element_size_opt =
size_of_expr(to_array_type(index.array().type()).subtype(), ns);
CHECK_RETURN(element_size_opt.has_value());

parent = index.array();
new_offset =
plus_exprt{mult_exprt{typecast_exprt::conditional_cast(
*element_size_opt, new_offset.type()),
typecast_exprt::conditional_cast(
index.index(), new_offset.type())},
new_offset};
}

if(symex_config.simplify_opt)
simplify(new_offset, ns);

const byte_update_exprt new_rhs{byte_update_id, parent, new_offset, rhs};
const expr_skeletont new_skeleton =
full_lhs.compose(expr_skeletont::remove_op0(lhs.op()));
assign_rec(parent, new_skeleton, new_rhs, guard);
}
else
{
const byte_update_exprt new_rhs{
byte_update_id, lhs.op(), lhs.offset(), rhs};
const expr_skeletont new_skeleton =
full_lhs.compose(expr_skeletont::remove_op0(lhs));
assign_rec(lhs.op(), new_skeleton, new_rhs, guard);
}
}
4 changes: 4 additions & 0 deletions src/goto-symex/symex_config.h
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,10 @@ struct symex_configt final
/// Used in goto_symext::dereference_rec
bool cache_dereferences;

/// Handle assignments beyond the bounds prescribed by a struct or array
/// access path. See tests in regression/cbmc/member_bounds* for examples.
bool updates_across_member_bounds;

/// \brief Construct a symex_configt using options specified in an
/// \ref optionst
explicit symex_configt(const optionst &options);
Expand Down
4 changes: 3 additions & 1 deletion src/goto-symex/symex_main.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,9 @@ symex_configt::symex_configt(const optionst &options)
: DEFAULT_MAX_FIELD_SENSITIVITY_ARRAY_SIZE),
complexity_limits_active(
options.get_signed_int_option("symex-complexity-limit") > 0),
cache_dereferences{options.get_bool_option("symex-cache-dereferences")}
cache_dereferences{options.get_bool_option("symex-cache-dereferences")},
updates_across_member_bounds{
options.get_bool_option("symex-no-member-bounds")}
{
}

Expand Down
6 changes: 1 addition & 5 deletions src/solvers/lowering/byte_operators.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2057,11 +2057,7 @@ static exprt lower_byte_update(
subtype = to_array_type(src.type()).subtype();

auto element_width = pointer_offset_bits(*subtype, ns);
CHECK_RETURN(element_width.has_value());
CHECK_RETURN(*element_width > 0);
CHECK_RETURN(*element_width % 8 == 0);

if(*element_width == 8)
if(element_width.has_value() && *element_width == 8)
{
if(value_as_byte_array.id() != ID_array)
{
Expand Down