Skip to content

Commit 57b6c13

Browse files
committed
Merge branch 'main' of github-toolchainz-ssh:toolCHAINZ/crackers into ablation_a
2 parents 5d65748 + af520d4 commit 57b6c13

File tree

6 files changed

+88
-142
lines changed

6 files changed

+88
-142
lines changed

README.md

Lines changed: 10 additions & 108 deletions
Original file line numberDiff line numberDiff line change
@@ -97,11 +97,20 @@ min = 0x7fffffffde00
9797
max = 0x7ffffffff000
9898
```
9999

100+
A successful synthesis will print out a listing of the gadgets were selected.
101+
100102
## Library Usage
101103

102-
`crackers` can also be used as a library. All of the above settings from the config correspond
104+
`crackers` intended mode of use is as a library. All of the above settings from the config correspond
103105
to settings that can be set programmatically by API consumers.
104106

107+
When using the API, rather than getting a listing of gadgets as an output, you get a model of the synthesized chain.
108+
This model of the chain includes information about what gadgets were selected as well as a Z3 `Model` representing the
109+
memory at all states of execution in the gadget chain. This model can be queried to derive the memory conditions
110+
necessary to execute the chain.
111+
112+
### Constraints
113+
105114
Constraints work a little differently with the API. Instead of specifying registers and register equality,
106115
`crackers` allows consumers to provide a closure of the following types:
107116

@@ -128,110 +137,3 @@ The second type is used for asserting read/write invariants. These functions tak
128137
the bitvector corresponding to the read/write address, as well as the state the read/write is being performed on.
129138
Any time a chain reads or writes from memory, the procedure will automatically call these functions and assert the returned
130139
booleans. This can allow for setting safe/unsafe ranges of memory or even the register space.
131-
132-
## How it Works (Roughly)
133-
134-
### Library generation
135-
136-
A library is taken in and parsed. The executable sections are identified. For every executable section,
137-
we attempt disassembly at every byte offset. If disassembly succeeds and returns a terminating basic block
138-
within N instructions (where N is set in the config), then we call that a gadget and save it.
139-
140-
### Candidate selection
141-
142-
`crackers` works by taking in an "example" computation and synthesizing a chain that is compatible with the example.
143-
So for instance, if you want to call execve on linux, your example computation might look like this:
144-
145-
```
146-
00000000 4889c7 mov rdi, rax
147-
00000003 48c7c03b000000 mov rax, 0x3b
148-
0000000a 48c7c600000000 mov rsi, 0x0
149-
00000011 48c7c200000000 mov rdx, 0x0
150-
00000018 0f05 syscall
151-
```
152-
153-
This computation sets `rax`, `rsi`, and `rdx` to set values, and `rdi` to some indeterminate value, which we will
154-
constrain later.
155-
156-
For each instruction in this computation, we identify a set of "gadget candidates". These candidates are selected out of
157-
the library we assembled. To be a candidate for an instruction a gadget must pass the following checks:
158-
* If the specification instruction contains a jump, the gadget must have terminating control flow that is capable of branching
159-
to the same destination.
160-
* The gadget must write to every direct address that the specification instruction writes to.
161-
* For every indirect access, the gadget must also make an indirect access using the same pointer storage, of at least
162-
as many bytes.
163-
* Taken in isolation, the gadget must be able to have the same effect as the specification instruction:
164-
(e.g. `mov eax, ebx` can stand in for `mov eax, 0`, but `mov eax, 2` cannot).
165-
166-
167-
### Decision Loop
168-
169-
The overall flow of the procedure is as follows:
170-
171-
* Ask the assignment problem for an assignment
172-
* If it returns UNSAT, then no possible assignment exists (under the given parameters) and we return
173-
* If it returns SAT, then we send that assignment to the theory solver
174-
* If the PCODE theory solver returns SAT, we have a valid chain
175-
* If the PCODE theory solver returns UNSAT, it also provides a set of conflict clauses identifying
176-
which `decisions` participated in the UNSAT proof. These clauses are communicated back to the assignment
177-
problem to allow it to outlaw that combination of `decisions`.
178-
179-
We introduce parallelism to this workflow by running the PCODE theory solvers in threads and generating multiple
180-
unique assignments for each worker to solve in parallel.
181-
182-
A description of the assignment problem and the theory problem follow:
183-
### Assignment Problem Setup
184-
185-
Once all the candidates have been found for all instructions, we check and make sure that we have found
186-
at least one candidate for each. If any instruction has no candidates, then we immediately return UNSAT. This
187-
usually indicates that we just did not sample a gadget that touches the needed memory.
188-
189-
For each spec instruction, each candidate is assigned an index. The same gadget can exist as a candidate for multiple indices and
190-
each copy is treated as logically separate from each other.
191-
192-
Using these indices, we construct a simple boolean SAT problem:
193-
* We define a `decision` as a tuple `(i: usize, c: usize)`, indicating that index `i` is using choice `c`. A `decision`
194-
uniquely identifies a given gadget being used in a given slot.
195-
* Each decision is mapped to a Z3 Bool.
196-
* We then construct a boolean SAT problem using these variables with the following constraints:
197-
* For all indices `i`:
198-
* We make exactly 1 choice `c` (e.g. for every `i`, one AND ONLY ONE `decision` with matching `i` must be true)
199-
* In the case of the `optimize` solver, we additionally impose a penalty on every `decision`, proportional to the
200-
number of instructions in the gadget. This pressures the solver into selecting the shortest gadgets that it can.
201-
202-
### PCODE Theory Problem Setup
203-
204-
This procedure runs operates on a single assignment of gadgets. This assignment is evaluated against
205-
the specification computation, as well as any provided constraints.
206-
207-
First, we form the specification computation into a trace, by asserting state equality between the end state
208-
of every instruction and the beginning state of its successor.
209-
210-
Then, we do the same for our assignment of gadgets. We tag these assertions as being `memory` assertions.
211-
212-
We assert all preconditions on the initial state of the gadget chain, and all postconditions on the final state of the gadget chain.
213-
We tag these as `constraint` assertions.
214-
215-
For every instruction `i` and its corresponding gadget `g`:
216-
* Assert that for all `v` in `output(i)`: `g[v]` = `i[v]`.
217-
* If `i` has control flow, assert that the control flow of `g` branches to the same destination as `i`
218-
* These are tagged as `semantic` assertions.
219-
220-
For every gadget `g1` and its successor `g2`:
221-
* Assert that the address of `g2` is the jump target of `g1`. These are tagged as `branch` assertions. The conflict
222-
associated with a `branch` assertions only references `g1` instead of the conjunction of `g1` and `g2` because,
223-
as a heuristic, when `g1` is unable to branch to `g2` it is almost always because of some conflict in `g1`, and not
224-
something about the address of `g2`
225-
226-
We give all these assertions to `z3` using the `assert_and_track` API, which makes Z3 express the unsat core
227-
in terms of booleans representing our varying assertions.
228-
229-
If z3 comes back with SAT, then the chain assignment is valid.
230-
231-
If it comes back UNSAT, then we analyze the UNSAT CORE:
232-
233-
* If the UNSAT core is composed only of `memory` and `constraint` assertions, then, as the formulae are currently tracked,
234-
we have no way to make strong conflicts ouf of this core. As a fallback, we simply return a clause outlawing the complete
235-
assignment.
236-
* Otherwise, we form a conjunction of all participating `decisions` for all `branch` and `semantic` conflicts
237-
and return that to the assignment problem.

libc_execve.toml

Lines changed: 1 addition & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ log_level = "INFO"
55
[synthesis]
66
strategy = "sat"
77
max_candidates_per_slot = 200
8-
parallel = 7
8+
parallel = 1
99

1010
[specification]
1111
path = "bin/execve.o"
@@ -33,22 +33,5 @@ R13 = 0x0
3333
R14 = 0x4ae018
3434
R15 = 0x400538
3535

36-
[constraint.postcondition.register]
37-
RAX = 0x3b
38-
RSI = 0
39-
RDX = 0
40-
4136
[constraint.postcondition.pointer]
4237
RDI = "/bin/sh"
43-
44-
[[constraint.pointer.read]]
45-
min = 0x7ffffffde000
46-
max = 0x7ffffffff000
47-
48-
[[constraint.pointer.write]]
49-
min = 0x7ffffffde000
50-
max = 0x7ffffffff000
51-
52-
[[constraint.pointer.write]]
53-
min = 0x9d000
54-
max = 0x9d060

mprotect.toml

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
[meta]
2+
log_level = "DEBUG"
3+
4+
[specification]
5+
max_instructions = 4
6+
path = "bin/execve.o"
7+
8+
[library]
9+
max_gadget_length = 10
10+
path = "bin/libc_wrapper"
11+
12+
[sleigh]
13+
ghidra_path = "/Applications/ghidra"
14+
15+
[synthesis]
16+
strategy = "sat"
17+
max_candidates_per_slot = 500
18+
parallel = 1
19+
combine_instructions = false
20+
21+
[constraint.precondition.register]
22+
RAX = 0
23+
RCX = 0x440f30
24+
RDX = 0x7fffffffe608
25+
RBX = 0x400538
26+
RSP = 0x7fffffffe3b8
27+
RBP = 0x403af0
28+
RSI = 0x7fffffffe5f8
29+
RDI = 1
30+
R8 = 0
31+
R9 = 6
32+
R10 = 0x36f8
33+
R11 = 0x206
34+
R12 = 0x403b90
35+
R13 = 0x0
36+
R14 = 0x4ae018
37+
R15 = 0x400538
38+
39+
[constraint.postcondition.register]
40+
RDI = 0xdeadbeef
41+
RSI = 0x40
42+
RDX = 0x7b
43+
44+
[[constraint.pointer.read]]
45+
min = 0x7ffffffde000
46+
max = 0x7ffffffff000
47+
48+
[[constraint.pointer.write]]
49+
min = 0x7ffffffde000
50+
max = 0x7ffffffff000

src/bin/crackers/main.rs

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -58,6 +58,7 @@ fn new(path: PathBuf) -> anyhow::Result<()> {
5858
specification: SpecificationConfig {
5959
path: "spec.o".to_string(),
6060
max_instructions: 1,
61+
base_address: None,
6162
},
6263
library: Default::default(),
6364
sleigh: SleighConfig {

src/config/constraint.rs

Lines changed: 17 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,12 @@
11
use crate::error::CrackersError;
22
use crate::synthesis::builder::{StateConstraintGenerator, TransitionConstraintGenerator};
33
use jingle::modeling::{ModeledBlock, ModelingContext, State};
4-
use jingle::sleigh::{IndirectVarNode, RegisterManager, SpaceManager, VarNode};
4+
use jingle::sleigh::{RegisterManager, SpaceManager, VarNode};
55
use jingle::varnode::{ResolvedIndirectVarNode, ResolvedVarnode};
66
use jingle::JingleContext;
77
use serde::{Deserialize, Serialize};
88
use std::collections::HashMap;
9+
use std::ops::Add;
910
use std::sync::Arc;
1011
use tracing::{event, Level};
1112
use z3::ast::{Ast, Bool, BV};
@@ -159,25 +160,28 @@ pub fn gen_register_pointer_constraint<'ctx>(
159160
{
160161
move |jingle, state, _addr| {
161162
let m = m.clone();
162-
let val = value
163-
.as_bytes()
164-
.iter()
165-
.map(|b| BV::from_u64(jingle.z3, *b as u64, 8))
166-
.reduce(|a, b| a.concat(&b))
167-
.unwrap();
163+
let mut bools = vec![];
164+
let pointer = state.read_varnode(&vn)?;
165+
for (i,byte) in value.as_bytes().iter().enumerate() {
166+
let expected = BV::from_u64(jingle.z3, *byte as u64, 8);
167+
let char_ptr = ResolvedVarnode::Indirect(ResolvedIndirectVarNode{
168+
// dumb but whatever
169+
pointer_location: vn.clone(),
170+
pointer: pointer.clone().add(i as u64),
171+
access_size_bytes: 1,
172+
pointer_space_idx: state.get_code_space_idx()
173+
});
174+
let actual = state.read_resolved(&char_ptr)?;
175+
bools.push(actual._eq(&expected))
176+
}
168177
let pointer = state.read_varnode(&vn)?;
169-
let data = state.read_varnode_indirect(&IndirectVarNode {
170-
pointer_space_index: state.get_code_space_idx(),
171-
access_size_bytes: value.len(),
172-
pointer_location: vn.clone(),
173-
})?;
174178
let resolved = ResolvedVarnode::Indirect(ResolvedIndirectVarNode {
175179
pointer_location: vn.clone(),
176180
pointer_space_idx: state.get_code_space_idx(),
177181
access_size_bytes: value.len(),
178182
pointer,
179183
});
180-
let mut constraint = data._eq(&val);
184+
let mut constraint = Bool::and(jingle.z3, &bools);
181185
if let Some(c) = m.and_then(|m| m.read) {
182186
let callback = gen_pointer_range_state_invariant(c);
183187
let cc = callback(jingle, &resolved, state)?;

src/config/specification.rs

Lines changed: 9 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -14,10 +14,11 @@ use crate::config::sleigh::SleighConfig;
1414
pub struct SpecificationConfig {
1515
pub path: String,
1616
pub max_instructions: usize,
17+
pub base_address: Option<u64>,
1718
}
1819

1920
impl SpecificationConfig {
20-
pub fn load_sleigh<'a>(
21+
fn load_sleigh<'a>(
2122
&self,
2223
sleigh_config: &'a SleighConfig,
2324
) -> Result<LoadedSleighContext<'a>, CrackersConfigError> {
@@ -36,9 +37,14 @@ impl SpecificationConfig {
3637
let _section = gimli_file
3738
.section_by_name(".text")
3839
.ok_or(SpecMissingTextSection)?;
39-
let sleigh = self.load_sleigh(sleigh_config)?;
40+
let mut sleigh = self.load_sleigh(sleigh_config)?;
41+
let mut addr = sym.address();
42+
if let Some(o) = self.base_address {
43+
sleigh.set_base_address(o);
44+
addr = addr.wrapping_add(o);
45+
}
4046
let instrs: Vec<Instruction> = sleigh
41-
.read_until_branch(sym.address(), self.max_instructions)
47+
.read_until_branch(addr, self.max_instructions)
4248
.collect();
4349
Ok(instrs)
4450
}

0 commit comments

Comments
 (0)