Skip to content

Commit f3ce412

Browse files
committed
Merge branch 'main' into aardfront3
2 parents 7507b74 + c92839d commit f3ce412

File tree

14 files changed

+231
-143
lines changed

14 files changed

+231
-143
lines changed

book/src/SUMMARY.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -16,9 +16,9 @@
1616
- [Operations](./operations.md)
1717
- [Simple example](./simpleexample.md)
1818
- [Custom statements and custom operations](./custom.md)
19-
- [Version 1](./custom1.md)
20-
- [Version 1, written again](./custom1b.md)
19+
- [Defining custom predicates](./custompred.md)
2120
- [Custom statement example](./customexample.md)
21+
- [How to hash a custom statement](./customhash.md)
2222
- [POD types](./podtypes.md)
2323
- [SignedPOD](./signedpod.md)
2424
- [MainPOD](./mainpod.md)

book/src/backendtypes.md

Lines changed: 40 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1 +1,41 @@
11
# Backend types
2+
3+
On the backend, there is only a single type: `Value`.
4+
5+
A `Value` is simply a tuple of field elements. With the plonky2 backend, a `Value` is a tuple of 4 field elements. In general, the backend will expose a constant `VALUE_SIZE`, and a `Value` will be a tuple of `VALUE_SIZE` field elements.
6+
7+
## Integers and booleans
8+
9+
The backend encoding stores integers in such a way that arithmetic operations (addition, multiplication, comparison) are inexpensive to verify in-circuit.
10+
11+
In the case of the Plonky2 backend, an integer $x$ is decomposed as
12+
$$x = x_0 + x_1 \cdot 2^{32}$$
13+
with $0 \leq x_0, x_1 < 2^{32}$ and represented as
14+
$$\texttt{map}\ \iota\ [x_0, x_1, 0, 0],$$
15+
where $\iota:\mathbb{N}\cup\{0\}\rightarrow\texttt{GoldilocksField}$ is the canonical projection.
16+
17+
On the backend, a boolean is stored as an integer, either 0 or 1; so logical operations on booleans are also inexpensive.
18+
19+
## Strings
20+
21+
The backend encoding stores strings as hashes, using a hash function that might not be zk-friendly. For this reason, string operations (substrings, accessing individual characters) are hard to verify in-circuit. The POD2 system does not provide methods for manipulating strings.
22+
23+
In other words: As POD2 sees it, two strings are either equal or not equal. There are no other relationships between strings.
24+
25+
In the case of the Plonky2 backend, a string is converted to a sequence of bytes with the byte `0x01` appended as padding, then the bytes are split into 7-byte chunks starting from the left, these chunks then being interpreted as integers in little-endian form, each of which is naturally an element of `GoldilocksField`, whence the resulting sequence may be hashed via the Poseidon hash function. Symbolically, given a string $s$, its hash is defined by
26+
27+
$$\texttt{poseidon}(\texttt{map}\ (\iota\circ\jmath_\texttt{le-bytes->int})\ \texttt{chunks}_7(\jmath_\texttt{string->bytes}(s)\ \texttt{++}\ [\texttt{0x01}])),$$
28+
29+
where `poseidon` is the Poseidon instance used by Plonky2, $\iota$ is as above, $\texttt{chunks}_{n}:[\texttt{u8}]\rightarrow [[\texttt{u8}]]$ is defined such that[^aux]
30+
31+
$$\texttt{chunks}_n(v) = \textup{if}\ v = [\ ]\ \textup{then}\ [\ ]\ \textup{else}\ [\texttt{take}_n v]\ \texttt{++}\ \texttt{chunks}_n(\texttt{drop}_n v),$$
32+
33+
the mapping $\jmath_\texttt{le-bytes->int}: [u8] \rightarrow{N}\cup\{0\}$ is given by
34+
35+
$$[b_0,\dots,b_{N-1}]\mapsto \sum_{i=0}^{N-1} b_i \cdot 2^{8i},$$
36+
37+
and $\jmath_\texttt{string->bytes}$ is the canonical mapping of a string to its UTF-8 representation.
38+
39+
## Compound types
40+
41+
The three front-end compound types (`Dictionary`, `Array`, `Set`) are all represented as Merkle roots on the backend. The details of the representation are explained on a separate [Merkle tree](./merkletree.md) page.

book/src/custom.md

Lines changed: 26 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,9 +10,33 @@ At the backend level, every definition of a predicate is either a conjunction or
1010

1111
On the backend, custom predicates are defined in _groups_. A group can contain one or more custom predicates and their associated sub-predicates. Recursive definition is only possible within a group: the definition of a predicate in a group can only depend on previously existing predicates, itself, and other predicates in the same group.
1212

13-
## Custom predicates and their IDs
13+
## Arguments of custom predicates
14+
15+
The definition of a custom predicate might also be called an _operation_ or _deduction rule_. It includes two (or, potentially, say, five) statement templates as conditions. The arguments to the statement templates are decomposed as (origin, key) pairs: if statements are allowed to have arity at most 4, then the statement templates in a deduction rule will have at most 8 arguments (4 origins and 4 keys). The same holds for the output statement.
16+
17+
Each argument (origin or key) to an statement template is either a wildcard or a literal. In the backend, the wildcard arguments will be identified as *1, *2, *3, ....
18+
19+
## Examples
20+
21+
See [examples](./customexample.md)
22+
23+
## Hashing and predicate IDs
24+
25+
Each custom predicate is defined as part of a _group_ of predicates. The definitions of all statements in the group are laid out consecutively (see [examples](./customexample.md)) and hashed. For more details, see the pages on [hashing custom statements](./customhash.md) and [custom predicates](./custompred.md).
26+
27+
## How to prove an application of an operation
28+
29+
The POD proof format is inspired by "two-column proofs" (for an example, see [Wikipedia](https://en.wikipedia.org/wiki/Mathematical_proof)). A POD contains a "tabular proof", in which each row includes a "statement" and an "operation". The "operation" is the "reason" that justifies the statement: it is everything the circuit needs as a witness to verify the statement.
30+
31+
For a custom statement, the "reason" includes the following witnesses and verifications:
32+
- the definition of the statement, serialized (see [examples](./customexample.md))
33+
- if the statement is part of a group, the definition of the full group, serialized
34+
- verify that the hash of the definition is the statement ID
35+
- the definition will have some number of "wildcards" (*1, *2, ...) as arguments to statement templates; a value for each wildcard must be provided as a witness (each will be either an origin ID or key)
36+
- the circuit must substitute the claimed values for the wildcards, and the resulting statements (true statements with origins and keys) will appear as witnesses
37+
- the circuit must verify that all the input statement templates (with origins and keys) appear in the previous statements (in higher rows of the table)
38+
- the circuit also substitutes the claimed values for the wildcards in the output statement, and verifies that it matches the claimed output statement
1439

15-
A custom predicate, like a built-in predicate, is identified by a _name_ on the front end and an _identifier_ on the back end. The identifier is a cryptographic hash of the definition of the group.
1640

1741

1842
[^inherit]: What to call this? One POD "inherits" from another?

book/src/custom1b.md

Lines changed: 0 additions & 36 deletions
This file was deleted.

book/src/customhash.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
# How to hash a custom predicate
2+
3+
Every predicate, native or custom, is identified on the backend by a predicate ID.
4+
5+
The native predicates are numbered with small integers, sequentially. The ID of a custom predicate is a hash of its definition; this guarantees that two different predicates cannot have the same ID (aside from the miniscule probability of a hash collision).
6+
7+
This document explains in some detail how the definition of a custom predicate is serialized and hashed.
8+
9+
Custom predicates are defined in _groups_ (also known as _batches_); see an [example](./customexample.md). The definition of a custom predicate in a group involves other predicates, which may include:
10+
- native predicates
11+
- previously-defined custom predicates
12+
- other predicates in the same group.
13+
14+
Predicate hashing is recursive: in order to hash a group of custom predicates, we need to know IDs for all the previously-defined custom predicates it depends on.
15+
16+
The definition of the whole group of custom predicates is serialized (as explained below), and that serialization is hashed (using a zk-friendly hash -- in the case of the plonky2 backend, Poseidon) to give a _group ID_. Each predicate in the group is then referenced by
17+
```
18+
predicate_ID = (group_ID, idx)
19+
```
20+
(here `idx` is simply the index of the predicate in the group).
Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# Custom operations (or: how to define a custom predicate): VERSION 1
1+
# Defining custom predicates
22

33
(Note: At the moment, we consider a "custom operation" to be exactly the same thing as the "definition of a custom predicate.")
44

book/src/front_and_back.md

Lines changed: 15 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,17 @@
11
# Frontend and backend
22

3-
The frontend is what we want the user to see.
4-
The backend is what we want the circuit to see.
3+
The POD2 system consists of a frontend and a backend, connected by a middleware. This page outlines some design principles for deciding which components go where.
4+
5+
```
6+
user -- frontend -- middleware -- backend -- ZK circuit
7+
```
8+
9+
The frontend is what we want the user to see; the backend is what we want the circuit to see.
10+
11+
## Circuit and proving system
12+
13+
The first implementation of POD2 uses Plonky2 as its proving system. In principle, a future implementation could use some other proving system. The frontend and middleware should not be aware of what proving system is in use: anything specific to the proving system belongs to the backend.
14+
15+
## User-facing types versus in-circuit types
16+
17+
The frontend type system exposes human-readable types to POD developers: strings, ints, bools, and so forth. On the backend, all types are build out of field elements. The middleware should handle the conversion.

book/src/values.md

Lines changed: 8 additions & 21 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,9 @@
11
# POD value types
2-
From the frontend perspective, POD values may be one of the following[^type] types: two atomic types
2+
From the frontend perspective, POD values may be one of the following[^type] types: four atomic types
33
- `Integer`
4+
- `Bool`
45
- `String`
6+
- `Raw`
57

68
and three compound types
79
- `Dictionary`
@@ -18,31 +20,16 @@ $$\texttt{HashOut<GoldilocksField>}\simeq\texttt{[GoldilocksField; 4]}.$$
1820
## `Integer`
1921
In the frontend, this type is none other than `u64`[^i64]. In the backend, it will be appropriately embedded into the codomain of the canonical hash function.
2022

21-
In the case of the Plonky2 backend, this is done by decomposing such an integer $x$ as
22-
$$x = x_0 + x_1 \cdot 2^{32}$$
23-
with $0 \leq x_0, x_1 < 2^{32}$ and representing it as
24-
$$\texttt{map}\ \iota\ [x_0, x_1, 0, 0],$$
25-
where $\iota:\mathbb{N}\cup\{0\}\rightarrow\texttt{GoldilocksField}$ is the canonical projection.
26-
23+
## `Bool`
24+
In the frontend, this is a simple bool. In the backend, it will have the same encoding as an `Integer` `0` (for `false`) or `1` (for `true`).
2725

2826
## `String`
2927
In the frontend, this type corresponds to the usual `String`. In the backend, the string will be mapped to a sequence of field elements and hashed with the hash function employed there, thus being represented by its hash.
3028

31-
In the case of the Plonky2 backend, the string is converted to a sequence of bytes with the byte `0x01` appended as padding, then the bytes are split into 7-byte chunks starting from the left, these chunks then being interpreted as integers in little-endian form, each of which is naturally an element of `GoldilocksField`, whence the resulting sequence may be hashed via the Poseidon hash function. Symbolically, given a string $s$, its hash is defined by
32-
33-
$$\texttt{poseidon}(\texttt{map}\ (\iota\circ\jmath_\texttt{le-bytes->int})\ \texttt{chunks}_7(\jmath_\texttt{string->bytes}(s)\ \texttt{++}\ [\texttt{0x01}])),$$
34-
35-
where `poseidon` is the Poseidon instance used by Plonky2, $\iota$ is as above, $\texttt{chunks}_{n}:[\texttt{u8}]\rightarrow [[\texttt{u8}]]$ is defined such that[^aux]
36-
37-
$$\texttt{chunks}_n(v) = \textup{if}\ v = [\ ]\ \textup{then}\ [\ ]\ \textup{else}\ [\texttt{take}_n v]\ \texttt{++}\ \texttt{chunks}_n(\texttt{drop}_n v),$$
38-
39-
the mapping $\jmath_\texttt{le-bytes->int}: [u8] \rightarrow{N}\cup\{0\}$ is given by
40-
41-
$$[b_0,\dots,b_{N-1}]\mapsto \sum_{i=0}^{N-1} b_i \cdot 2^{8i},$$
42-
43-
and $\jmath_\texttt{string->bytes}$ is the canonical mapping of a string to its UTF-8 representation.
44-
29+
## `Raw`
30+
"Raw" is short for "raw value". A `Raw` exposes a backend value on the frontend.
4531

32+
With the plonky2 backend, a `Raw` is a tuple of 4 elements of the Goldilocks field.
4633

4734
## Dictionary, array, set
4835

src/backends/plonky2/mock_main/mod.rs

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -435,7 +435,7 @@ impl Pod for MockMainPod {
435435
self.operations[i]
436436
.deref(&self.statements[..input_statement_offset + i])
437437
.unwrap()
438-
.check(&s.clone().try_into().unwrap())
438+
.check(&self.params, &s.clone().try_into().unwrap())
439439
})
440440
.collect::<Result<Vec<_>>>()
441441
.unwrap();

src/frontend/custom.rs

Lines changed: 23 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,9 @@
11
#![allow(unused)]
2+
use anyhow::Result;
23
use std::sync::Arc;
34

45
use crate::middleware::{
5-
hash_str, CustomPredicate, CustomPredicateBatch, Hash, HashOrWildcard, NativePredicate,
6+
hash_str, CustomPredicate, CustomPredicateBatch, Hash, HashOrWildcard, NativePredicate, Params,
67
Predicate, StatementTmpl, StatementTmplArg, ToFields, Value, F,
78
};
89

@@ -96,31 +97,34 @@ impl CustomPredicateBatchBuilder {
9697

9798
fn predicate_and(
9899
&mut self,
100+
params: &Params,
99101
args: &[&str],
100102
priv_args: &[&str],
101103
sts: &[StatementTmplBuilder],
102-
) -> Predicate {
103-
self.predicate(true, args, priv_args, sts)
104+
) -> Result<Predicate> {
105+
self.predicate(params, true, args, priv_args, sts)
104106
}
105107

106108
fn predicate_or(
107109
&mut self,
110+
params: &Params,
108111
args: &[&str],
109112
priv_args: &[&str],
110113
sts: &[StatementTmplBuilder],
111-
) -> Predicate {
112-
self.predicate(false, args, priv_args, sts)
114+
) -> Result<Predicate> {
115+
self.predicate(params, false, args, priv_args, sts)
113116
}
114117

115118
/// creates the custom predicate from the given input, adds it to the
116119
/// self.predicates, and returns the index of the created predicate
117120
fn predicate(
118121
&mut self,
122+
params: &Params,
119123
conjunction: bool,
120124
args: &[&str],
121125
priv_args: &[&str],
122126
sts: &[StatementTmplBuilder],
123-
) -> Predicate {
127+
) -> Result<Predicate> {
124128
let statements = sts
125129
.iter()
126130
.map(|sb| {
@@ -138,13 +142,9 @@ impl CustomPredicateBatchBuilder {
138142
StatementTmpl(sb.predicate.clone(), args)
139143
})
140144
.collect();
141-
let custom_predicate = CustomPredicate {
142-
conjunction,
143-
statements,
144-
args_len: args.len(),
145-
};
145+
let custom_predicate = CustomPredicate::new(params, conjunction, statements, args.len())?;
146146
self.predicates.push(custom_predicate);
147-
Predicate::BatchSelf(self.predicates.len() - 1)
147+
Ok(Predicate::BatchSelf(self.predicates.len() - 1))
148148
}
149149

150150
fn finish(self) -> Arc<CustomPredicateBatch> {
@@ -174,7 +174,7 @@ mod tests {
174174
use crate::middleware::{CustomPredicateRef, Params, PodType};
175175

176176
#[test]
177-
fn test_custom_pred() {
177+
fn test_custom_pred() -> Result<()> {
178178
use NativePredicate as NP;
179179
use StatementTmplBuilder as STB;
180180

@@ -183,6 +183,7 @@ mod tests {
183183

184184
let mut builder = CustomPredicateBatchBuilder::new("eth_friend".into());
185185
let _eth_friend = builder.predicate_and(
186+
&params,
186187
// arguments:
187188
&["src_ori", "src_key", "dst_ori", "dst_key"],
188189
// private arguments:
@@ -202,7 +203,7 @@ mod tests {
202203
.arg(("attestation_pod", literal("attestation")))
203204
.arg(("dst_ori", "dst_key")),
204205
],
205-
);
206+
)?;
206207

207208
println!("a.0. eth_friend = {}", builder.predicates.last().unwrap());
208209
let eth_friend = builder.finish();
@@ -216,6 +217,7 @@ mod tests {
216217
// >
217218
let mut builder = CustomPredicateBatchBuilder::new("eth_dos_distance_base".into());
218219
let eth_dos_distance_base = builder.predicate_and(
220+
&params,
219221
&[
220222
// arguments:
221223
"src_ori",
@@ -236,7 +238,7 @@ mod tests {
236238
.arg(("distance_ori", "distance_key"))
237239
.arg(0),
238240
],
239-
);
241+
)?;
240242
println!(
241243
"b.0. eth_dos_distance_base = {}",
242244
builder.predicates.last().unwrap()
@@ -246,6 +248,7 @@ mod tests {
246248

247249
// next chunk builds:
248250
let eth_dos_distance_ind = builder.predicate_and(
251+
&params,
249252
&[
250253
// arguments:
251254
"src_ori",
@@ -281,14 +284,15 @@ mod tests {
281284
.arg(("intermed_ori", "intermed_key"))
282285
.arg(("dst_ori", "dst_key")),
283286
],
284-
);
287+
)?;
285288

286289
println!(
287290
"b.1. eth_dos_distance_ind = {}",
288291
builder.predicates.last().unwrap()
289292
);
290293

291294
let _eth_dos_distance = builder.predicate_or(
295+
&params,
292296
&[
293297
"src_ori",
294298
"src_key",
@@ -308,7 +312,7 @@ mod tests {
308312
.arg(("dst_ori", "dst_key"))
309313
.arg(("distance_ori", "distance_key")),
310314
],
311-
);
315+
)?;
312316

313317
println!(
314318
"b.2. eth_dos_distance = {}",
@@ -318,5 +322,7 @@ mod tests {
318322
let eth_dos_batch_b = builder.finish();
319323
let fields = eth_dos_batch_b.to_fields(&params);
320324
println!("Batch b, serialized: {:?}", fields);
325+
326+
Ok(())
321327
}
322328
}

0 commit comments

Comments
 (0)